The standard error, Clearly Explained!!! - YouTube

Channel: StatQuest with Josh Starmer

[0]

StatQuest StatQuest StatQuest

[7]

hello and welcome to StatQuest this

[10]

time we're gonna talk about standard

[12]

errors and we're also gonna have a

[14]

bootstrapping bonus we'll start by

[17]

talking about error bars which are very

[19]

closely related to standard errors for

[23]

example you might collect measurements

[25]

from three samples labeled A B and C and

[29]

plot them on a scatter plot just like we

[31]

see here

[33]

you could then calculate the means for

[36]

the three data sets and we illustrated

[38]

those here with three green horizontal

[41]

bars approximately halfway up the

[44]

clusters of data points after that we

[48]

could calculate the standard deviations

[50]

and add those to the graph and we've

[51]

shown those here with red error bars in

[55]

manuscripts and presentations people

[58]

often don't display the original data

[60]

but instead just show the mean and the

[62]

standard deviation in what's called a

[64]

dynamite plot because each column in the

[67]

plot looks like it's the igniter for a

[69]

stick of dynamite

[72]

there are three common types of error

[75]

bars the first type is standard

[77]

deviations which we just saw and I'm

[80]

sure you're all familiar with these tell

[82]

you how the data are distributed around

[84]

the mean big standard deviations tell

[87]

you that some of the data points were

[88]

pretty far from the mean in most cases

[91]

you want to use standard deviations in

[93]

your graphs since they tell us about your

[95]

data the data points that you collected

[98]

yourself the second type of error bar

[101]

comes from standard errors these tell

[104]

you how the mean is distributed not just

[107]

the data but the means which sounds

[109]

crazy but it'll become clear once I draw

[112]

some pictures the third common type of

[115]

error bar are confidence intervals and

[118]

these are related to standard errors

[120]

confidence intervals will be explained

[122]

more in a future StatQuest since this

[125]

StatQuest is all about standard errors

[128]

that's what we're going to talk about

[131]

let's start by considering a normal

[134]

distribution in this case we can imagine

[137]

that we weighed a lot of mice and

[138]

plotted the distribution of differences

[141]

from the mean the y-axis is the

[145]

proportion of the mice that we weighed

[146]

and the x-axis is the difference from the

[149]

mean most of the mice had weights close

[153]

to the average a few of the mice weighed

[156]

much less than the average mouse and a

[159]

few other mice weighed much more than

[161]

the average mouse usually you can't

[165]

afford to measure the weight of all the

[167]

mice so you just take a sample in this

[170]

example we'll just assume we took five

[172]

measurements from the population rather

[175]

than measuring all the mice since most

[178]

of the mice have weights close to the

[180]

average most of our samples are going to

[183]

be close to zero now just like we always

[187]

do we can calculate the mean and

[189]

standard deviation from our sample in

[191]

this case the mean of our sample is

[195]

minus point two and the standard

[197]

deviation is one point nine two three and

[199]

we can plot the mean and standard

[202]

deviation on our graph as the mean

[205]

plus or minus the standard deviation

[207]

around the mean and for all you

[210]

StatQuesters out there here's a rule of

[212]

thumb remember that one standard

[215]

deviation on each side of the mean is

[217]

supposed to cover about 68% of the data

[219]

two standard deviations on each side of

[222]

the mean is supposed to cover about 95

[224]

percent of the data this will come in

[226]

handy later
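
The rule of thumb above can be checked with a quick simulation. This is a minimal sketch, not part of the video: the simulated "weight differences", the population standard deviation of 2.0, and the helper name `coverage` are all assumptions made for illustration. It draws normally distributed values and counts how many fall within one and two standard deviations of the mean.

```python
import random
import statistics

def coverage(data, k):
    """Fraction of values that lie within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical differences from the mean weight, drawn from a normal distribution.
weights = [rng.gauss(0.0, 2.0) for _ in range(100_000)]

print(f"within 1 SD: {coverage(weights, 1):.3f}")  # roughly 0.68
print(f"within 2 SD: {coverage(weights, 2):.3f}")  # roughly 0.95
```

The rule only holds approximately, and only for data that are close to normally distributed.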

[228]

the mean is now a lighter color because

[231]

we're going to take additional samples

[233]

and overlay additional means and

[235]

standard deviations on this same graph

[237]

here we've taken another five

[240]

measurements and from those five

[242]

measurements we've calculated the mean

[244]

and the standard deviation and here

[247]

we've plotted that mean plus or minus

[249]

one standard deviation on each side and

[252]

now we take another five measurements

[254]

this is the first sample where one of the

[257]

measurements is relatively extreme

[259]

however that one measurement doesn't

[262]

sway the mean that far from zero that is

[267]

to say the means are relatively close to

[270]

each other compared to the raw data this

[273]

is because for a mean to be far from the

[275]

middle most if not all of the raw data

[277]

points would have to be in a single

[280]

cluster that is far away from the middle

[282]

for example the sample of purple points

[285]

all form a cluster that is far from the

[287]

middle this could happen but very rarely

[292]

what's much more likely is to have a

[295]

sample where most of the points are

[297]

close to zero and only one or two are

[300]

far away so far we've shown that you can

[304]

calculate the standard deviations for

[306]

each sample but now that we have three

[309]

means we can also calculate the standard

[312]

deviation of those means because one

[315]

standard deviation will cover 68% of the

[318]

values and two will cover 95% of the

[321]

values the standard deviation of the

[323]

means won't be as wide as the standard

[325]

deviations of the data here we've

[329]

plotted the mean of the means plus or

[331]

minus one standard deviation of the

[333]

means notice that this standard

[336]

deviation is much smaller than the

[338]

standard deviations we got from the

[340]

individual samples the standard

[344]

deviation of the mean is called the

[346]

standard error of the mean or more

[348]

simply the standard error the standard

[351]

error gives us a sense of how much

[353]

variation we can expect in our means if

[356]

we took a bunch of independent five

[357]

measurement samples so to review this is

[361]

how we calculate the standard

[362]

error of the mean first you take a

[366]

bunch of samples each with the same

[369]

number of measurements or n in this

[371]

case n equals five the second step is

[376]

to calculate the mean for each sample

[378]

here we calculated the mean and standard

[380]

deviation for each sample but for the

[383]

standard error all we need to do is

[385]

calculate the mean once we've calculated

[389]

the means for each sample we can

[391]

calculate the standard deviation of the

[393]

means in this case the standard error

[396]

equals zero point eight six here we

[401]

notice that the standard error is much

[403]

less than the standard deviations

[404]

because the means aren't as widely

[406]

dispersed as the raw data we've shown

[410]

how to calculate the standard error of

[412]

the mean but there are other standard

[414]

errors for example we can also take the

[418]

standard deviation of the standard

[420]

deviations this is called the standard

[422]

error of the standard deviations which I

[424]

guess is to avoid a tongue-twister it

[427]

tells us how the standard deviations of

[429]

multiple samples are dispersed you can

[432]

calculate the standard deviation of any

[434]

statistic for example the median the

[436]

mode percentiles or anything anything

[439]

that you can calculate for multiple

[441]

samples you just calculate the standard

[444]

deviation and then you have the standard

[446]

error of that so if we calculated many

[448]

medians we could calculate the standard

[451]

deviation of those medians and we have

[453]

the standard error of those medians to

[457]

summarize everything we've talked about

[459]

so far know that the standard error is

[462]

just the standard deviation of multiple

[465]

means taken from the same population so

[468]

if there's a population and we can take

[470]

a bunch of different samples from it all

[473]

we have to do to get the standard error

[474]

is to calculate the standard deviation

[476]

of the means of each sample well at this

[481]

point you might be wondering if we can

[482]

calculate standard errors without

[484]

spending a lot of time and money on

[486]

doing the same experiment a bunch of

[488]

times the good news is the answer is yes

[492]

in rare cases there's a formula you can

[495]

use to estimate

[496]

the standard error for the mean is one the

[499]

formula for that is very simple it's

[501]

just the standard deviation divided by

[503]

the square root of the sample size
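
Written out, the formula is SE = s / sqrt(n), where s is the sample standard deviation and n is the sample size. The sketch below compares that formula with the repeated-sampling approach described earlier (take many samples of n = 5, compute each mean, then take the standard deviation of the means). The normal population with standard deviation 2.0, the number of repeats, and the function names are assumptions chosen for illustration, not values from the video.

```python
import math
import random
import statistics

def sem_formula(sample):
    """Standard error of the mean estimated from a single sample: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def sem_by_repetition(draw_sample, repeats):
    """Standard deviation of the means of many independent samples."""
    means = [statistics.fmean(draw_sample()) for _ in range(repeats)]
    return statistics.stdev(means)

rng = random.Random(1)  # fixed seed so the run is repeatable
POP_SD = 2.0            # assumed population standard deviation
N = 5                   # measurements per sample, as in the example

def draw_sample():
    """Draw one hypothetical five-measurement sample from the population."""
    return [rng.gauss(0.0, POP_SD) for _ in range(N)]

empirical = sem_by_repetition(draw_sample, 20_000)
theoretical = POP_SD / math.sqrt(N)
one_sample = sem_formula(draw_sample())

print(f"SD of many sample means:    {empirical:.3f}")
print(f"population SD / sqrt(n):    {theoretical:.3f}")
print(f"s / sqrt(n) from one sample: {one_sample:.3f}")
```

The first two numbers should agree closely. The third is only an estimate from five measurements, so it can land noticeably above or below them.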

[504]

however there aren't many other cases

[508]

the good news again is that we can use

[511]

something called bootstrapping for

[513]

everything else

[514]

every time we don't have a simple

[516]

formula we can bootstrap it the nice

[518]

thing about bootstrapping is it's very

[520]

simple conceptually and it's easy to

[522]

make a computer do this work here's a

[526]

bootstrapping example just like before

[529]

we have an experiment where we took 5

[531]

measurements as an aside usually for

[534]

bootstrapping it's good to have 10 or

[537]

more measurements in a single experiment

[539]

now we bootstrap our data with the

[544]

following steps first we pick a random

[547]

measurement from the sample that we just

[549]

took this random measurement isn't a new

[552]

measurement that we haven't taken before

[554]

it's not a new number that we haven't

[556]

seen it's part of the sample that we

[558]

already have now we just write that

[561]

value down in this case it's one point

[565]

four three in step three we just go back

[569]

to step one and pick a new random

[571]

measurement and write that value down

[573]

and we do that five times our second

[578]

measurement is minus one point three

[580]

eight the third measurement is minus

[583]

three point one one our fourth

[587]

measurement is one point four three

[590]

we've already picked that measurement

[592]

before but that's okay when you're

[595]

bootstrapping you just pick five

[597]

measurements from your sample and it

[600]

doesn't matter if you've picked the same

[601]

one before our last measurement is minus

[606]

zero point one zero step four in

[611]

bootstrapping is to calculate the mean

[613]

median mode or whatever the statistic it

[616]

is we're interested in understanding the

[618]

standard error of and we calculate that

[620]

with our sample in this case we're

[623]

interested in the standard error of the

[625]

mean so all we do is calculate the mean

[628]

from our new bootstrap sample the fifth

[631]

step is to go all the way back to the

[633]

beginning step one and repeat that until

[636]

you have a lot of means or medians or

[639]

whatever you're interested in

[640]

calculating the standard error of the

[643]

sixth and final step in the

[645]

bootstrapping procedure is to simply

[647]

calculate the standard deviation of all

[650]

the means that we generated in steps one

[652]

through five that's all there is to it
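
The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the exact procedure used for the video's graph: the function name `bootstrap_standard_error`, the default of 10,000 resamples, and the fixed seed are hypothetical choices. The sample uses the four values read out in the walkthrough plus one made-up fifth value, since the full original sample is not listed.

```python
import random
import statistics

def bootstrap_standard_error(sample, statistic=statistics.fmean,
                             n_boot=10_000, seed=0):
    """Estimate the standard error of `statistic` by bootstrapping `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    boot_stats = []
    for _ in range(n_boot):
        # Steps 1-3: pick n measurements at random, with replacement,
        # so the same measurement can be picked more than once.
        resample = rng.choices(sample, k=n)
        # Step 4: calculate the statistic of interest on the bootstrap sample.
        boot_stats.append(statistic(resample))
        # Step 5: the loop repeats this many times.
    # Step 6: the standard deviation of the collected statistics
    # is the bootstrap estimate of the standard error.
    return statistics.stdev(boot_stats)

# 1.43, -1.38, -3.11 and -0.10 appear in the walkthrough; 0.55 is made up.
sample = [1.43, -1.38, -3.11, -0.10, 0.55]

print(f"SE of the mean:   {bootstrap_standard_error(sample):.3f}")
print(f"SE of the median: {bootstrap_standard_error(sample, statistics.median):.3f}")
```

Passing a different function as `statistic` is what makes this useful when no formula exists: the same loop works for the median, a percentile, or any other summary that can be computed from a sample.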

[655]

in this case we calculated the standard

[657]

error of the mean and we've plotted it

[660]

as a black line in the graph so if

[663]

there's no fancy formula to help us

[665]

calculate the standard error we can do

[667]

it ourselves from scratch we can just

[669]

use bootstrapping and get the job done

[671]

and that's it in this StatQuest we

[676]

learned that the standard error is a

[677]

measure of how we might expect the means

[680]

from many different samples to vary from

[683]

one sample to another we also learned

[686]

that if we don't have a fancy formula

[688]

for calculating the standard error we

[690]

can do it ourselves using bootstrapping

[693]

okay so tune in next time and we'll talk

[696]

about how to use bootstrapping to

[698]

calculate confidence intervals and

[700]

that's when things get really cool
