Central limit theorem | Inferential statistics | Probability and Statistics | Khan Academy - YouTube

Channel: Khan Academy

[0]
In this video, I want to talk about what is easily
[3]
one of the most fundamental and profound concepts in statistics
[6]
and maybe in all of mathematics.
[8]
And that's the central limit theorem.
[16]
And what it tells us is we can start off
[18]
with any distribution that has a well-defined mean and
[21]
variance-- and if it has a well-defined variance,
[23]
it has a well-defined standard deviation.
[25]
And it could be a continuous distribution or a discrete one.
[27]
I'll draw a discrete one, just because it's easier
[29]
to imagine, at least for the purposes of this video.
[33]
So let's say I have a discrete probability distribution
[36]
function.
[37]
And I want to be very careful not
[38]
to make it look anything close to a normal distribution.
[41]
Because I want to show you the power of the central limit
[43]
theorem.
[44]
So let's say I have a distribution.
[45]
Let's say it could take on values 1 through 6.
[47]
1, 2, 3, 4, 5, 6.
[50]
It's some kind of crazy dice.
[52]
It's very likely to get a 1.
[54]
Let's say it's impossible-- well,
[55]
let me make that a straight line.
[56]
You have a very high likelihood of getting a 1.
[58]
Let's say it's impossible to get a 2.
[60]
Let's say it's an OK likelihood of getting a 3 or a 4.
[63]
Let's say it's impossible to get a 5.
[64]
And let's say it's very likely to get a 6 like that.
[67]
So that's my probability distribution function.
[70]
If I were to draw a mean-- this is symmetric,
[72]
so maybe the mean would be something like that.
[74]
The mean would be halfway.
[76]
So that would be my mean right there.
[77]
The standard deviation maybe would
[79]
look-- it would be that far and that
[80]
far above and below the mean.
[82]
But that's my discrete probability distribution
[85]
function.
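The crazy distribution described above can be sketched in a few lines of Python. The exact probabilities here are an assumption -- the video only specifies the shape (1 and 6 very likely, 2 and 5 impossible, 3 and 4 somewhere in between) -- but any symmetric choice like this one gives a mean of 3.5:

```python
import numpy as np

# A "crazy dice" distribution like the one in the video: very likely
# to roll a 1 or a 6, impossible to roll a 2 or a 5, and a moderate
# chance of a 3 or a 4. These probabilities are assumed -- the video
# only sketches the shape.
values = np.array([1, 2, 3, 4, 5, 6])
probs = np.array([0.35, 0.0, 0.15, 0.15, 0.0, 0.35])

mean = np.sum(values * probs)               # population mean (3.5 by symmetry)
var = np.sum(probs * (values - mean) ** 2)  # population variance
std = np.sqrt(var)                          # population standard deviation

print(mean, std)
```

Because the assumed probabilities are symmetric around 3.5, the mean lands exactly halfway between 1 and 6, just as the video describes.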
[86]
Now what I'm going to do here, instead of just taking
[88]
samples of this random variable that's
[90]
described by this probability distribution function,
[93]
I'm going to take samples of it.
[95]
But I'm going to average the samples
[97]
and then look at those samples and see
[99]
the frequency of the averages that I get.
[101]
And when I say average, I mean the mean.
[104]
Let me define something.
[105]
Let's say my sample size-- and I could put any number here.
[108]
But let's say first off we try a sample size of n is equal to 4.
[117]
And what that means is I'm going to take four samples from this.
[120]
So let's say the first time I take four samples--
[123]
so my sample size is four-- let's say I get a 1.
[125]
Let's say I get another 1.
[127]
And let's say I get a 3.
[129]
And I get a 6.
[130]
So that right there is my first sample of sample size 4.
[134]
I know the terminology can get confusing.
[136]
Because this is the sample that's made up of four samples.
[139]
But then when we talk about the sample mean and the sampling
[143]
distribution of the sample mean, which we're
[145]
going to talk more and more about over the next few videos,
[147]
normally the sample refers to the set of samples
[152]
from your distribution.
[153]
And the sample size tells you how many you actually
[155]
took from your distribution.
[157]
But the terminology can be very confusing,
[159]
because you could easily view one of these as a sample.
[162]
But we're taking four samples from here.
[164]
We have a sample size of four.
[165]
And what I'm going to do is I'm going to average them.
[168]
So let's say the mean-- I want to be very careful when
[170]
I say average.
[171]
The mean of this first sample of size 4 is what?
[175]
1 plus 1 is 2.
[176]
2 plus 3 is 5.
[178]
5 plus 6 is 11.
[179]
11 divided by 4 is 2.75.
[186]
That is my first sample mean for my first sample of size 4.
[191]
Let me do another one.
[192]
My second sample of size 4, let's say that I get a 3, a 4.
[199]
Let's say I get another 3.
[200]
And let's say I get a 1.
[201]
I just didn't happen to get a 6 that time.
[203]
And notice I can't get a 2 or a 5.
[205]
It's impossible for this distribution.
[207]
The chance of getting a 2 or 5 is 0.
[208]
So I can't have any 2s or 5s over here.
[211]
So for the second sample of sample size 4,
[217]
my second sample mean is going to be 3 plus 4 is 7.
[222]
7 plus 3 is 10 plus 1 is 11.
[226]
11 divided by 4, once again, is 2.75.
[229]
Let me do one more, because I really
[231]
want to make it clear what we're doing here.
[233]
So I do one more.
[233]
Actually, we're going to do a gazillion more.
[235]
But let me just do one more in detail.
[237]
So let's say my third sample of sample size 4--
[241]
so I'm going to literally take 4 samples.
[243]
So my sample is made up of 4 samples
[245]
from this original crazy distribution.
[248]
Let's say I get a 1, a 1, and a 6 and a 6.
[253]
And so my third sample mean is going to be 1 plus 1 is 2.
[258]
2 plus 6 is 8.
[260]
8 plus 6 is 14.
[261]
14 divided by 4 is 3 and 1/2.
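The arithmetic for the three worked samples above can be checked directly:

```python
# The three samples of size 4 worked through in the video,
# and their sample means.
samples = [
    [1, 1, 3, 6],  # first sample:  11 / 4 = 2.75
    [3, 4, 3, 1],  # second sample: 11 / 4 = 2.75
    [1, 1, 6, 6],  # third sample:  14 / 4 = 3.5
]

sample_means = [sum(s) / len(s) for s in samples]
print(sample_means)  # [2.75, 2.75, 3.5]
```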
[269]
And as I find each of these sample
[272]
means-- so for each of my samples of sample size 4,
[275]
I figure out a mean.
[276]
And as I do each of them, I'm going
[278]
to plot it on a frequency distribution.
[280]
And this is all going to amaze you in a few seconds.
[284]
So I plot this all on a frequency distribution.
[286]
So I say, OK, on my first sample,
[289]
my first sample mean was 2.75.
[292]
So I'm plotting the actual frequency of the sample
[294]
means I get for each sample.
[295]
So 2.75, I got it one time.
[298]
So I'll put a little plot there.
[299]
So that's from that one right there.
[302]
And the next time, I also got a 2.75.
[304]
That's a 2.75 there.
[306]
So I got it twice.
[308]
So I'll plot the frequency right there.
[310]
Then I got a 3 and 1/2.
[311]
So all the possible values, I could have a 3,
[313]
I could have a 3.25, I could have a 3 and 1/2.
[316]
So then I have the 3 and 1/2, so I'll plot it right there.
[319]
And what I'm going to do is I'm going
[320]
to keep taking these samples.
[322]
Maybe I'll take 10,000 of them.
[325]
So I'm going to keep taking these samples.
[327]
So I go all the way to S 10,000.
[329]
I just do a bunch of these.
[331]
And what it's going to look like over time is each of these--
[333]
I'm going to make it a dot, because I'm
[335]
going to have to zoom out.
[337]
So if I look at it like this, over time-- it still
[341]
has all the values that it might be able to take on,
[343]
2.75 might be here.
[345]
So this first dot is going to be-- this one
[348]
right here is going to be right there.
[350]
And that second one is going to be right there.
[352]
Then that one at 3.5 is going to look right there.
[356]
But I'm going to do it 10,000 times.
[357]
Because I'm going to have 10,000 dots.
[359]
And let's say as I do it, I'm going to just keep plotting them.
[361]
I'm just going to keep plotting the frequencies.
[364]
I'm just going to keep plotting them
[365]
over and over and over again.
[368]
And what you're going to see is, as I take
[369]
many, many samples of size 4, I'm
[372]
going to have something that's going
[374]
to start kind of approximating a normal distribution.
[378]
So each of these dots represents an instance of a sample mean.
[382]
So as I keep adding on this column right here,
[384]
that means I kept getting the sample mean 2.75.
[387]
So over time,
[388]
I'm going to have something that's
[390]
starting to approximate a normal distribution.
[392]
And that is a neat thing about the central limit theorem.
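The whole 10,000-sample experiment can be simulated. Again, the exact probabilities for the crazy dice are my assumption; the point is that even with this decidedly non-normal starting distribution, the frequency plot of the sample means piles up into a bell shape:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed probabilities for the "crazy dice" -- the video only
# sketches the shape (1 and 6 very likely, 2 and 5 impossible).
values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]

n = 4            # sample size
trials = 10_000  # number of samples, i.e. one dot per sample mean

# Draw 10,000 samples of size 4 and take the mean of each one.
draws = rng.choice(values, size=(trials, n), p=probs)
sample_means = draws.mean(axis=1)

# Tally how often each possible sample mean occurs -- this is the
# frequency plot the video builds dot by dot.
means, counts = np.unique(sample_means, return_counts=True)
for m, c in zip(means, counts):
    print(f"{m:5.2f}: {'*' * (c // 100)}")
```

The printed histogram bulges in the middle near 3.5 and tapers off toward 1 and 6, even though the underlying distribution is bimodal.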
[399]
So in orange, that's the case for n is equal to 4.
[402]
This was a sample size of 4.
[405]
Now, if I did the same thing with a sample size of maybe
[408]
20-- so in this case, instead of just taking 4 samples
[411]
from my original crazy distribution, every sample
[415]
I take 20 instances of my random variable,
[418]
and I average those 20.
[420]
And then I plot the sample mean on here.
[422]
So in that case, I'm going to have
[424]
a distribution that looks like this.
[426]
And we'll discuss this in more videos.
[428]
But it turns out if I were to plot 10,000 of the sample
[432]
means here, I'm going to have something
[434]
that, two things-- it's going to even more closely approximate
[437]
a normal distribution.
[438]
And we're going to see in future videos,
[440]
it's actually going to have a smaller-- well,
[442]
let me be clear.
[443]
It's going to have the same mean.
[445]
So that's the mean.
[446]
This is going to have the same mean.
[448]
So it's going to have a smaller standard deviation.
[451]
Well, I should plot these from the bottom
[453]
because you kind of stack it.
[454]
Once you get one, then another instance and another instance.
[457]
But this is going to more and more approach
[458]
a normal distribution.
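The two claims here -- same mean, smaller standard deviation -- can be checked by simulating sample means for n = 4 and n = 20 side by side (again using assumed probabilities for the crazy dice). The spread of the sample mean shrinks like sigma divided by the square root of n:

```python
import numpy as np

rng = np.random.default_rng(1)

values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]  # assumed shape

def simulate_means(n, trials=10_000):
    """Means of `trials` samples, each of size n, from the crazy dice."""
    draws = rng.choice(values, size=(trials, n), p=probs)
    return draws.mean(axis=1)

means_4 = simulate_means(4)
means_20 = simulate_means(20)

# Same center, tighter spread: the standard deviation of the sample
# mean shrinks like sigma / sqrt(n), so the n=20 spread should be
# roughly sqrt(20/4), about 2.2 times, smaller than the n=4 spread.
print(means_4.mean(), means_20.mean())
print(means_4.std(), means_20.std())
```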
[460]
So this is what's super cool about the central limit
[464]
theorem.
[465]
As your sample size becomes larger--
[473]
or you could even say as it approaches infinity.
[475]
But you really don't have to get that close
[476]
to infinity to really get close to a normal distribution.
[478]
Even if you have a sample size of 10 or 20,
[481]
you're already getting very close to a normal distribution,
[484]
in fact about as good an approximation
[486]
as we see in our everyday life.
[487]
But what's cool is we can start with some crazy distribution.
[491]
This has nothing to do with a normal distribution.
[495]
This was n equals 4, but if we have a sample size of n
[497]
equals 10 or n equals 100, and we
[499]
were to take 100 of these, instead of four here,
[502]
and average them and then plot that average,
[504]
the frequency of it, then we take 100 again, average them,
[507]
take the mean, plot that again, and if we
[508]
do that a bunch of times, in fact,
[510]
if we were to do that an infinite number of times,
[512]
we would find that, especially
[513]
if we had an infinite sample size,
[515]
we would find a perfect normal distribution.
[517]
That's the crazy thing.
[519]
And it doesn't apply just to taking the sample mean.
[522]
Here we took the sample mean every time.
[524]
But you could have also taken the sample sum.
[526]
The central limit theorem would have still applied.
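The same experiment works with sums instead of means -- only the sketch below takes the sum of each sample (with the same assumed probabilities for the crazy dice), and the sums pile up into a bell shape centered at n times the population mean:

```python
import numpy as np

rng = np.random.default_rng(2)

values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]  # assumed shape

n = 20
draws = rng.choice(values, size=(10_000, n), p=probs)
sample_sums = draws.sum(axis=1)  # sample sum instead of sample mean

# The sums also approximate a normal distribution, centered at
# n * mu -- with these assumed probabilities, around 20 * 3.5 = 70.
print(sample_sums.mean())
```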
[528]
But that's what's so super useful about it.
[531]
Because in life, there's all sorts of processes out there,
[534]
proteins bumping into each other, people doing
[537]
crazy things, humans interacting in weird ways.
[540]
And you don't know the probability distribution
[542]
functions for any of those things.
[544]
But what the central limit theorem
[545]
tells us is if we add a bunch of those actions
[548]
together, assuming that they all have the same distribution,
[551]
or if we were to take the mean of all of those actions
[553]
together, and if we were to plot the frequency of those means,
[556]
we do get a normal distribution.
[558]
And that's frankly why the normal distribution shows up
[562]
so much in statistics and why, frankly, it's
[565]
a very good approximation for the sum
[568]
or the means of a lot of processes.
[571]
Normal distribution.
[573]
What I'm going to show you in the next video is I'm actually
[576]
going to show you that this is a reality, that as you increase
[579]
your sample size, as you increase your n,
[581]
and as you take a lot of sample means,
[583]
you're going to have a frequency plot that looks very, very
[585]
close to a normal distribution.