T-statistic confidence interval | Inferential statistics | Probability and Statistics | Khan Academy

Channel: Khan Academy

[0]
This is the same problem that we had in the last video.
[3]
But instead of trying to figure out whether the data
[5]
supplies sufficient evidence to conclude that the engines
[8]
meet the actual emissions requirement, and all of the
[10]
hypothesis testing, I thought I would also use the same data
[14]
that we had in the last video to actually come up with a 95%
[18]
confidence interval.
[19]
So you could ignore the question right here.
[21]
You can ignore all of this.
[23]
I'm just using that same data to come up with a 95%
[26]
confidence interval for the actual mean emission for this
[29]
new engine design.
[31]
So we want to find a 95% confidence interval.
[40]
And as you could imagine, because our sample has only 10
[43]
data points right here, we're going to want to use a
[46]
T-distribution.
[47]
And right down here I have a T-table.
[51]
And we want a 95% confidence interval.
[53]
So we want to think about the range of T-values that 95-- or
[59]
the range that 95% of T-values will fall under.
[63]
So let's think about this way.
[64]
So let me draw a
[65]
T-distribution right over here.
[71]
So a T-distribution looks very similar to a normal
[74]
distribution but it has fatter tails.
[78]
This end and this end will be fatter than in a normal
[80]
distribution.
[81]
And then we want to find an interval, so if this is a
[85]
normalized T-distribution the mean is going to be 0.
[89]
And we want to find an interval of T-values between some
[92]
negative value here and some positive value here that
[95]
contains 95% of the probability.
[101]
So this right here has to be 95%.
[105]
And to figure out what these critical T-values are at this
[108]
end and this end, we can just use a T-table.
[110]
And we're going to use the two-sided version of this
[113]
because we're symmetric around the center.
[116]
So you look at the two-sided, we want a 95% confidence
[119]
interval, so we're going to look right over here, 95%
[123]
confidence interval.
[124]
We have 10 data points, which means we have
[127]
9 degrees of freedom.
[129]
So 9 degrees of freedom for our 10 data points.
[132]
We just took 10 minus 1.
[133]
So if we look over here, so for a T-distribution with 9
[139]
degrees of freedom, you're going to have 95% of the
[142]
probability is going to be contained within a T-value
[146]
of-- so the T-value is going to be between negative, so
[149]
this value right here is 2.262, and this value right
[154]
here is negative 2.262.
[157]
That's what this right here tells us.
[159]
That if you contain all the values that are less than
[162]
2.262 away from the center of your T-distribution, you will
[168]
contain 95% of the probability.
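
The table lookup can be checked numerically. The following is a minimal sketch, not the method used in the video: it integrates the Student's t density with 9 degrees of freedom between -2.262 and 2.262 using Simpson's rule. The function names `t_pdf` and `central_probability` are illustrative choices, not from the source.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def central_probability(t_crit: float, df: int, steps: int = 2000) -> float:
    """Approximate P(-t_crit < T < t_crit) with composite Simpson's rule.

    `steps` must be even for Simpson's rule to apply.
    """
    h = 2 * t_crit / steps
    total = t_pdf(-t_crit, df) + t_pdf(t_crit, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd interior points get 4, even get 2
        total += weight * t_pdf(-t_crit + i * h, df)
    return total * h / 3

# Critical value read from the two-sided T-table for 9 degrees of freedom.
prob = central_probability(2.262, df=9)
print(round(prob, 3))
```

The result should come out very close to 0.95, which is what the two-sided 95% column of the table promises.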
[172]
So that is our T-distribution right there.
[174]
Let me make it very clear.
[175]
This is our T-distribution.
[180]
So if you randomly pick a T-value from this
[184]
T-distribution, it has a 95% chance of being within this
[190]
far from the mean.
[190]
Or maybe we should write this way.
[192]
If I pick a random T-value, if I take a random T-statistic--
[197]
let me write it this way-- there's a 95% chance that a
[203]
random T-statistic is going to be less than 2.262, and
[213]
greater than negative 2.262.
[220]
95% chance.
[223]
Now when we took this sample, we could also derive a random
[229]
T-statistic from this.
[230]
We have our sample mean and our sample standard deviation,
[234]
our sample mean here is
[237]
17.17-- figured that out in the last video, just add these
[241]
up, divide by 10-- and our sample standard
[243]
deviation here is 2.98.
[247]
So the T-statistic that we can derive from this information
[251]
right over here-- so let me write it over here-- the
[253]
T-statistic that we could derive from this, and you can
[255]
view this T-statistic as being a random sample from a
[258]
T-distribution.
[260]
A T-distribution with 9 degrees of freedom.
[263]
So the T-statistic that we could derive from that is
[266]
going to be our mean, 17.17 minus the true mean of our
[274]
population.
[275]
Or actually you would say the true mean of our sampling
[279]
distribution, which is also going to be the same as the
[281]
true mean of our population, because that's our population
[285]
mean over there, divided by s, which is 2.98 over the square
[292]
root of our sample size.
[294]
We've seen this multiple times.
[296]
This right here is the T-statistic.
[298]
So by taking this sample you can say that we've randomly
[301]
sampled a T-statistic from this 9 degree of freedom
[305]
T-distribution.
[306]
So there's a 95% chance that this thing right over here is
[311]
going to be between-- is going to be less than 2.262 and
[317]
greater than negative 2.262.
[320]
So the 95% probability still applies to this right here.
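
Written out, the statement so far looks like this. The symbols \bar{x}, s, n, and \mu are standard notation assumed here for the sample mean, sample standard deviation, sample size, and population mean; the numbers are the ones quoted above.

```latex
t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{17.17 - \mu}{2.98 / \sqrt{10}},
\qquad
P\left(-2.262 < \frac{17.17 - \mu}{2.98 / \sqrt{10}} < 2.262\right) = 0.95
```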
[330]
Now we just have to do some math, calculate these things.
[333]
So let me get my calculator out.
[339]
And so let me just calculate this
[341]
denominator right over here.
[342]
So we have 2.98 divided by the square root of 10.
[350]
So that's 0.9423.
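
The denominator is the standard error of the sample mean. A quick sketch of the calculator step, using only the summary numbers quoted in the video (the variable names are illustrative):

```python
import math

s = 2.98  # sample standard deviation from the video
n = 10    # number of data points

# Standard error of the mean: s divided by the square root of n.
standard_error = s / math.sqrt(n)
print(round(standard_error, 3))  # 0.942
```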
[355]
So what I'm going to do is I'm going to multiply both sides
[358]
of this equation by this expression right over here.
[361]
So if I do that-- so let me just do that right over-- so
[365]
if I multiply this entire-- this is really two equations
[368]
or two inequalities I should say.
[371]
That this quantity is greater than this quantity and that
[373]
this quantity's greater than that quantity.
[374]
But we can operate on all of them at the same time, this
[377]
entire inequality.
[378]
So what we want to do is multiply this entire
[380]
inequality by this value right over here.
[384]
And we just calculated it at that value-- let me write it
[387]
over here-- that 2.98-- I'll write it right over here--
[391]
2.98 over the square root of 10 is equal to 0.942.
[399]
So if I multiplied this entire inequality by 0.942 I get, on
[405]
this left-hand side over here I have negative 2.262 times
[413]
0.942-- and it's a positive number that we're multiplying
[418]
the whole inequality by, so the inequality signs are still
[420]
going to be in the same direction-- is less than--
[423]
we're multiplying this whole expression by the same
[427]
expression in the denominator so it'll cancel out.
[430]
So we're just going to be less than 17.17 minus our
[436]
population mean, which is going to be less than 2.262
[442]
times, once again, 0.942.
[447]
Let me scroll over to the right a little bit.
[450]
0.942.
[453]
Just to be clear, I'm just multiplying all three sides of
[457]
this inequality by this number right over here.
[460]
In the middle this cancels out.
[461]
So if I multiply-- I'll just write it over here-- 0.942,
[465]
0.942, 0.942.
[471]
This and this are the same number so that's why those
[473]
cancel out.
[474]
And now let's get the calculator to figure out what
[476]
these numbers are.
[477]
So if we have the 0.942 times 2.262.
[482]
So we're going to say times 2.262 is 2.13.
[490]
So this number right over here on the
[494]
right-hand side is 2.13.
[501]
This number on the left is just the negative of that.
[503]
So it's negative 2.13.
[507]
And then we still have our inequalities-- is going to be
[511]
less than 17.17 minus the mean, which is less than 2.13.
[519]
Now what I want to do is I actually want to
[520]
solve for this mean.
[522]
And I don't like that negative sign in the mean.
[524]
I'd rather have this swapped around.
[525]
I'd rather have the mean minus 17.17.
[528]
So what I'm going to do is multiply this entire
[530]
inequality by negative 1.
[532]
If you do that, if you multiply the entire thing
[535]
times negative 1, this quantity right here, this
[538]
negative 2.13 will become a positive 2.13.
[542]
But since we are multiplying an inequality by a negative
[545]
number you have to swap the inequality sign.
[548]
So this less than will become a greater than.
[550]
This negative mu will become a positive mu.
[554]
This positive 17.17 will become a negative 17.17.
[560]
We're going to have to swap this inequality sign as well,
[562]
and this positive 2.13 will become a negative 2.13.
[568]
And we're almost there.
[568]
We just want to solve for mu.
[571]
Have this inequality expressed in terms of mu.
[573]
So what we can do is now just add 17.17 to all three sides
[577]
of this inequality, and we are left with 2.13 plus 17.17 is
[584]
greater than mu minus 17.17 plus 17.17 is just going to be
[589]
mu, which is greater than-- so this is greater than mu, which
[594]
is greater than negative 2.13 plus 17.17.
[600]
Or, a more natural way to write it: since we actually have a
[603]
bunch of greater than signs, the largest number is on the
[605]
left and the smallest number is on the right, which is
[608]
flipped from the usual order where the smallest number
[612]
comes first. So you can just re-write this
[614]
inequality the other way.
[615]
So now we can write-- actually let's just figure out what
[619]
these values are.
[620]
So we have 2.13 plus 17.17.
[627]
So that is the high end of our range.
[632]
So that is 19.3.
[634]
So this value right over here, so this is 19-- let me do it
[640]
in that same color-- this value right here is 19.3 is
[646]
going to be greater than mu, which is going to be greater
[651]
than-- and this is negative 2.13 plus 17.17.
[654]
Or we could have 17.17 minus 2.13, which gives us 15.04.
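
The algebra of the last few steps can be collected in one place. This is a summary of the manipulations described above, with values rounded as in the video:

```latex
\begin{aligned}
-2.262 &< \frac{17.17 - \mu}{2.98/\sqrt{10}} < 2.262 \\
-2.13 &< 17.17 - \mu < 2.13 && \text{(multiply by } 2.98/\sqrt{10} \approx 0.942\text{)} \\
2.13 &> \mu - 17.17 > -2.13 && \text{(multiply by } -1\text{, flipping both signs)} \\
19.30 &> \mu > 15.04 && \text{(add } 17.17\text{)} \\
15.04 &< \mu < 19.30 && \text{(rewrite from smallest to largest)}
\end{aligned}
```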
[668]
And remember, the whole thing, all of this, we started with,
[671]
there was a 95% chance that a random T-statistic will fall
[676]
in this interval.
[678]
We had a random T-statistic, and all we did
[681]
is a bunch of math.
[682]
So there's a 95% chance that each of these equivalent inequalities is true.
[686]
So there's a 95% chance that this is true.
[689]
There's a 95% chance that the true population mean, which is
[693]
the same thing as the mean of the sampling distribution of
[696]
the sample mean, falls in this interval. Or, put more carefully,
[700]
we are 95% confident that it falls between 15.04 and 19.3,
[703]
which is our 95% confidence interval.
[704]
And we're done.
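
The whole calculation can be condensed into a few lines. This is a minimal sketch that assumes the critical value 2.262 is read from a T-table, as in the video; the function name `t_confidence_interval` and its parameters are hypothetical, not from the source.

```python
import math

def t_confidence_interval(sample_mean: float, sample_sd: float,
                          n: int, t_crit: float) -> tuple[float, float]:
    """Return (low, high) for the mean, given a table-supplied critical t-value."""
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    margin = t_crit * standard_error           # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# Summary statistics quoted in the video: mean 17.17, s = 2.98, n = 10,
# and t = 2.262 for a two-sided 95% interval with 9 degrees of freedom.
low, high = t_confidence_interval(17.17, 2.98, 10, 2.262)
print(f"{low:.2f} < mu < {high:.2f}")  # 15.04 < mu < 19.30
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; a different confidence level or sample size only needs a different table entry.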