What is a p-value? (Updated and extended version) - YouTube
[2]
In this video I'm going to look at the
question: what is a p-value?
[5]
I'm going to do one simple example
of finding the p-value,
[8]
but this video is mainly about the
concept of the p-value.
[14]
The p-value is a measure of the strength
of the evidence against the null hypothesis
[19]
that is provided by our sample data.
[24]
The p-value is the probability of
getting the observed value of the test statistic,
[29]
or a value with even greater evidence
against the null hypothesis,
[34]
if the null hypothesis is in fact true.
[37]
So first of all the p-value is a probability
[40]
and it is a probability calculated conditional
on the null hypothesis being true.
[49]
The definition is a bit of a mouthful, so
let's look at an example.
[55]
Suppose we wish to carry out a test of the
null hypothesis
[58]
that mu, the population mean, is equal to
some hypothesized value.
[62]
And suppose that we are sampling from a
normally distributed population,
[66]
where sigma is known.
[68]
If that's the case, the appropriate test
statistic is this Z test statistic.
[74]
If this null hypothesis is in fact true,
[77]
then mu is equal to mu_0,
[80]
and this statistic will have the standard normal distribution.
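The slide with the formula is not part of this transcript. As a reconstruction in the usual textbook notation (the symbols \bar{X} for the sample mean and n for the sample size are assumptions about what the slide shows), the Z test statistic and its distribution under the null hypothesis would be:

```latex
Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}},
\qquad
Z \sim N(0, 1) \quad \text{when } \mu = \mu_0 .
```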
[84]
And so over here I have the distribution
of that Z test statistic, if the null hypothesis is true,
[92]
and that is the standard normal distribution.
[96]
Suppose for the sake of illustration
[99]
that we get a sample and we find that
the value of the test statistic is 2.05.
[106]
2.05 is right about here on the curve.
[111]
In this case our alternative hypothesis
[113]
is that the population mean is greater
than the hypothesized value,
[118]
and so large values of this test statistic
[121]
are going to give us evidence against
the null hypothesis.
[125]
The farther out in the right tail that
this test statistic is,
[129]
the greater the evidence against the
null hypothesis.
[136]
And recall that the p-value
[138]
is the probability of getting the observed
value of the test statistic,
[141]
or something with even greater evidence
against the null hypothesis,
[145]
if the null hypothesis is true.
[149]
So in this case, that's going to be the
probability, under the null hypothesis,
[153]
of getting the observed value of the test statistic
[157]
or something even farther out in the
right tail.
[161]
Or in other words, the area to the right of the observed test statistic.
[167]
That is going to be the p-value here.
[172]
And if we went to software or a standard
normal table here
[175]
we'd see that this is approximately 0.020.
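As a rough check of that number, here is a minimal sketch in Python using only the standard library. The function name `upper_tail_p_value` is made up for this illustration; it returns the area to the right of an observed Z statistic under the standard normal curve.

```python
import math


def upper_tail_p_value(z: float) -> float:
    """Area to the right of z under the standard normal curve."""
    # 1 - Phi(z) can be written as 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


p_value = upper_tail_p_value(2.05)
print(f"{p_value:.4f}")  # about 0.0202
```

This only covers the right-tailed alternative used in the example; a left-tailed or two-sided alternative would use a different tail area.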
[183]
In this particular setting, the farther
out in the right tail
[186]
the observed value of the test statistic is,
[189]
the smaller the p-value, and the greater the evidence
against the null hypothesis.
[198]
And this is true in the more general setting.
[202]
The smaller the p-value, the greater the
evidence against the null hypothesis.
[210]
If we have a given significance level alpha,
[213]
then we reject the null hypothesis
[216]
if the p-value is less than or equal to
the significance level alpha.
[222]
Or we could say that the evidence against
the null hypothesis is significant
[228]
at the alpha level of significance.
[231]
And so we could consider alpha to be a cut-off level for significance.
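The decision rule just described can be sketched in a couple of lines. The function name `reject_null` and the alpha values 0.05 and 0.01 are illustrative assumptions, not values given in the video; the p-value 0.020 is the one from the example above.

```python
def reject_null(p_value: float, alpha: float) -> bool:
    """Reject the null hypothesis when the p-value is at or below alpha."""
    return p_value <= alpha


# Assumed significance levels, chosen only for illustration.
print(reject_null(0.020, alpha=0.05))  # True
print(reject_null(0.020, alpha=0.01))  # False
```

The same p-value leads to different decisions at different cut-offs, which is why alpha has to be chosen before looking at the data.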
[240]
In the real world, we're
not always going to have an alpha level given to us,
[243]
and then the situation is not quite so simple.
[248]
If we don't have a given significance level, then
the situation is not as cut and dried.
[255]
But it might help us come up with
a reasonable conclusion
[258]
if we understand the distribution of the p-value.
[264]
For continuous test statistics, under the assumptions of the model,
[268]
if the null hypothesis is true the p-value
[272]
will have a uniform distribution between
0 and 1.
[278]
And a little loosely speaking, any value
[280]
between 0 and 1 is equally likely to occur,
[285]
if the null hypothesis is true.
[287]
And so if the null hypothesis is true,
[290]
on average we're going to get a p-value of 0.5.
[296]
But it also might help with our
interpretation if we know something
[300]
about the distribution of the p-value when the null hypothesis is false.
[306]
So let's simulate a million samples
[309]
to investigate the distribution of the
p-value in different scenarios.
[315]
Here I've decided to test the null hypothesis that mu=0,
[320]
against a two-sided alternative.
[322]
The samples have 20 observations in them, and sigma is equal to 5.
[328]
And we are sampling from a normally distributed population,
[331]
so we'll be using a Z test.
[333]
And on this slide, there are three scenarios illustrated.
[337]
In this first one, mu=0,
[340]
so the true value of the population mean
is equal to the hypothesized mean,
[345]
or in other words, the null hypothesis is true.
[350]
Here's a histogram of the million p-values,
[353]
corresponding to those million different samples that I've simulated.
[356]
And as we can see, that p-value seems to
have that uniform distribution.
[362]
The theoretical average value for our
p-value when the null hypothesis is true
[366]
is 0.5, and here in the simulation,
we also get a value of 0.50 to 2 decimal places.
[377]
Down here I'm going to look at two
situations where the null hypothesis is false.
[382]
In this first one, the true value of mu is 1,
[386]
and we're still hypothesizing
that it's zero,
[388]
so the null hypothesis is false.
[391]
I've simulated a million samples where
mu is actually 1,
[395]
found the p-value for each one, and
plotted them in this histogram.
[402]
And here the distribution is not uniform anymore,
[405]
the distribution has moved toward 0.
[409]
And in this situation, for those million p-values,
[412]
we find an average p-value of 0.39.
[419]
In this other situation, the null hypothesis is still false,
[424]
but mu is even farther from the hypothesized value.
[429]
And this histogram of p-values is
shifted even farther over toward 0,
[435]
and the average p-value we got here is 0.18.
[442]
And so we can see here that when the
null hypothesis is true,
[446]
the p-value has a uniform distribution
between 0 and 1,
[450]
and when the null hypothesis is false,
the distribution of the p-value moves more toward zero.
[455]
So we're going to be more likely to get p-values near 0
[460]
when the null hypothesis is false
[463]
than when the null hypothesis is true.
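A simulation along these lines can be sketched with the Python standard library. This is not the code behind the slides, just an approximation under stated assumptions: it uses 20,000 samples per scenario instead of a million so that it runs quickly, the helper names are made up, and the true mean of 2.0 in the third scenario is an assumed value, since the transcript only says it is farther from 0 than 1 is.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def two_sided_p_value(sample, mu_0, sigma):
    """Two-sided Z-test p-value for H0: mu = mu_0, with sigma known."""
    n = len(sample)
    z = (fmean(sample) - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def simulate_p_values(true_mu, n=20, sigma=5.0, mu_0=0.0, reps=20_000, seed=1):
    """Draw `reps` normal samples of size n and return one p-value per sample."""
    rng = random.Random(seed)
    return [
        two_sided_p_value([rng.gauss(true_mu, sigma) for _ in range(n)], mu_0, sigma)
        for _ in range(reps)
    ]


# 0.0 and 1.0 come from the video; 2.0 is an assumed third scenario.
for true_mu in (0.0, 1.0, 2.0):
    p_values = simulate_p_values(true_mu)
    share_small = sum(p <= 0.05 for p in p_values) / len(p_values)
    print(f"mu={true_mu}: mean p-value {fmean(p_values):.2f}, "
          f"share at or below 0.05: {share_small:.2f}")
```

With these settings the mean p-values for the first two scenarios should land close to the 0.50 and 0.39 quoted above, and the share of small p-values should grow as the true mean moves away from 0.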
[470]
We saw here that the distribution of the
p-value
[473]
depends on what the true value of the population mean is.
[477]
It's also going to depend on the sample size, and the standard deviation in this case.
[486]
To illustrate that,
[487]
let's see what happens when we increase
the sample size to 50.
[492]
The only thing that's been changed in
this simulation
[496]
is that the sample size has been
increased to 50.
[499]
We can see here when the null hypothesis
is true
[502]
that that distribution of the p-value is
still uniform between 0 and 1,
[509]
and when the null hypothesis is false,
[512]
the distribution of the p-value still, again, moves toward zero.
[518]
But because of the increased sample size,
[521]
down here when the null hypothesis is false,
[524]
the p-value distribution has shifted
even farther towards 0.
[529]
Over here the average p-value is now 0.27,
[534]
and over here the average p-value is now 0.04.
[538]
When we have a greater sample size,
[540]
our tests are going to have greater power,
[543]
and the distribution of the p-value
is going to be shifted more toward 0.
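To put a number on "greater power", here is a small sketch that computes the power of the two-sided Z test directly for the two sample sizes. The significance level alpha = 0.05 is an assumption made for illustration (no alpha is fixed in this part of the video), and `z_test_power` is a made-up helper name.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def z_test_power(true_mu, n, sigma=5.0, mu_0=0.0, alpha=0.05):
    """Probability that a two-sided Z test rejects H0: mu = mu_0."""
    # When the true mean is true_mu, Z is normal with this mean and sd 1.
    shift = (true_mu - mu_0) / (sigma / math.sqrt(n))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Reject when Z > z_crit or Z < -z_crit.
    return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)


for n in (20, 50):
    print(f"n={n}: power when mu=1 is {z_test_power(1.0, n):.2f}")
```

The shift in the test statistic grows with the square root of n, so the same true mean produces a larger shift, and therefore smaller p-values, at n=50 than at n=20.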
[550]
The overall lesson I'm trying to get at here
[552]
is that we're more likely to get p-values
close to 0
[555]
when the null hypothesis is wrong than
when the null hypothesis is right.
[560]
And so the smaller the p-value, the
greater the evidence against the null hypothesis.
[571]
I'm going to give a very rough guideline here.
[574]
It has to be a very rough guideline
[576]
because what we feel is strong evidence
against the null hypothesis
[580]
depends on the situation at hand as well
as the p-value.
[585]
But as a very rough guideline,
if the p-value is less than 0.01
[590]
we can say there is very strong evidence
against the null hypothesis.
[594]
If the p-value lies between 0.01 and 0.05,
[598]
well, there's starting to be strong
evidence against the null hypothesis.
[604]
If the p-value is between 0.05 and 0.10,
[607]
there's some weak evidence against the
null hypothesis.
[611]
And if the p-value is greater than 0.10 we
say there's
[614]
little or no evidence against the null
hypothesis.
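That rough guideline can be written down as a small lookup. The function name `evidence_against_null` is made up, and how to treat a p-value that falls exactly on 0.01, 0.05 or 0.10 is an assumption here, since the guideline does not say.

```python
def evidence_against_null(p_value: float) -> str:
    """Map a p-value to the rough evidence labels from the guideline."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Boundary handling (which side 0.01, 0.05 and 0.10 fall on) is assumed.
    if p_value < 0.01:
        return "very strong evidence"
    if p_value < 0.05:
        return "strong evidence"
    if p_value < 0.10:
        return "weak evidence"
    return "little or no evidence"


print(evidence_against_null(0.020))  # strong evidence
```

Hard cut-offs like these are only a starting point; what counts as strong evidence still depends on the situation at hand.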
[618]
This does depend on the setting.
If the p-value is close to 0.10,
[622]
in some situations we may feel that that
is a hint of evidence against the null hypothesis.
[626]
For example, if our p-value is 0.11 or 0.13 or something like that,
[631]
in some situations we might view that as a
hint of evidence against the null hypothesis.
[636]
But for most practical cases once the
p-value starts getting up into the 0.2
[639]
and 0.3 range and greater,
[641]
we say that there is no evidence against
the null hypothesis.