ANOVA Part III: F Statistic and P Value | Statistics Tutorial #27 | MarinStatsLectures - YouTube

Channel: MarinStatsLectures-R Programming & Statistics

[0]
Let's build up the test statistic for one-way analysis of variance. Recall
[5]
we were working with this example comparing weight loss on one of four
[9]
diets: A, B, C, or D. We can see the observations here as well as the summary
[15]
statistics, the mean weight loss and standard deviation of weight loss for
[18]
each of the four diets. We're working with the null hypothesis that all means are
[23]
equal, and the alternative that at least one differs from the rest. We previously
[28]
talked about how we can take the total variability in weight loss, or the total
[32]
sum of squares, and separate it into two parts: that which is explained by the
[36]
diet and that which is not explained by the diet. So let's look at how we can use
[42]
that to build up the test statistic. First, a quick note on notation; again, we
[45]
want to focus on the concepts, not on plugging into formulas, but this
[51]
helps us understand the formula and what's written in the notation. i is
[57]
used to index the group: group one, two, three, or four; k signifies the
[65]
number of groups; j is used to represent the observation number within a group;
[71]
Yij is the individual observation in group i, observation number j. So, for
[78]
example, Y1,3 is the observed value for group 1, person number 3
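As a small sketch of this indexing, here it is in Python (rather than the R used on this channel), with made-up weight-loss values, since the raw observations aren't listed in this transcript:

```python
# Hypothetical weight-loss data for four diets (values are made up
# for illustration; the study's actual observations aren't shown here).
# groups[i-1][j-1] corresponds to Y_ij: group i, observation j.
groups = [
    [3.8, 6.0, 0.7, 2.9],   # diet A (group i = 1)
    [2.1, 7.8, 1.5, 5.5],   # diet B (group i = 2)
    [4.3, 0.2, 3.7, 6.1],   # diet C (group i = 3)
    [5.0, 5.4, 0.9, 3.3],   # diet D (group i = 4)
]

k = len(groups)                              # number of groups
n_i = [len(g) for g in groups]               # sample size in group i
ybar_i = [sum(g) / len(g) for g in groups]   # group means, Yi-bar
all_obs = [y for g in groups for y in g]
ybar = sum(all_obs) / len(all_obs)           # overall (grand) mean, Y-bar

print(groups[0][2])   # Y_13: group 1, observation 3
print(ybar_i, ybar)
```

Any of these quantities then plugs directly into the sums-of-squares formulas that follow.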
[86]
Yi-bar is the mean for group i, Y-bar with no subscript is the overall or grand
[94]
mean: the mean weight loss for everyone in the study; Si is the standard
[99]
deviation for people in group i, and ni is the sample size for people in
[104]
group i. So we saw that we can take the total variability and separate it into two
[112]
parts: that which is explained by diet, which we signified as the variance
[119]
between diets, or sometimes called the mean square between; that was the sum
[126]
of squares between divided by its degrees of freedom,
[133]
the degrees of freedom between groups. Looking at the formula, and again we don't want to
[138]
get stuck on this but this helps us see the concepts we just learned from a
[141]
slightly different angle. You can think of this as: we're going to sum over all the
[146]
groups, group one, two, three, four, taking the sample size in each group times
[152]
how far the group-specific mean is from the overall mean, squared, divided by its
[158]
degrees of freedom. If we work that out for this example, we'd find that the sum of
[165]
squares between groups, or the explained sum of squares, is 97.3,
[169]
with degrees of freedom 3, right, four groups minus one, and that's going
[174]
to come out to 32.4. We also saw we can think of the unexplained,
[179]
and again, this is the variability that's not explained by diet, or not explained
[185]
by X; this is the variability going on within a group, or the mean square within.
[193]
Again, this is the sum of squares within groups divided by the degrees of
[200]
freedom within, and formulaically we can think of summing over all observations:
[211]
how far is each individual from their group-specific mean, squared, divided by
[219]
its degrees of freedom, n minus k. Right, again we have n observations and
[223]
we lose k degrees of freedom by estimating the k group means. We can also
[228]
express this as summing over the groups:
[235]
each group's sample size minus one, times the sample variance of
[242]
each group, divided by its degrees of freedom. The reason
[247]
why I write it this way is that you can take a moment yourself to note that this
[252]
is the exact formula for the pooled variance that we talked about in the
[256]
two-sample t-test assuming equal variance in the two groups: we're taking the sample
[263]
variance of each group, weighted by their degrees of freedom. OK, so you can take
[268]
a moment yourself to work your way through and convince yourself that this
[273]
within-group variance is exactly the same as the pooled variance in the two-sample
[278]
t-test assuming equal variance. If you work this out for this example, you're going
[283]
to find the sum of squares within groups is 297, its degrees of freedom 56,
[289]
and this comes out to be 5.3. So, as noted, we want to compare these two to each
[295]
other: the mean square between groups to the mean square within groups, or
[300]
the average sum of squares that can be explained by diet to the average sum of
[304]
squares that cannot be explained by diet. So let's try and think our way through
[308]
some of this. First, suppose the alternative hypothesis is true: at
[317]
least one mean differs, not all the means are the same. How would we expect,
[326]
statistically, the mean square between groups to compare to the
[333]
mean square within groups? If diets are different, we'd expect the between to be
[338]
larger than the within: there should be much more variability that's explained
[342]
by diet than not explained by diet. If we take a ratio of these, this is going
[349]
to be what we call our F statistic, or our test statistic:
[353]
it's the mean square between groups over the mean square within groups. If we
[359]
expect the top to be larger than the bottom, we expect this test statistic to
[363]
be larger than 1. If, on the other hand, our null hypothesis is
[370]
true, if all the means are equal at the population level, what would we expect to
[378]
see? We'd expect the mean square between, okay, the variability that's explained by
[385]
diet, to be roughly the same as the mean square within, or the variability that's
[391]
not explained by diet. When looking at an F statistic, you're taking the ratio of
[399]
these two; we'd expect that to come out to be roughly 1. If we do this for our
[408]
set of data, our F statistic is 32.4 over 5.3, and that's going to come out to be
[418]
6.1. OK, so the larger our F statistic gets, the more evidence we have that the
[425]
alternative is likely true, or the null is false. Well, we don't want to get too
[430]
caught up on looking things up in tables; it's important to note that this F
[436]
statistic follows what's called an F distribution. It has degrees of
[445]
freedom for the numerator and degrees of freedom for the denominator:
[451]
the degrees of freedom for the numerator are k minus 1, right, the degrees of
[457]
freedom of what's in the numerator, and the degrees of freedom for the
[461]
denominator are n minus k. OK, so again, a piece of software can do all these
[469]
calculations for you; we don't want to focus on an F table and looking up an
[473]
exact p-value from that table, so let's just jump to the interpretation
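As one illustration of letting software do the lookup, here's a Python sketch that approximates P(F ≥ 6.1) for an F distribution with 3 and 56 degrees of freedom by simulation; the video itself would use R or an F table, and the simulation here is only to make the tail-probability idea concrete:

```python
import random

random.seed(0)

df_between, df_within = 3, 56   # k - 1 = 3 and n - k = 56 from the example
f_observed = 6.1                # the observed F statistic

# Under the null, the F statistic follows an F(3, 56) distribution:
# a ratio of two independent chi-square variables, each divided by its
# degrees of freedom. A chi-square with d degrees of freedom can be
# drawn as Gamma(shape=d/2, scale=2).
def sample_f(d1, d2):
    num = random.gammavariate(d1 / 2, 2) / d1
    den = random.gammavariate(d2 / 2, 2) / d2
    return num / den

n_sims = 200_000
exceed = sum(sample_f(df_between, df_within) >= f_observed
             for _ in range(n_sims))
p_value = exceed / n_sims
print(p_value)   # approximates the exact tail area of about 0.0011
```

Any stats package gives the exact tail area directly; the simulation just shows what "probability of an F this large or larger under the null" means.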
[478]
If we were going to work this out, looking at a table or using a piece of
[482]
software, the p-value is going to tell us, like it always does, the
[487]
probability of our observed test statistic, or one even more extreme, if the
[492]
null is true. So: what's the probability of getting an
[498]
F stat greater than or equal to 6.1? If our null is true, we'd
[508]
expect our test statistic to be
[513]
roughly 1, so what's the chance of seeing an estimate of 6.1 or more?
[518]
You'll find that this comes out to 0.0011
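For the record, the arithmetic chain quoted above can be checked in a few lines (Python here, using the sums of squares from the example):

```python
# Sums of squares and degrees of freedom quoted in the example
ss_between, df_between = 97.3, 3    # explained by diet; k - 1 = 4 - 1
ss_within, df_within = 297.0, 56    # unexplained; n - k = 60 - 4

ms_between = ss_between / df_between  # mean square between
ms_within = ss_within / df_within     # mean square within
f_stat = ms_between / ms_within       # the F statistic

print(round(ms_between, 1), round(ms_within, 1), round(f_stat, 1))
# prints 32.4 5.3 6.1, matching the values in the video
```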
[522]
OK, roughly 0.1 percent. So again, if our null is true, if all these diets are the
[527]
same, an F stat like this, or differences like the ones we saw or even
[532]
larger, will only happen about 0.1 percent of the time. That gives us
[537]
evidence to reject our null hypothesis: we have evidence to believe the
[541]
alternative is likely true, we have evidence to believe at least one diet
[545]
differs from the rest. So now we need to decide which diets might differ from the
[550]
others, and to do that we're going to compare all possible pairwise means;
[553]
that's a topic we're going to get to in a moment
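To tie the whole decomposition together before moving on, here's a Python sketch on made-up data (the study's actual observations aren't reproduced in this transcript) showing that the within-group sum of squares equals the pooled sum of (ni − 1) times si squared, and that SS between plus SS within equals the total sum of squares:

```python
import statistics

# Hypothetical weight-loss observations for four diets (made-up values)
groups = [
    [3.8, 6.0, 0.7, 2.9],
    [2.1, 7.8, 1.5, 5.5],
    [4.3, 0.2, 3.7, 6.1],
    [5.0, 5.4, 0.9, 3.3],
]

k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(y for g in groups for y in g) / n
means = [sum(g) / len(g) for g in groups]

# SS between: sum over groups of n_i * (group mean - grand mean)^2
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

# SS within: sum over observations of (y - group mean)^2 ...
ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

# ... which equals the pooled form: sum of (n_i - 1) * s_i^2,
# where statistics.variance is the sample variance (n - 1 denominator)
pooled = sum((len(g) - 1) * statistics.variance(g) for g in groups)

# Total SS decomposes as between + within
ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(ss_between + ss_within, ss_total, abs(ss_within - pooled) < 1e-9)
```

The same decomposition, run on the study's real data, produces the 97.3, 297, and 6.1 worked through above.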
[558]
Thanks! For more videos, please subscribe to marinstatslectures