How to Calculate and Interpret a Correlation (Pearson's r) - YouTube
Channel: unknown
[0]
okay now let's take a look at
calculating a correlation coefficient
[4]
suppose we have the following values
with scores for five people on the
[9]
variables x and y and it's a small
example because we're going to calculate
[13]
this by hand so I want to keep it
relatively simple here we have for
[17]
example the first person has a score of
1 on X and a score of 6 on Y the second
[23]
person 2 on X 4 on Y and so on so we
have 5 different people here and the 5
[30]
people produced a score on both x and y
so let's go ahead and calculate the
[36]
correlation or Pearson's R on this
example the first thing we want to do
[41]
though is state our null and alternative
hypotheses so the null hypothesis states
[46]
this is called Rho here it's the
population correlation so Rho XY equals
[52]
0 or in other words the correlation
between x and y in the population equals
[58]
0
and that means that there is no
[61]
relationship between x and y in the
population and once again this means rho
[67]
and the alternative hypothesis is really
just the opposite of that notice
[70]
everything's the same except equals for
null not equals for alternative so the
[76]
alternative hypothesis is that the
population correlation between x and y
[80]
is not equal to zero as you can see here
and this is really saying that there is
[87]
a relationship between x and y in the
population it could be positive
[90]
relationship or a negative relationship
or in other words we're conducting a
[95]
two-tailed test here in terms of the
calculations first we'll begin with
[100]
calculating the mean of X and the mean
of Y so to find the mean of X and Y
[105]
starting with X we're just going to add
all of these values together and we're
[111]
going to divide by the number of values
that there are so 1 plus 2 plus 3 plus
[115]
4 plus 5 divided by 5 total gives us a
mean of 3 for X and then we'll do the
[122]
same for y so we add all these values
and divide by 5 and that gives us a mean
[127]
of 4 okay next we need to calculate what
are known as deviation scores for each
[134]
variable deviation scores subtract the
mean from each value of the variable they're called
[140]
deviation scores because they indicate
how far each value deviates or departs
[146]
from the mean so let's take a look at
that now you know there's a lot of
[152]
information here on the screen but let
me walk you through it it's really not
[156]
that bad
recall that the mean of X was 3 and the
[160]
mean of Y was 4 so here are my X values
and all I'm doing here notice the 3 for
[166]
the mean is taking the x value and
subtracting the mean from it as I
[171]
mentioned a few moments ago so here
we're just going to go 1 minus 3 so you
[176]
see that here 1 minus 3 is equal to
negative 2 then we take 2 minus 3 our
[183]
mean again equal to negative 1 3 minus 3
0 4 minus the mean of 3 gives us 1 and 5
[192]
minus the mean of 3 gives us 2 an
interesting thing here if your
[196]
calculations are correct
these deviation scores should all add up
[201]
to zero
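As a rough check of this step, here is a minimal Python sketch that computes the mean of X and the deviation scores for the five X values in the example. The variable names (x, mean_x, dev_x) are illustrative and not from the video.

```python
# X scores for the five people in the example
x = [1, 2, 3, 4, 5]

# Mean of X: add the values and divide by how many there are
mean_x = sum(x) / len(x)

# Deviation score: each value minus the mean
dev_x = [xi - mean_x for xi in x]

print(mean_x)      # 3.0
print(dev_x)       # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(sum(dev_x))  # 0.0
```

With other data the sum may come out as a tiny nonzero number because of floating-point rounding, so in practice it is safer to check that it is very close to zero rather than exactly zero.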
so notice here we have negative 2
[204]
negative 1 that's negative 3 0 we can
ignore that then we have positive 3 so
[210]
negative 3 plus positive 3 is 0 notice
how these all add up to 0 and that's by
[216]
definition because the mean of 3 here is
a balance point in the distribution and
[222]
it always balances out the deviation
scores below the mean with the deviation
[228]
scores above the mean so this should
equal 0 when you add these up if it does
[233]
not that means that an error was made in
the calculations okay so now let's find
[238]
the deviation scores for y so once again
we just take the value for Y and
[243]
subtract the mean which is 4 in this
case from each value so 6 minus 4 is 2 4
[251]
minus 4 is 0 5 minus 4 is 1 3 minus 4 is
negative 1 and 2 minus 4 is negative 2
[259]
now let's take a look at this we have
positive 3 negative 3 notice how those
[265]
add up to 0 again so that looks good ok
so that's it for the deviation scores so
[272]
next what we need to do is we need to
square each of these values we need to
[276]
square the deviation scores and the
reason for that is it gets rid of the
[279]
negative numbers remember how when I
showed you that the deviation scores
[283]
always sum or add up to 0 we can't
really do much with it if our answer is
[288]
0 so when we square them that gets rid
of the negative values and then later
[292]
we'll take care of that square by taking
the square root at the end I'll show you
[297]
that later but for now step 3 is
squaring the deviation scores so these
[304]
were our deviation scores from earlier
that we calculated so all we're doing
[308]
now for X is squaring each one and that
gives us these values and for y we're
[314]
squaring each of those deviation scores
and we get these values here okay so
[318]
that's done we squared all those now all
we want to do is add up those squared
[324]
values now technically speaking we do
call this the sum
[328]
of the squared deviation scores and you
may see this in your text or in other
[333]
places usually abbreviated as SS or
sum of the squares as it's called in
[340]
shorthand okay so sum of the squares or
SS is equal to adding up all of the
[346]
squared deviation scores so that's what
we'll do here so we have our squared
[352]
deviation scores here for X and for y so
all we do now is just add those up and
[358]
SS or sum of the squares for X when we
add these values together gives us 10
[364]
and SS for Y also gives us 10 now it's
not always the case when you calculate a
[371]
correlation that SS X and SS Y will be
equal to each other that just happened
[376]
to occur in this example so don't expect
that you have to see these as equal they
[381]
absolutely do not have to be equal
whatsoever but they are on occasion okay
[386]
so finally what we need to do is find
what are called the cross products now
[391]
cross products just multiply that's what the word
product means right we multiply the two deviation
[398]
scores together so let's take a look at
this next now when you first see this it
[404]
may look a little intimidating but let
me walk you through it it's really not
[407]
that bad because we've done most of this
already
[410]
recall we found the deviation score for
X right these values here and we found
[416]
the deviation scores for y we've already
done that now all we do to find the sum
[421]
of the products is we multiply the
deviation score for X by the deviation
[428]
score for y for a given person so here
we have negative two times positive 2
[434]
you see that here this product is
negative four multiplying the deviation
[439]
scores for the second person negative
one times zero you see that here that's
[443]
zero third person 0 times 1 is 0 the
fourth person one times negative one is
[450]
negative one and then finally for the
last person two times negative two is
[455]
negative four
now those are the cross products and
[459]
what we have to do is find the sum of
them so we have to add them up okay and
[465]
that's what we do right here we have
negative 4 0 0 negative 1 negative 4 so
[470]
that's what you see here all added
together and that gives us the sum of
[473]
the products of negative 9 a very quick
review we found the mean then we found
[480]
the deviation scores these here and then
we squared them added them together and
[486]
then our last step here we found the
cross products and then we summed those up
[492]
so we've found the sum of the products
or SP so we have everything we need to
[497]
calculate a correlation
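The steps reviewed so far can be sketched in Python as follows. This is only an illustration using the five pairs of scores from the example; the names ss_x, ss_y, and sp are shorthand chosen here for the sum of squares of X, the sum of squares of Y, and the sum of the products.

```python
# Scores for the five people in the example
x = [1, 2, 3, 4, 5]
y = [6, 4, 5, 3, 2]

# Step 1: means
mean_x = sum(x) / len(x)  # 3.0
mean_y = sum(y) / len(y)  # 4.0

# Step 2: deviation scores (value minus mean)
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]

# Steps 3 and 4: square the deviation scores and add them up (SS)
ss_x = sum(d * d for d in dev_x)
ss_y = sum(d * d for d in dev_y)

# Step 5: cross products (X deviation times Y deviation, person by person),
# then their sum (SP)
cross_products = [dx * dy for dx, dy in zip(dev_x, dev_y)]
sp = sum(cross_products)

print(ss_x, ss_y)  # 10.0 10.0
print(sp)          # -9.0
```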
so here's our formula for the
[501]
correlation coefficient or Pearson's r
sum of the products divided by square
[505]
root of SS x times square root of SS y
and we found all of these values so
[511]
we're ready to go recall that SS X and SS Y
were both 10 in this example and the SP
[517]
was negative 9 so we'll just go ahead
and plug these values in we have
[521]
negative 9 over square root of 10 times
square root of 10 and that gives us when
[528]
we work it out an R of negative 0.9 okay
so once again our value of Pearson's R
[535]
is negative 0.9 we can stop right here
and just report that our R was negative
[541]
0.9 and we can be done however if we
want to know whether this value is
[546]
statistically significant that is
whether it's significantly different
[550]
from zero then we need to conduct a
hypothesis test we'll do that next
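For readers who want to check the hand calculation, here is a small Python sketch of the formula described above, sum of the products divided by the square root of SS X times the square root of SS Y. The function name pearson_r and its error handling are assumptions made for this illustration, not something shown in the video.

```python
import math


def pearson_r(x, y):
    """Pearson's r computed as SP / (sqrt(SS_x) * sqrt(SS_y))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Deviation scores for each variable
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    # Sums of squares and sum of the cross products
    ss_x = sum(d * d for d in dev_x)
    ss_y = sum(d * d for d in dev_y)
    sp = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # r is undefined if either variable never varies
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")

    return sp / (math.sqrt(ss_x) * math.sqrt(ss_y))


r = pearson_r([1, 2, 3, 4, 5], [6, 4, 5, 3, 2])
print(round(r, 4))  # -0.9
```

The result is rounded before printing because taking two square roots and multiplying them back together can leave a very small floating-point error in the last digits.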