Correlations and Covariance in R with Example | R Tutorial 4.12 | MarinStatsLectures - YouTube
Channel: MarinStatsLectures-R Programming & Statistics
[0]
hi I'm Mike Marin and in this video
[2]
we'll talk about calculating correlation
[5]
and covariance using R Pearson's
[8]
correlation is a parametric measure of
[10]
the linear association between two
[13]
numeric variables Spearman's rank
[16]
correlation is a nonparametric measure
[19]
of the monotonic association between two
[22]
numeric variables Kendall's rank
[24]
correlation is another nonparametric
[27]
measure of Association based on
[30]
concordance or discordance of XY pairs
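The difference between the three measures can be sketched on a small toy example. The vectors `x` and `y` below are made up for illustration and are not part of the lung capacity data; `y` increases with `x`, but not along a straight line.

```r
# Toy data (hypothetical): a monotonic but nonlinear relationship
x <- 1:10
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # linear association
spearman <- cor(x, y, method = "spearman")  # monotonic association, rank-based
kendall  <- cor(x, y, method = "kendall")   # concordant vs discordant (x, y) pairs

# Spearman's correlation is Pearson's correlation computed on the ranks
spearman_by_hand <- cor(rank(x), rank(y))

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Because every pair of points is concordant here, both rank-based measures reach their maximum, while Pearson's correlation stays below it since the relationship is not linear.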
[33]
we will be working with the lung
[35]
capacity data that was introduced
[37]
earlier in this series of videos I've
[40]
already gone ahead and imported the data
[41]
into R and attached it we will explore
[45]
the relationship between age and lung
[47]
capacity we will use the cor, cov and cor.test
[53]
commands to access the help menus
[56]
in R you can type help and in brackets
[59]
the name of the command you'd like help
[60]
for or simply place a question mark in
[63]
front of the command name the first
[66]
thing that we'll want to do is produce a
[68]
scatter plot of lung capacity versus age
[71]
we can use the plot command and on the
[73]
x-axis place age on the y-axis lung
[77]
capacity we can add a title as well as
[82]
rotate the values on the y-axis you can
[86]
see my series two video on making
[88]
scatter plots or modifying plots in R to
[92]
learn more about how to change the look
[93]
of this plot we can see there is a
[95]
positive association between lung
[97]
capacity and age we can calculate the
[101]
correlation using the cor command here
[104]
we'll calculate the correlation between
[105]
age and lung capacity we can set the
[109]
method argument equal to Pearson to have
[112]
Pearson's correlation returned you can
[116]
also note that Pearson is the default so
[119]
if you leave the method argument out of
[122]
this command by default Pearson's
[124]
correlation will be returned you can
[127]
also note that the order that these
[129]
variables are entered in does not make a
[131]
difference the correlation between age and
[133]
lung capacity is the same as the
[135]
correlation between lung capacity and
[137]
age if you'd like to calculate
[139]
Spearman's correlation we can set the
[142]
method argument equal to Spearman and
[145]
similarly if you'd like to calculate
[148]
Kendall's rank correlation we can set
[150]
the method argument equal to Kendall if
[154]
we'd like we can have a confidence
[156]
interval returned for the correlation as
[158]
well as test the hypothesis that the
[160]
correlation is equal to zero using the
[163]
cor.test command here we'll do this for
[166]
Pearson's correlation and we use the cor.test
[168]
command we can see we're returned the
[172]
estimate of the correlation we can also
[175]
see a 95% confidence interval for the
[178]
correlation as well as the test
[180]
statistic and p-value for the test that
[183]
the correlation equals zero again we can
[187]
do the same for Spearman's correlation
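The steps described so far can be sketched as follows. The lung capacity data itself is not reproduced in this transcript, so the block simulates a stand-in; the names `Age` and `LungCap` and all the numbers are assumptions made only so that the code runs on its own.

```r
# Simulated stand-in for the lung capacity data (NOT the real data set);
# the variable names Age and LungCap are assumptions
set.seed(1)
n <- 200
Age <- sample(3:19, n, replace = TRUE)          # whole years, so ties occur
LungCap <- 1 + 0.5 * Age + rnorm(n, sd = 1.2)

# Scatter plot with Age on the x-axis and LungCap on the y-axis,
# a title, and rotated y-axis values (las = 1)
plot(Age, LungCap, main = "Scatterplot", las = 1)

# Pearson is the default method, and the order of the variables does not matter
r_default  <- cor(Age, LungCap)
r_pearson  <- cor(Age, LungCap, method = "pearson")
r_swapped  <- cor(LungCap, Age)
r_spearman <- cor(Age, LungCap, method = "spearman")
r_kendall  <- cor(Age, LungCap, method = "kendall")

# cor.test returns the estimate, a confidence interval (Pearson only),
# the test statistic and the p-value for the test that the correlation is zero
pearson_test <- cor.test(Age, LungCap, method = "pearson")
print(pearson_test$estimate)
print(pearson_test$conf.int)
print(pearson_test$p.value)

# The same test for Spearman's correlation; exact = FALSE asks for an
# approximate p-value, which avoids the warning about ties
spearman_test <- cor.test(Age, LungCap, method = "spearman", exact = FALSE)
print(spearman_test$p.value)
```

With the real data attached, the same calls would be made on the actual variables instead of the simulated ones.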
[192]
we can see that R returns a hypothesis
[195]
test although it does not return any
[197]
form of nonparametric confidence
[199]
interval for the correlation we can also
[202]
see we're given a warning message that
[204]
R cannot compute an exact p-value when
[207]
there are ties meaning we have a few
[209]
people of the exact same age in this
[211]
data set this isn't really a big deal
[213]
but if we like we can use the exact
[216]
argument and set this equal to false
[218]
letting R know to only approximate a
[221]
p-value for us now let's go back again
[224]
to the Pearson's correlation test we did
[226]
as we've seen with other tests in R we
[230]
can use the alt argument to change the
[232]
alternative hypothesis here we can set
[234]
it to greater to have an alternative
[237]
hypothesis that the correlation is
[238]
greater than zero by default the
[241]
alternative will be a two-sided test we
[244]
can also use the conf.level argument
[246]
to change the confidence level we're
[248]
using for example we may want a 99%
[250]
confidence interval returned while
[254]
covariance is often of less interest in
[257]
applied statistics we can calculate this
[259]
using the cov command here we'd like the
[262]
covariance between age and lung capacity
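A minimal sketch of the alternative hypothesis, confidence level and covariance options just described, again on simulated stand-in data (the names `Age` and `LungCap` and the numbers are assumptions, not the real data set). The `alternative` argument is written out in full here.

```r
# Simulated stand-in for the lung capacity data (NOT the real data set)
set.seed(1)
n <- 200
Age <- sample(3:19, n, replace = TRUE)
LungCap <- 1 + 0.5 * Age + rnorm(n, sd = 1.2)

# One-sided alternative: correlation greater than zero
# (the default alternative is "two.sided")
one_sided <- cor.test(Age, LungCap, method = "pearson",
                      alternative = "greater")

# 99% confidence interval instead of the default 95%
ci95 <- cor.test(Age, LungCap, method = "pearson")
ci99 <- cor.test(Age, LungCap, method = "pearson", conf.level = 0.99)
print(ci95$conf.int)
print(ci99$conf.int)

# Covariance between Age and LungCap
cv <- cov(Age, LungCap)
print(cv)

# Pearson's correlation is the covariance scaled by both standard deviations
r_from_cov <- cv / (sd(Age) * sd(LungCap))
```

The last line shows why covariance is often of less direct interest: its size depends on the units of both variables, while the correlation rescales it to lie between -1 and 1.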
[267]
we can produce all possible pairwise
[269]
plots using the pairs command here if we
[272]
ask for a pairs plot of the lungcapdata
[275]
this will produce all possible pairwise
[277]
plots you'll notice that a scatter plot
[282]
is not really appropriate for
[284]
categorical variables you can also
[286]
notice that the first three variables in
[288]
our data set are the numeric ones so
[291]
let's produce a pairs plot only for
[293]
those variables here we will produce a
[295]
pairs plot for the lungcapdata
[297]
subsetting taking only columns 1 2 3 to
[302]
learn more about subsetting using square
[304]
brackets in R check out my video in
[307]
series one on subsetting using square
[309]
brackets taking a look at the plots we
[312]
can see this plot here is a scatter plot
[315]
with lung capacity on the x-axis and age
[318]
on the y-axis this plot here is a
[322]
scatter plot with age on the x-axis and
[324]
height on the y-axis the cor command
[328]
can also be used to produce a
[330]
correlation matrix for all of the
[332]
variables here we can try and calculate
[334]
a correlation matrix for the lungcapdata
[337]
if we enter this command you'll notice
[340]
R returns an error this is because R
[342]
will not calculate a correlation for
[344]
categorical variables or factors like we
[347]
just did with the pairs plot we can
[349]
subset the data and make a correlation
[352]
matrix for only variables in columns 1 2
[355]
3 or our numeric variables we can see
[358]
Pearson's correlation between age and
[361]
lung capacity here we can see Pearson's
[364]
correlation between height and age as
[367]
before we can also ask for Spearman's
[370]
correlation using the method argument
[373]
and setting this equal to Spearman
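The pairs plot and correlation matrix steps above can be sketched like this. The data frame `lung` is a simulated stand-in with three numeric columns followed by one factor; its name, column names and values are assumptions and not the real data set.

```r
# Simulated stand-in data frame (NOT the real data set): three numeric
# columns first, then one categorical column
set.seed(1)
n <- 200
Age <- sample(3:19, n, replace = TRUE)
lung <- data.frame(
  LungCap = 1 + 0.5 * Age + rnorm(n, sd = 1.2),
  Age     = Age,
  Height  = 45 + 1.5 * Age + rnorm(n, sd = 3),
  Smoke   = factor(sample(c("no", "yes"), n, replace = TRUE))
)

# All pairwise scatter plots, numeric columns only
pairs(lung[, 1:3])

# cor() on the whole data frame stops with an error because Smoke is a factor;
# tryCatch is used here only so the script keeps running
whole <- tryCatch(cor(lung), error = function(e) "error")
print(whole)

# Correlation matrices for columns 1 to 3 only
cor_pearson  <- cor(lung[, 1:3])
cor_spearman <- cor(lung[, 1:3], method = "spearman")
print(round(cor_pearson, 2))
print(round(cor_spearman, 2))
```

Each matrix is symmetric with ones on the diagonal, so the correlation for any pair of variables can be read from either side of the diagonal.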
[376]
if desired we can also go ahead and
[379]
produce the covariance matrix using the
[382]
cov command in the next video in this
[386]
series we'll talk about fitting a simple
[388]
linear regression using R thanks for
[391]
watching this video and make sure to
[393]
check out my other instructional videos