11a multicollinearity VIF R2 - YouTube

Channel: unknown

[0] Today we will talk about multicollinearity. Sometimes two variables measure a similar concept: they are correlated, but they're not exactly identical. The fact that they're correlated makes it hard for the regression to disentangle their individual effects on the response.
[24] And what happens as a consequence: this difficulty in figuring out whether it is due to that variable or this one results in the confidence intervals in the regression being very, very wide, but only for the correlated variables. So it reflects the uncertainty: we don't really know whether the contribution, the beta, is due to one variable or the other.
[50] More generally, it needn't be two variables; it could be several. One variable may be correlated with a linear combination of multiple other variables. And one way to measure the degree of collinearity is the concept of variance inflation factors (VIF).
[76] Let's talk first about exact collinearity; there is a formal definition. When you have a linear combination of the x variables that is always zero, where you can choose the t's as you want, then you have exact collinearity. The t's, of course, cannot trivially all be zero.
[102]
We have already seen an example of this
[104]
when we talked about indicator variables.
[108]
So the idea with indicator variables, you
[110]
have to take out
[112]
one of the indicators
[115]
to avoid exact collinearity.
[119]
And in the example of male and female,
[123]
we can make this fit this definition;
[126]
we're looking at the intercept male and
[128]
female
[130]
and we're choosing for female the
[133]
coefficient one
[134]
for male the coefficient one. Remember,
[137]
the intercept is one
[139]
and we choose the coefficient minus 1.
[142]
Right? So t0
[143]
is minus 1. With this
[146]
choice, we have minus 1 plus male
[149]
indicator of male and female. These
[152]
together
[152]
are always 1 and so you have minus 1
[155]
plus 1 equals 0.
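A quick numeric sketch of that example (hypothetical data, numpy assumed): with an intercept column plus both a male and a female indicator column, the choice t0 = -1, t_male = 1, t_female = 1 gives a combination that is zero on every row, and the design matrix loses rank.

```python
import numpy as np

# Made-up 0/1 indicators for eight observations.
male = np.array([1, 0, 0, 1, 1, 0, 1, 0])
female = 1 - male                  # complement of male
intercept = np.ones(8)             # the intercept column is always 1

# t0 = -1, t_male = 1, t_female = 1: the combination is identically zero.
combo = -1 * intercept + 1 * male + 1 * female
print(combo)                       # all zeros -> exact collinearity

# Consequence: the design matrix does not have full column rank,
# so X^T X is singular and OLS has no unique solution.
X = np.column_stack([intercept, male, female])
print(np.linalg.matrix_rank(X))    # 2, not 3
```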
[159]
So that was exactly linearity,
[163]
we talk about multicollinearity when the
[166]
linear combination is not
[167]
exactly 0 but approximately zero.
[173]
And equivalently,
[176]
we can say that we have multiple
[179]
linearity
[180]
when one x variable can be predicted
[184]
approximately by a linear combination
[188]
of the x variables. Aha
[191]
linear combination, we are back at
[193]
regression.
[194]
So now we can write this as a regression,
[198]
replacing the approximate sign with an
[201]
equal sign but then adding error
[203]
and as always error is distributed
[206]
normally
[207]
in the usual way.
[212]
So, the variance inflation factor is
[214]
formally defined as
[216]
1 divided by 1 minus R squared this is
[219]
R j squared .With this R j squared is the
[223]
coefficient of determination the
[225]
R squared,
[226]
when regressing the j's x variable x j
[229]
on all other variables
[231]
x variables. It is not the R
[234]
squared; the usual R squared from linear
[237]
regression where you have y
[238]
on x there's no y here anywhere.
[245]
Yeah and here's the same thing, the
[248]
linear regression again where we're
[251]
leaving out
[252]
the j's variable because that's over
[253]
here.
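This auxiliary regression is easy to carry out directly. A minimal sketch (numpy assumed; the `vif` helper and the data are made up for illustration): regress x_j on the other x variables, take that R_j squared, and plug it into 1 / (1 - R_j squared).

```python
import numpy as np

def vif(X, j):
    """VIF of column j of X: regress x_j on the other columns
    (plus an intercept) and return 1 / (1 - R_j^2)."""
    y = X[:, j]                               # x_j plays the role of the response
    others = np.delete(X, j, axis=1)          # all the other x variables
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

# Two highly correlated predictors plus one unrelated one (made-up data).
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # x2 is almost x1: strong collinearity
x3 = rng.normal(size=200)              # independent of the others
X = np.column_stack([x1, x2, x3])
print([round(vif(X, j), 1) for j in range(3)])   # large, large, near 1
```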
[256]
Question for you, what is the variance
[260]
inflation factor when x j
[262]
is uncorrelated with the other x
[264]
variables?
[265]
Here's the formula, here's some answers
[269]
and I will leave you to answer this
[273]
and move on to the next question.
[277]
Next question is very similar.
[281]
What is the variance inflation factor
[283]
when x j is very highly correlated with
[285]
the other x variables?
[288]
Same answers, same formula
[291]
and so I will leave you with that also.
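If you want to check your answers afterwards, the formula is simple enough to evaluate at a few values of R_j squared:

```python
# VIF = 1 / (1 - R_j^2) at a few values of R_j^2,
# from "uncorrelated" up to "very highly correlated".
for r2 in (0.0, 0.9, 0.99, 0.999):
    print(r2, 1.0 / (1.0 - r2))
```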
[296]
And going to a fun fact,
[301]
we've defined the variance inflation
[304]
factor
[305]
through the R squared.
[308]
The R squared of the j's variable
[312]
and it can be shown that
[318]
the variance inflation factor of the j's
[320]
variable is
[321]
the j's diagonal entry of this matrix.
[326]
The matrix being inverse X transpose X
[330]
and if you remember the variance of beta;
[333]
the estimate of beta was sigma squared
[335]
times that same
[337]
matrix. So what does it tell us?
[341]
Well,
[344]
the j's diagonal element
[349]
of this matrix
[352]
is the variance of beta j. Right?
[358]
Well, times sigma squared.
[361]
So we have a direct correspondence here
[364]
that well the width of the confidence
[367]
interval
[368]
is directly related to that
[373]
R squared the R j squared that we saw
[376]
earlier.
[377]
That connection is not obvious but it's
[381]
really cool I think. so I'll leave you
[384]
with that.
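A sketch of that fun fact (numpy assumed, made-up data): when the x variables are standardized so that X^T X divided by n is the correlation matrix of the predictors, the j-th diagonal entry of its inverse matches 1 / (1 - R_j^2) from the auxiliary regression.

```python
import numpy as np

# Made-up predictors, with x1 and x2 correlated.
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Standardize the columns; then Z^T Z / n is the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = (Z.T @ Z) / n
diag = np.diag(np.linalg.inv(R))        # VIFs read off the inverse's diagonal

# Cross-check against 1 / (1 - R_j^2) for j = 0 via the auxiliary regression.
y = Z[:, 0]
A = np.column_stack([np.ones(n), Z[:, 1:]])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
r2 = 1 - (resid @ resid) / (y @ y)      # y is centered, so SS_tot = y @ y
vif0 = 1.0 / (1.0 - r2)
print(diag[0], vif0)                    # the two numbers agree
```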