Week 6, Lecture 12, Part 5: What is Collinearity?
Now that we've talked about issues of non-constant variance and linearity, and what transformations we can use to correct for the structure of our data so that our assumptions hold, we're going to cross into a new problem and ponder whether it is an issue when our predictors are strongly correlated with each other, or in other words, what we are going to do when one predictor is a linear combination of other predictors. An example of this would be an x1 and x2 that are perfectly correlated with each other; they are what we're going to call collinear. For example, we could imagine that x2 is a perfect linear function of x1, say x2 = 5 + 0.5 * x1, where 5 is where the relationship crosses the y-axis when x1 equals 0, and 0.5 is the slope of that line. So we can imagine instances in which our predictors are perfectly collinear with each other. More broadly, the issue we call multicollinearity is where one or more of our predictors are nearly linearly related to the others; if one of the predictors is almost perfectly predicted from the other set of variables, then we are also going to have multicollinearity in the model.
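To make that perfect-collinearity example concrete, here is a minimal numpy sketch (the simulated x1, the sample size, and the seed are invented for illustration, not from the lecture). With x2 = 5 + 0.5 * x1 exactly, the design matrix has three columns but only rank 2, so X'X is singular and the least-squares coefficients are not uniquely determined.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 5 + 0.5 * x1  # x2 is an exact linear function of x1

# Design matrix with an intercept column: [1, x1, x2]
X = np.column_stack([np.ones_like(x1), x1, x2])

# Three columns, but only rank 2: the columns are linearly dependent,
# so X'X is singular and OLS has no unique solution.
print(np.linalg.matrix_rank(X))   # 2
print(np.linalg.cond(X.T @ X))    # enormous condition number
```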
For the rest of the lecture we're going to talk about how multicollinearity can affect our statistical inference as well as our predictions, how we can detect multicollinearity, and, once we have detected it, what we can do to resolve any issues it may have for our statistical inference.

I want to begin by talking about the effects that multicollinearity can have. The first thing to look for is that your fitted values, your y-hats, are probably not going to be affected by multicollinearity. What is going to be affected is the variability in your beta estimates: the standard errors around your estimated coefficients are going to be artificially larger, because we are not as certain which partial effect is really driving the true underlying relationship between our covariates and our response. And when we have a high standard error on our betas, fewer of those estimated coefficients will be significant, even when a true relationship may actually exist.
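As a rough illustration of this standard-error inflation, here is a sketch with simulated data and statsmodels (the data-generating model, sample size, and noise scales are all assumptions made up for the demo, not the lecture's). The response depends only on x1; adding a nearly collinear x2 leaves the fitted values essentially unchanged but inflates the standard errors, which can push a genuinely related predictor below significance.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 5 + 0.5 * x1 + rng.normal(scale=0.01, size=n)  # nearly collinear with x1
y = 1 + 2 * x1 + rng.normal(size=n)                 # true model uses x1 only

fit_alone = sm.OLS(y, sm.add_constant(x1)).fit()
fit_both = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Fitted values barely move ...
print(np.abs(fit_alone.fittedvalues - fit_both.fittedvalues).max())
# ... but the standard error on the x1 coefficient explodes, and the
# p-value can make x1 look non-significant despite a real effect.
print(fit_alone.bse)   # [const, x1]
print(fit_both.bse)    # [const, x1, x2] -- much larger for x1
print(fit_both.pvalues)
```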
Another artifact of multicollinearity is that our estimated coefficients are going to be really sensitive to minor changes in the model. If there are really large differences in your estimated coefficients when you leave one variable out and then include it, that is a good indication that there may be some multicollinearity in your model.

Another effect of multicollinearity is that the sample you have may not really generalize to the overall population, and so if you had a new sample you might end up with a very different model, because, again, we are having a hard time distinguishing exactly which partial effect is really driving the true underlying relationship. What is nice is that any coefficients, or covariates, that are not collinear with each other should not be affected by this. So if there are coefficients that are swinging around widely, that should be an indication that those are the covariates that are collinear with each other.
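Putting those last two points together, here is one more simulated sketch (again, the data, the seed, and the extra covariate z are invented for the demo, not from the lecture). Leaving the collinear x2 out and then putting it back in swings the x1 estimate dramatically, while the estimate for z, which is not collinear with anything, stays essentially put.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 5 + 0.5 * x1 + rng.normal(scale=0.01, size=n)  # collinear pair
z = rng.normal(size=n)                               # unrelated covariate
y = 1 + 2 * x1 + 3 * z + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, z]))).fit()
dropped = sm.OLS(y, sm.add_constant(np.column_stack([x1, z]))).fit()

# The x1 estimate can swing wildly between the two fits, while the
# coefficient on z barely changes -- z is the covariate you can trust.
print(full.params)     # [const, x1, x2, z]
print(dropped.params)  # [const, x1, z]
```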