Standard deviation of residuals or Root-mean-square error (RMSD) - YouTube
Channel: Khan Academy
[0]
what we're going to do in this video is
[2]
calculate a typical measure of how well
[5]
the actual data points agree with a
[8]
model in this case a linear model and
[11]
there's several names for it we could
[13]
consider this to be the standard
[15]
deviation of the residuals and that's
[18]
essentially what we're going to
[19]
calculate you could also call it the
[21]
root mean square error and you'll see
[24]
why it's called this because this really
[25]
describes how we calculate it
[28]
so what we're going to do is look at the
[30]
residuals for each of these points and
[34]
then we're going to find the standard
[35]
deviation of them so just as a bit of
[37]
review the ith residual
[40]
is going to be equal to
[42]
the
[43]
y-value for a given x minus the
[46]
predicted y value for a given x now when
[50]
i say y-hat right over here this just
[52]
says what would the linear regression
[55]
predict for a given x and this is the
[58]
actual y for a given x so for example and
[62]
we've done this in other videos this is
[63]
all review the residual here when x is
[66]
equal to 1
[68]
we have y is equal to 1 but what was
[72]
predicted by the model is 2.5 times 1
[75]
minus 2 which is 0.5 so 1
[79]
minus 0.5 so this residual here
[83]
this residual is equal to 1 minus 0.5
[88]
which is equal to 0.5 and it's a
[90]
positive 0.5 and if the actual point is
[93]
above the model you are going to have a
[95]
positive residual
[98]
now the residual
[99]
over here
[101]
you also have the actual point being
[103]
higher than the model
[105]
so this is also going to be a positive
[107]
residual and once again when x is equal
[110]
to 3 the actual y is 6
[113]
the predicted y is 2.5 times 3 which is
[118]
7.5 minus 2 which is 5.5 so you have 6
[122]
minus 5.5 so here i'll write residual is
[125]
equal to 6 minus 5.5
[128]
which is equal to
[130]
0.5 so once again you have a positive
[132]
residual
[134]
now for
[135]
this point that sits right on the model
[138]
the actual is the predicted when
[141]
x is two the actual is three and what
[144]
was predicted by the model is three so
[146]
the residual here is equal to the actual
[149]
is three and the predicted is three so
[151]
it's equal to zero
[153]
and then last but not least you have
[157]
this data point
[159]
where the residual is going to be the
[161]
actual when x is equal to 2
[164]
is 2
[165]
minus the predicted well when x is equal
[168]
to 2 you have 2.5 times 2 which is equal
[172]
to
[173]
5 minus 2 is equal to 3. so 2 minus 3 is
[177]
equal to negative 1.
[179]
and so when your actual is below your
[181]
regression line you're going to have a
[183]
negative residual so this is going to be
[185]
negative 1 right over there
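
As a quick check on the four residuals worked out above, here is a minimal Python sketch. It assumes the four data points are (1, 1), (2, 3), (2, 2) and (3, 6), as read off in the walkthrough, and that the model is y-hat = 2.5x - 2. The names `points` and `predict` are just illustrative and are not from the video.

```python
# Data points as (x, y) pairs, listed by x (assumed from the walkthrough).
points = [(1, 1), (2, 3), (2, 2), (3, 6)]

def predict(x):
    """Linear model from the video: y-hat = 2.5x - 2."""
    return 2.5 * x - 2

# Residual = actual y minus predicted y for the same x.
residuals = [y - predict(x) for x, y in points]
print(residuals)  # [0.5, 0.0, -1.0, 0.5]
```

Points above the line give positive residuals, the point on the line gives zero, and the point below the line gives the negative one.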
[187]
now we can calculate the standard
[189]
deviation of the residuals we're going
[191]
to take this first residual which is 0.5
[196]
and we're going to square it we're going
[198]
to add it to the second residual right
[201]
over here i'll use this blue or this
[202]
teal color that's zero
[205]
i'm gonna square that
[207]
then we have
[209]
this third residual which is negative 1.
[213]
so plus
[214]
negative 1 squared and then finally we
[217]
have that fourth residual which is 0.5
[219]
squared 0.5
[222]
squared
[224]
so once again we took each of the
[226]
residuals which you could view as
[228]
the distance between the points and what
[230]
the model would predict we are squaring
[233]
them when you take a typical standard
[234]
deviation you're taking the distance
[236]
between a point and the mean here we're
[238]
taking the distance between a point and
[240]
what the model would have predicted but
[242]
we're squaring each of those residuals
[244]
and adding them all up together
[246]
and just like we do with the sample
[248]
standard deviation we are now going to
[250]
divide by
[252]
one less than the number of residuals we
[255]
just squared and added so we have four
[257]
residuals we're gonna divide by four
[260]
minus one
[262]
which is equal to of course
[264]
three
[265]
you could view this part as a mean of
[268]
the squared errors and now we're going
[270]
to take the square root of it
[273]
so let's see this is going to be equal
[276]
to
[276]
square root of
[279]
this is 0.25 0.25
[284]
this is just 0.
[287]
this is going to be positive 1
[290]
and then this 0.5 squared is going to be
[292]
0.25
[294]
0.25
[296]
all of that over 3.
[300]
now this numerator is going to be 1.5
[303]
over
[304]
3.
[305]
so this is going to be equal to 1.5 is
[308]
exactly half of three so we could say
[310]
this is equal to the square root
[313]
of one half this is one over the square
[315]
root of two one
[318]
divided by
[319]
the square root of two
[322]
which gets us to
[324]
so if we round to the nearest
[326]
thousandth it's roughly
[329]
0.707
[331]
so approximately 0.707
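
The arithmetic above can be sketched in a few lines of Python. This follows the same steps as the calculation in the video (square each residual, add them up, divide by one less than the number of residuals, take the square root). The variable names are illustrative only.

```python
import math

# The four residuals from the walkthrough, in the order they were squared.
residuals = [0.5, 0.0, -1.0, 0.5]

# Square each residual and add them up: 0.25 + 0 + 1 + 0.25.
sum_sq = sum(r ** 2 for r in residuals)

# Divide by one less than the number of residuals, as in the video.
mean_sq = sum_sq / (len(residuals) - 1)

# Take the square root to get the standard deviation of the residuals.
sd_residuals = math.sqrt(mean_sq)
print(round(sd_residuals, 3))  # 0.707
```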
[336]
and if you wanted to visualize that one
[338]
standard deviation of the residuals
[340]
below the line would look like this
[343]
and one standard deviation above the
[345]
line for any given x value would go one
[348]
standard deviation of the residuals
[349]
above it
[350]
it would look something like that
[353]
and this is obviously just a hand-drawn
[355]
approximation but you do see that this
[358]
does seem to be roughly indicative of
[360]
the typical residual
[363]
now it's worth noting sometimes people
[364]
will say it's the average residual and
[366]
it depends how you think about the word
[369]
average because we are squaring the
[371]
residuals so outliers things that are
[374]
really far from the line when you square
[377]
it are going to have disproportionate
[378]
impact here if you didn't want to have
[380]
that behavior we could have done
[382]
something like find the mean of the
[384]
absolute residuals that actually in some
[386]
ways would have been a simpler one but
[387]
this is a standard way of people trying
[390]
to figure out how much a model disagrees
[393]
with the actual data and so you can
[395]
imagine the lower this number is
[398]
the better the fit of the model
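
To see the point about outliers, here is a rough Python sketch comparing the standard deviation of the residuals with the mean of the absolute residuals. Two things here are assumptions and not from the video: the mean absolute residual is taken as a plain average over all the residuals, and the extra residual of 4.0 is a made-up outlier added only for illustration.

```python
import math

def sd_of_residuals(residuals):
    """Square, sum, divide by n - 1, then take the square root."""
    return math.sqrt(sum(r ** 2 for r in residuals) / (len(residuals) - 1))

def mean_abs_residual(residuals):
    """Plain average of the absolute residuals (assumed divisor: n)."""
    return sum(abs(r) for r in residuals) / len(residuals)

residuals = [0.5, 0.0, -1.0, 0.5]   # from the walkthrough
with_outlier = residuals + [4.0]    # hypothetical outlier, not in the video

for label, rs in [("original", residuals), ("with outlier", with_outlier)]:
    print(label, round(sd_of_residuals(rs), 3), round(mean_abs_residual(rs), 3))
```

Because the residuals are squared, the single far-off point pushes the standard deviation of the residuals up by a larger factor than it pushes up the mean absolute residual.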