馃攳
Statistics 101: Linear Regression, Residual Analysis - YouTube
Channel: Brandon Foltz
Hello, my name is Brandon, and welcome to the next video in my series on basic statistics. If you are new to the channel, welcome; if you're a returning viewer, it's great to have you back. If you like the video, please give it a thumbs up and share it with classmates, colleagues, friends, or anyone else you think might benefit from watching. So now that we are introduced, let's go ahead and get started.

This video is the next in my series on simple linear regression; however, it has implications beyond just simple regression. This video is about residual analysis, and residual analysis does two primary things for us. Number one, it tells us how well the model we have produced fits the data we are looking at; in other words, is our error large or is our error small? Number two, and maybe most importantly, it tells us whether or not the model we are using is actually appropriate for the data we are looking at. As you probably know by now, there are many, many ways to model a data set, and some models are more appropriate than others; residuals can help us decide among them. So residual analysis goes well beyond basic statistics: it comes up in higher-level statistics, in data science, and of course in machine learning, whenever we're deciding which model to choose for our application. Let's go ahead and get started learning about residual analysis.

This video is brought to you by the great people at The Great Courses Plus. If you're watching this, chances are you need to, want to, or like to learn things, and there are few better places to learn pretty much anything you want than The Great Courses Plus. They have over 10,000 video lectures on everything from photography to literature, philosophy to finance, and yes, statistics. So please check out the link in the description below and learn how you can get a free trial to The Great Courses Plus; it helps my channel and it also helps them. Now let's go ahead and learn about residuals.
Here is the data we've been using for this entire playlist. I'm not going to go into it very much, but in case you're new to the playlist, I want to go over it briefly. You go into a restaurant, you eat a meal, and of course they give you the bill at the end; and, especially here in the US (maybe not everywhere in the world), it's customary to tip the server for that meal. So we have the bill amount along the bottom, our x-axis, the tip amount along the y-axis, and each diamond is the intersection of those two things. As far as the data table goes, you can see it over on the right: our first bill was $34 and the server had a tip of $5, and so on and so forth. The mean of each variable: for the bill amount it was $74, and for the tip amount it was $10. That's the data set we'll be using.

When we put that data into a regression model, this is what we get. First you can see the regression line that goes across the middle of our graph. We have a centroid of (74, 10); again, that's just the mean of each variable. We have a regression line of y = 0.1462x − 0.8188. (The two regression lines shown are slightly different, and that's just because of differences in the algorithms of the software, but they're basically the same.) So we have a slope of 0.1462, and we have our intercept down there in the lower left. Overall we can interpret this as: as the bill amount goes up, the tip amount goes up; that's why our slope is positive. As far as the actual numbers go, for every one-dollar increase in the meal bill, we would expect, or predict, an increase in the tip amount of about fifteen cents. So this is the very simple, small-data-set model we're using, and this is what it looks like when we actually plot it and put it into a regression model.
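As a sketch, here is how a fit like this can be reproduced in code. The six bill/tip pairs below are illustrative values chosen to be consistent with the numbers quoted in the video (first bill $34 with a $5 tip, mean bill $74, mean tip $10, slope ≈ 0.1462); they are not necessarily the exact table shown on screen.

```python
import numpy as np

# Hypothetical bill/tip pairs consistent with the video's summary numbers.
bills = np.array([34.0, 108.0, 64.0, 88.0, 99.0, 51.0])
tips = np.array([5.0, 17.0, 11.0, 8.0, 14.0, 5.0])

# Least-squares fit of tip = slope * bill + intercept.
slope, intercept = np.polyfit(bills, tips, 1)
print(round(slope, 4), round(intercept, 4))  # slope ≈ 0.1462, intercept ≈ -0.820
```

The exact intercept here comes out near −0.820; the video's −0.8188 follows from using the rounded slope of 0.1462.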
So what is residual analysis? By definition, a residual is the quantity remaining after other things have been taken into account, subtracted, or allowed for. In our daily lives, most of us get a paycheck of some sort, and then we have bills to pay, food to buy, and other things; hopefully, at the end of all that, we have a little bit of money left to save, to use for leisure, or whatever else. Once all those obligations are met, that little bit of money we might have left is the residual of our paycheck. A residual is literally what's left over. In this case, it's what's left over after our model is done explaining, or has run out of the ability to explain, the data we are looking at. In statistics, it's the difference between the observed value of the dependent variable, which in this case is the tip amount, and what is predicted by the regression model. Our regression line in the previous slide is a way of predicting what tip we would expect for a given meal amount, but we also have observed values, and the residual is the difference between those two. For example, if the regression model predicts a tip of $10 for a given meal, but the observed tip that actually lands on the table is $12, then the residual is 12 minus 10, or 2. The notation, which we have seen before in many cases, is y_i − ŷ_i: the observed tip minus the predicted tip.
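The video's $12-observed versus $10-predicted example, in code:

```python
# A residual is simply observed minus predicted: y_i - y_hat_i.
observed_tip = 12.0   # the tip actually left on the table
predicted_tip = 10.0  # the tip the regression model predicted
residual = observed_tip - predicted_tip
print(residual)  # 2.0
```

Note the sign convention: a positive residual means the model under-predicted, a negative residual means it over-predicted.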
Remember the standard regression model: y = β₀ + β₁x + ε, where β₀ is the intercept, β₁ is the slope, and ε is an error term. The first two terms are our regression model, and at the end is the error. The regression model at the beginning will hopefully explain a lot of the variation in our dependent variable, but it's probably not going to explain all of it; it's very rare that it will. So there's some left over, and what's left is what we call the residuals. Only part of the variance in the dependent variable will be explained by the values of the independent variable. We see that as the value of R² in regression output, which is just the sum of squares due to regression, the SSR, divided by the total sum of squares. That is the variance explained by the model itself, but that's not the whole story.
The variance left unexplained is due to model error. Our model will fit the data to a certain point, but then there's some left, and that's our error, or our residuals. You can think of it as how far off the model is, if you're thinking negatively, or, if you're thinking positively, how well the model accounts for the variance in the dependent variable. It will hopefully explain a good chunk of it, but there's probably going to be some left.
This graph has a lot of things going on; it's actually from a previous video I did on regression, so if you go back in the playlist you'll see the first time I went over this slide. Just to set the stage real quick and refresh your memory (or, if you're new to the playlist, to give you this information), a couple of things are happening here. The sloped line, the one with two purple dots that goes from lower left to upper right, is our regression line, given by the equation up in the upper left. The purple dots are predicted values, so it should make sense that the purple dots on the line are the predicted values produced by that regression equation. You also have a dashed line across the middle: that is the mean of the dependent variable. The mean tip amount was $10, so we draw a flat line right at the mean of the dependent variable; that line has a couple of black dots on it. Then you have some orange diamonds: those are our observed values. So we have three kinds of things going on here: the regression line with the purple dots for the predicted values, the dashed line with the black dots at the mean of the dependent variable, and the orange diamonds, which are the actual observed values. Now let's talk very quickly about what SSE, SST, and SSR are.
First, SSE. Here's the equation: the sum of (y_i − ŷ_i)². Again, that's just observed minus predicted, squared, and then summed up; that's what SSE is, the error. In distance terms, remember the purple dot is the predicted value and the orange diamond is the observed value, so SSE is the difference between those two, squared and then summed up for each point along the line; there's one of those distances up there in the upper right. Next we have SST, the total sum of squares: the sum of (y_i − ȳ)². Remember y_i is the observed value, the orange diamond, and ȳ is the black dashed line across the middle, so we take that distance, square it, and sum them up; that's the distance between the orange diamond and the black dot, and there's another one there. Now, looking at these brackets, you'll notice there's one left, and that is SSR, the sum of squares due to our model, or sum of squares due to regression: the sum of (ŷ_i − ȳ)², where ŷ_i is the purple dot, the predicted value, and ȳ is the mean of the dependent variable; square those, sum them up, and that's that distance there and there. So we have three measures going on: SSE, the sum of squares due to error; SST, the total sum of squares; and SSR, the sum of squares due to regression. And if you look over here on the right, there's a good example: the total sum of squares, in the orange bracket on the far right, is literally made up of the SSE, in green, and the SSR, in purple.
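As a sketch, these three sums of squares can be computed directly and the decomposition SST = SSE + SSR verified. The six bill/tip pairs are illustrative values consistent with the summary numbers quoted in the video, not necessarily the exact on-screen table:

```python
import numpy as np

# Hypothetical bill/tip data consistent with the video's summary numbers.
bills = np.array([34.0, 108.0, 64.0, 88.0, 99.0, 51.0])
tips = np.array([5.0, 17.0, 11.0, 8.0, 14.0, 5.0])

slope, intercept = np.polyfit(bills, tips, 1)
predicted = slope * bills + intercept          # y_hat_i
y_bar = tips.mean()                            # mean of the dependent variable

sse = np.sum((tips - predicted) ** 2)          # error: observed vs predicted
ssr = np.sum((predicted - y_bar) ** 2)         # regression: predicted vs mean
sst = np.sum((tips - y_bar) ** 2)              # total: observed vs mean

print(round(sse, 2), round(ssr, 2), round(sst, 2))
print(round(ssr / sst, 4))  # R-squared: the share of variance explained
```

With these illustrative values, SST comes out to 120 and the model explains roughly three-quarters of the variance in tip amount.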
Now a few model assumptions. Here's the standard regression model we talked about, and here are some assumptions that we make. One, the residuals offer the best information about the error term. Again, the β₀ + β₁x part, the regression model, won't explain everything; it will explain some but not all, so we have the epsilon, the error, that's left, and the residuals offer the best information about that remainder of the story in our model. Two, the expected value of the error term, the mean of the error term, is zero. Three, for all values of the independent variable x, the variance of the error term is the same. What we're saying there is that regardless, in this case, of what meal amount you have, a meal of $30, $50, or $75, the variance of the error term at each point along the independent variable is constant; it's the same. Four, the values of the error term are independent of each other: there is no relationship between the error terms. And five, the error term follows a normal distribution; if we took all of our errors, or all of our residuals, put them in their own distribution, and looked at it, it should follow a normal, bell-shaped distribution.
So again, here are the residuals in this model. I wanted to put them up here because we're going to graph them in a second. What we can do now is graph the residuals, and often the best way to look at residuals is on a graph, or a scatter plot. We're going to graph the residuals against two things: against the independent variable, which is the meal amount here along the bottom, and then as a function of the predicted values. Let's look at both of those graphs.

First we have the residual plot against the independent variable, in this case the x variable, which is the bill amount, and here's what that looks like. You can see that the residual for the first meal amount over here on the left-hand side, a meal of about thirty-seven dollars, was a little bit under one; for the meal amount here in the middle, around fifty or fifty-one dollars, the residual was a little bit less than negative two; and so on and so forth. These are the residuals plotted against the bill amount for each observation.

Next, and maybe most importantly, we're going to plot the residuals against the predicted values, so here's that. How do we interpret this? Look at the first dot over here on the left: for that first meal, the difference between the tip predicted by our model and the observed tip in the data was a little bit less than one. Now look at the second one: the difference between the predicted tip our model gave us and the observed tip was a little bit less than negative two. So you can see that what we're doing here is looking at the observed versus the predicted. This is the residual plot against ŷ, the predicted dependent variable, and it's probably the most important one. What we're looking for is patterns in the residuals.
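Both residual plots can be sketched with matplotlib. The data values are the same illustrative ones used above, chosen to be consistent with the video's summary numbers:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data consistent with the video's summary numbers.
bills = np.array([34.0, 108.0, 64.0, 88.0, 99.0, 51.0])
tips = np.array([5.0, 17.0, 11.0, 8.0, 14.0, 5.0])

slope, intercept = np.polyfit(bills, tips, 1)
predicted = slope * bills + intercept
residuals = tips - predicted

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.scatter(bills, residuals)       # residuals vs the independent variable
ax1.axhline(0, linestyle="--")
ax1.set(xlabel="Bill amount ($)", ylabel="Residual")
ax2.scatter(predicted, residuals)   # residuals vs the predicted values
ax2.axhline(0, linestyle="--")
ax2.set(xlabel="Predicted tip ($)", ylabel="Residual")
fig.savefig("residual_plots.png")
```

Because the fit includes an intercept, the residuals sum to zero, so the points scatter around the dashed zero line in both panels.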
Let's talk about some general patterns we can look for, using generic graphs of the different ways residuals can appear. The first case is the best case: if we graph our residuals like we did in the previous two slides and they look like this, evenly scattered left to right and top to bottom all over the graph, that's a good thing. They all fit here in the middle, with no pattern to them other than being uniformly distributed pretty much everywhere. There's a technical word for that: homoscedasticity, or constant variance. We can see that the variance of the residuals here in the middle is constant from left to right; there's no bending, bowing, squeezing, or anything like that. All the residuals are in a nice, even distribution across the graph: constant variance.
But we could have something that looks like this instead: heteroscedasticity, or non-constant variance. On the left side of our graph the residuals are much more spread out than they are over on the right side, and this should give us some pause (we'll see why in a minute): our residuals are not evenly distributed across the graph from left to right. The error is larger down on this end than it is over on the right end, and that can be a problem. Another type of heteroscedasticity comes from nonlinear data, or from using the wrong model. Here our residuals are in a bow shape; we're going to see this in other videos coming up when we talk about nonlinear models. The residuals follow an arc, either from the lower left up and back down to the right, or in another direction; it could be half of an arc or something like that. This might show us that our data is actually nonlinear, and that a linear model may not be appropriate for this data. In the next few videos, on nonlinear models, you will see this pattern in the residuals when we go to look at which model is best for fitting the data.
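One rough, informal way to see non-constant variance numerically (not a formal test) is to compare the residual spread on the two halves of the x range. The data here are simulated with noise that deliberately grows with x, producing the cone or fan shape described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated heteroscedastic data: the noise scale grows with x.
x = np.linspace(1, 100, 200)
y = 0.15 * x + rng.normal(scale=0.05 * x)

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Informal check: residual spread on the left half vs the right half of x.
left_spread = residuals[:100].std()
right_spread = residuals[100:].std()
print(round(left_spread, 2), round(right_spread, 2))
```

For homoscedastic data the two spreads would be about equal; here the right half is clearly wider than the left.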
So here is the same residual plot we had before, the residual plot against ŷ, our predicted values. Along the bottom is the predicted tip amount, and then we have the difference between that and the observed value; that distance is the residual. What pattern does this follow? Well, it follows a fairly even pattern from left to right. Again, we only have six observations in this very small data set, so you might see patterns where there aren't really any; but in this case I think it's fair to say that the residuals occur on both the top and bottom of our plot and are about the same left to right. There's no cone shape, there's no curve to them or anything like that, so this is a good residual plot.
Here are our two plots side by side. First we have the residual plot against the independent variable, the bill amount: a nice pattern, in that there's no pattern. Then over here we have the residual plot against the predicted dependent variable: same thing. These look pretty good. Now let's put it all together with our bill-amount line fit plot. First we have our observed values, the orange circles; on top of that we can put our regression line, and then our predicted values, the yellow circles. So we can see how each observed value falls above or below the predicted amount, and those distances are our residuals.
Here's our bill-amount residual plot: there, there, and there. And what pattern? Well, no real pattern, and that's a good residual plot. So, a few final points. What happens if the residual analysis reveals heteroscedasticity? That means our residuals are not uniformly distributed across the residual plot: they might have a curvature to them, or they might be non-constant, like a cone shape in one direction. What can we do?
We could rebuild the model with a different independent variable, or variables; that's one option. We could perform some type of transformation on the nonlinear data, taking a logarithm or something else for that variable. We could fit a nonlinear regression model: linear regression is not the only type; there are many others, nonlinear regression, piecewise regression, all kinds. But be careful: don't overfit the model. In my next playlist, where we talk about nonlinear regression, we will talk a lot about the dangers of overfitting.
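As a quick sketch of the transformation idea: if y grows exponentially with x, a straight line will not fit y itself, but it will fit log(y). The data below are simulated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated nonlinear data: y grows exponentially with x, with mild noise.
x = np.linspace(1, 10, 50)
y = np.exp(0.5 + 0.3 * x) * rng.lognormal(sigma=0.05, size=50)

# A straight line fits log(y): log(y) ≈ a + b * x.
b, a = np.polyfit(x, np.log(y), 1)
print(round(b, 2), round(a, 2))
```

After the log transform, the fitted slope and intercept recover the true generating values (0.3 and 0.5) closely, and the residuals of the transformed fit lose their arc shape.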
A final question: are there quantitative statistical tests for residuals? The answer is yes. There is the Breusch-Pagan test, the White test, and the NCV (non-constant variance) test. For the sake of this video, we're not going to go into those; they are more advanced, and I actually think there are other ways, both visual and computational, to figure out if you have a problem with your residuals. So we'll stick with those for now, but I want you to be aware that there are statistical tests out there for residuals.
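As a sketch, the core of the Breusch-Pagan idea (in its Koenker LM form) can be implemented directly: regress the squared residuals on x and compute LM = n·R² of that auxiliary regression, which is approximately chi-square distributed. The data are simulated to be heteroscedastic; for real work, statsmodels provides `statsmodels.stats.diagnostic.het_breuschpagan`.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data whose error variance grows with x (heteroscedastic).
x = np.linspace(1, 100, 200)
y = 0.15 * x + rng.normal(scale=0.05 * x)

slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Breusch-Pagan idea (Koenker LM form): regress squared residuals on x;
# LM = n * R^2 of that auxiliary regression, chi-square with 1 df here.
sq = resid ** 2
b1, b0 = np.polyfit(x, sq, 1)
fitted_sq = b1 * x + b0
r2_aux = 1 - np.sum((sq - fitted_sq) ** 2) / np.sum((sq - sq.mean()) ** 2)
lm_stat = len(x) * r2_aux
print(round(lm_stat, 1))  # compare with the chi-square(1) critical value 3.84
```

For this deliberately heteroscedastic sample the LM statistic lands far above 3.84, so the constant-variance assumption would be rejected.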
This video is brought to you by The Great Courses Plus, where you can get unlimited access to over 10,000 video lectures taught by award-winning professors from the Ivy League and other top schools around the world. You can learn about anything that interests you: science, literature, and yes, statistics, like this lecture from Professor Talithia Williams called "Linear Regression Models and Assumptions," from her course Learning Statistics: Concepts and Applications in R. Right now The Great Courses Plus is offering my viewers a free trial, and it's now also optimized for Australia and the UK. So go to thegreatcoursesplus.com/BrandonFoltz (my name) to get access to the ten-thousand-video lecture library, or click on the link in the description below.

Okay, so that wraps up our video on residual analysis in simple linear regression. Again, it is a very important concept when figuring out, one, how good our model is, and two, whether the model we're trying to implement is actually appropriate for the data we have. And it has implications for more advanced areas and other fields, such as advanced statistics, data science, machine learning, and things of that nature. I hope you found this very visual and very insightful, and that it's something you can take with you as you progress. Thank you very much for watching, and I look forward to seeing you again in our next video. Take care.