R-squared or coefficient of determination | Regression | Probability and Statistics | Khan Academy - YouTube

Channel: Khan Academy

[0] In the last few videos, we saw that if we had n points, each of them has x- and y-coordinates. Let me draw n of those points. So let's call this point one. It has coordinates (x1, y1). You have the second point over here. It has coordinates (x2, y2). And we keep putting points up here, and eventually we get to the nth point. That has coordinates (xn, yn).
[31] What we saw is that there is a line that we can find that minimizes the squared distance. This line right here, I'll call it y = mx + b. There's some line that minimizes the squared distance to the points. And let me just review what those squared distances are. Sometimes it's called the squared error. So this is the error between the line and point one, so I'll call that error one. This is the error between the line and point two; we'll call this error two. And this is the error between the line and point n.
[73] So if you want the total squared error -- this is actually how we started off this whole discussion -- the total squared error between the points and the line, you literally just take the y value of each point. So for example, you would take y1, that's this value right over here, and subtract the y value at this point on the line. Well, that point on the line is essentially the y value you get when you substitute x1 into this equation: m x1 + b. So the first term is (y1 - (m x1 + b))^2. I don't want to get my graph too cluttered, so I'll just delete that there. That is error one, right over there. And we want the squared errors between each of the points and the line. So that's the first one. Then you do the same thing for the second point, (y2 - (m x2 + b))^2, all the way -- I'll do dot dot dot to show that there are a bunch of these that we have to do -- until we get to the nth point: (yn - (m xn + b))^2.
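The sum just written out is easy to compute directly. Here is a minimal sketch in Python (the data points and the line are made up for illustration), assuming the slope m and intercept b are already known:

```python
def squared_error_of_line(xs, ys, m, b):
    # Sum of squared vertical distances from each point (x, y)
    # to the line y = m*x + b.
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical points: a line passing exactly through them gives zero error.
xs = [1, 2, 3]
ys = [2, 4, 6]
print(squared_error_of_line(xs, ys, m=2, b=0))  # → 0
print(squared_error_of_line(xs, ys, m=2, b=1))  # → 3 (each point misses by 1)
```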
[155] And now that we actually know how to find these m's and b's -- I showed you the formula, and in fact we've proved the formula -- we can find this line. And if we want to ask how much error there is, we can then calculate it, because we now know the m's and the b's. So we can calculate it for a certain set of data.
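As a sketch of "we now know the m's and b's": the mean-based least-squares formulas from the earlier videos can be coded directly (the variable names here are my own):

```python
def best_fit_line(xs, ys):
    # Least-squares slope and intercept from the mean-based formulas:
    #   m = (mean(x)*mean(y) - mean(x*y)) / (mean(x)^2 - mean(x^2))
    #   b = mean(y) - m * mean(x)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_x2 = sum(x * x for x in xs) / n
    m = (mean_x * mean_y - mean_xy) / (mean_x ** 2 - mean_x2)
    b = mean_y - m * mean_x
    return m, b

print(best_fit_line([1, 2, 3], [1, 2, 3]))  # → (1.0, 0.0)
```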
[176] Now, what I want to do is come up with a more meaningful estimate of how well this line fits the data points that we have. And to do that, we're going to ask ourselves the question: what percentage of the variation in y is described by the variation in x? So let's think about this. How much of the total variation in y -- there's obviously variation in y; this point's y value is over here, and this point's y value is over here, so there is clearly a bunch of variation in the y -- is essentially described by the variation in x, or described by the line? So let's think about that.
[237] First, let's figure out what the total variation in y is. When we think about variation -- and this was even true when we thought about variance, which was the mean squared variation in y -- we think about the squared distance from some central tendency, and the best central measure we can have of y is the arithmetic mean. So we could just say the total variation in y is going to be the sum of the squared distances of each of the y's from the mean: (y1 minus the mean of the y's) squared, plus (y2 minus the mean of the y's) squared, and you just keep going all the way to the nth y value, plus (yn minus the mean of the y's) squared.
[307]
This gives you the total variation in y.
[309]
You can just take out all the y values.
[312]
Find their mean.
[313]
It'll be some value, maybe it's
[314]
right over here someplace.
[318]
And so you can even visualize it the same way we visualized
[321]
the squared error from the line.
[323]
So if you visualize it, you can imagine a line that's y is
[327]
equal to the mean of y.
[329]
Which would look just like that.
[331]
And what we're measuring over here, this error right over
[334]
here, is the square of this distance right over here.
[336]
Between this point vertically and this line.
[340]
The second one is going to be this distance.
[344]
Just right up to the line.
[345]
And the nth one is going to be the distance from there all
[348]
the way to the line right over there.
[349]
And there are these other points in between.
[352]
This is the total variation in y.
[354]
Makes sense.
[355]
If you divide this by n, you're going to get what we
[363]
typically associate as the variance of y, which is kind
[367]
of the average squared distance.
[368]
Now, we have the total squared distance.
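That total variation, and the variance you get by dividing it by n, can be sketched the same way (the sample numbers are made up):

```python
def total_variation(ys):
    # Sum of squared distances of each y from the mean of the y's.
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)

ys = [1, 2, 3, 6]  # mean is 3
print(total_variation(ys))            # → 14.0
print(total_variation(ys) / len(ys))  # → 3.5, the variance of y
```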
[371]
So what we want to do is-- how much of the total variation in
[375]
y is described by the variation in x?
[379]
So maybe we can think of it this way.
[380]
So our denominator, we want what percentage of the total
[382]
variation in y?
[383]
Let me write it this way.
[385]
Let me call this the squared error from the average.
[393]
Maybe I'll call this the squared error
[395]
from the mean of y.
[399]
And this is really the total variation in y.
[401]
So let's put that as the denominator.
[405]
The total variation in y, which is the squared error
[408]
from the mean of the y's.
[413] Now we want to know what percentage of this is described by the variation in x. Well, what is not described by the variation in x? We want to know how much is described by the variation in x, but what if we ask how much of the total variation is not described by the regression line? Well, we already have a measure for that: the squared error of the line. It tells us the sum of the squared distances from each point to our line. So it is exactly this measure; it tells us how much of the total variation is not described by the regression line. So if you want to know what percentage of the total variation is not described by the regression line, it would just be the squared error of the line -- because this is the total variation not described by the regression line -- divided by the total variation.
[490] So let me make it clear. This, right over here, tells us what percentage of the total variation is not described by the variation in x, or by the regression line. So to answer our question -- what percentage is described by the variation? -- well, the rest of it has to be described by the variation in x, because our question is what percent of the total variation is described by the variation in x, and this is the percentage that is not described. So if this number is 30% -- if 30% of the variation in y is not described by the line -- then the remainder will be described by the line. So we can essentially just subtract this from 1. So if we take 1 minus the squared error between our data points and the line, over the squared error between the y's and the mean of the y's, that actually tells us what percentage of the total variation is described by the line. You can view it either as described by the line or as described by the variation in x. And this number right here is called the coefficient of determination. That's just what statisticians have decided to name it. And it's also called R-squared. You might have even heard that term when people talk about regression.
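Putting the two sums together gives R-squared exactly as defined here: 1 minus the squared error of the line over the squared error from the mean. A minimal sketch (the points are made up, and m and b are taken as given):

```python
def r_squared(xs, ys, m, b):
    # Coefficient of determination: 1 - SE_line / SE_mean.
    mean_y = sum(ys) / len(ys)
    se_line = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    se_mean = sum((y - mean_y) ** 2 for y in ys)
    return 1 - se_line / se_mean

# A perfect fit describes all of the variation in y.
print(r_squared([1, 2, 3], [1, 2, 3], m=1, b=0))  # → 1.0
```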
[622] Now let's think about it. If the squared error of the line is really small, what does that mean? It means that these errors, right over here, are really small, which means that the line is a really good fit. So let me write it over here: if the squared error of the line is small, it tells us that the line is a good fit. Now, what would happen over here? Well, if this number is really small, this is going to be a very small fraction over here, and 1 minus a very small fraction is going to be a number close to 1. So then our R-squared will be close to 1, which tells us that a lot of the variation in y is described by the variation in x. Which makes sense, because the line is a good fit. Now take the opposite case. If the squared error of the line is huge, then that means there's a lot of error between the data points and the line. So if this number is huge, then this fraction over here is going to be huge, or it's going to be a fraction close to 1. And 1 minus that is going to be close to 0. So if the squared error of the line is large, this fraction is going to be close to 1, and the whole coefficient of determination, the whole R-squared, is going to be close to 0. Which makes sense: that tells us that very little of the total variation in y is described by the variation in x, or described by the line.
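The two limiting cases can be checked with numbers. In this sketch the data are invented so that y = x is the least-squares line for both sets: the tightly clustered points give an R-squared near 1, and the scattered points give a much smaller one:

```python
def r_squared(xs, ys, m, b):
    # 1 - (squared error of the line) / (squared error from the mean).
    mean_y = sum(ys) / len(ys)
    se_line = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    se_mean = sum((y - mean_y) ** 2 for y in ys)
    return 1 - se_line / se_mean

xs = [1, 2, 3, 4]
good_ys = [0.9, 2.1, 3.1, 3.9]  # hugs the line y = x
bad_ys = [0, 4, 2, 4]           # scattered around the same line

print(round(r_squared(xs, good_ys, m=1, b=0), 3))  # → 0.992
print(round(r_squared(xs, bad_ys, m=1, b=0), 3))   # → 0.455
```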
[746] Well, anyway, everything I've been dealing with so far has been a little bit in the abstract. In the next video, we'll actually look at some data samples and calculate their regression line, and also calculate the R-squared and see how good of a fit it really is.