Statistics - How to find outliers - YouTube
Channel: MySecretMathTutor
[0]
In this video we want to identify outliers
in a set of data.
[4]
If you are not sure what an outlier is, here
is the definition.
[8]
An outlier is an extremely high or extremely
low value in the data set.
[13]
Now in addition to just being something extremely
high or extremely low, you want to make sure
[17]
that it satisfies the following criteria.
[19]
For a value to be an outlier, it must be
greater than Q3 + 1.5 × (Interquartile Range)
[25]
or it must be lower than Q1 - 1.5 × (Interquartile
Range).
[30]
This is making sure that it really is an extremely
high value or extremely low value.
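Written as a formula, the criterion just described looks roughly like this. The symbols $x$ (a data value) and $\mathrm{IQR}$ are shorthand introduced here, and $\mathrm{IQR} = Q_3 - Q_1$ is the usual definition of the interquartile range:

```latex
\[
x \text{ is an outlier if}\quad
x > Q_3 + 1.5 \cdot \mathrm{IQR}
\quad\text{or}\quad
x < Q_1 - 1.5 \cdot \mathrm{IQR},
\qquad \mathrm{IQR} = Q_3 - Q_1 .
\]
```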
[34]
You can see though that you need to compute
a few different things like Q3 and Q1 and
[39]
the Interquartile Range if we are going to
properly identify one of these outliers.
[44]
So let's look at some data, and see how this
works.
[50]
In my data I have a chart of how many phone
calls were received on any given day.
[54]
So I have 10 phone calls on the first day,
12 phone calls on the second day, and so on
[59]
and so forth.
[61]
If I'm going to compute things like Q1 and
Q3 and the Interquartile Range, it's probably
[65]
a good idea to take all of this data and write
it out in order.
[70]
10, 11, 11, 11, 12, 12, 13, 14, 14, 15, 17,
22
[89]
Alright, so you can see that when I list out
my data like this 22 does look like a pretty
[98]
high value and 10 looks like a fairly low
value.
[102]
To double check that, you know, one of these
might be an outlier or maybe both of them,
[106]
let's go ahead and start breaking down our
data to find Q1 and Q3.
[113]
So I want to find the half way point of my
data, and I have twelve data points, so one,
[117]
two, three, four, five, six.
[122]
Alright, so I need the median of the first
half and the median of the second half.
[129]
Let's see, the half way point of the first
half, let's call this Q1.
[137]
And it looks like that is equal to 11.
[140]
Remember, you find that by adding 11 plus 11
and dividing by 2.
[145]
The median of the second half, Q3, would
be 14.5, from adding 14 plus 15 and dividing by 2.
[150]
Alright, now to
find our Interquartile range, we would end
[159]
up subtracting Q1 from Q3.
[163]
This would give us 3.5.
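The median-of-halves steps above can be sketched in Python as a quick check. The function name `quartiles` and the variable `calls` are made up for this illustration, and skipping the middle value when the count is odd is an assumption, since the video only covers an even number of data points.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper_half = ordered[(n + 1) // 2 :]
    return median(lower_half), median(upper_half)

# Phone-call counts from the video, already written out in order.
calls = [10, 11, 11, 11, 12, 12, 13, 14, 14, 15, 17, 22]

q1, q3 = quartiles(calls)
iqr = q3 - q1
print(q1, q3, iqr)  # 11.0 14.5 3.5
```

Other quartile conventions exist (library percentile functions often interpolate), so a different tool may report slightly different values for Q1 and Q3 than this hand method.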
[165]
Alright, we have all of the information we
need, now we can figure out the cutoff values so
[173]
we can identify the outliers.
[178]
So to look for an extremely high value it
must be larger than Q3, which is 14.5 plus
[187]
1.5 times the interquartile range, 3.5.
[193]
And to find an extremely low value I'd take
Q1, which is 11, and subtract 1.5 times the interquartile
[204]
range, 3.5.
[207]
Let's see what these equal.
[214]
The upper cutoff is 19.75.
And the lower cutoff is 5.75.
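The arithmetic behind those two numbers can be checked with a small sketch. The function name `outlier_cutoffs` and the parameter `k` are hypothetical names chosen for this example; the values 11 and 14.5 are the Q1 and Q3 found earlier.

```python
def outlier_cutoffs(q1, q3, k=1.5):
    """Return (lower, upper) cutoffs: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Q1 = 11 and Q3 = 14.5, so IQR = 3.5 and 1.5 * 3.5 = 5.25.
lower, upper = outlier_cutoffs(11, 14.5)
print(lower, upper)  # 5.75 19.75
```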
[235]
Alright, so here is how this works, if I have
any data points that are larger than 19.75,
[248]
they are outliers.
[249]
If I have any data points that are smaller than
5.75 those are outliers.
[255]
Well looking at all of our data, we can see
that the 22 is definitely larger than 19.75,
[261]
so it's definitely an outlier.
[264]
Unfortunately I have nothing less than 5.75,
so I don't have any lower outliers.
[269]
So this entire set of data only has one outlier
and it's just the 22, so it's definitely an
[274]
extreme value.
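As a rough check of that conclusion, the comparison against the two cutoffs can be done in a few lines of Python. The variable names are invented for this sketch, and the cutoffs 5.75 and 19.75 are the ones worked out in the video.

```python
# Phone-call counts from the video, in order.
calls = [10, 11, 11, 11, 12, 12, 13, 14, 14, 15, 17, 22]

lower_cutoff, upper_cutoff = 5.75, 19.75

# Strict comparisons: a value must be beyond a cutoff to count as an outlier.
high_outliers = [x for x in calls if x > upper_cutoff]
low_outliers = [x for x in calls if x < lower_cutoff]

print(high_outliers, low_outliers)  # [22] []
```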
[277]
So remember that you have to find a few different
bits of information first, but this is how
[281]
you go about finding your outliers.