Histogram Explained - YouTube
Channel: cylurian
[0]
>> In this video we're
going to do histograms.
[6]
Okay. First, let me
show you a picture
[8]
of what a histogram looks like
[11]
and then give you
some properties.
[13]
So the first property
of a histogram is
[20]
that the histogram
uses quantitative data.
[27]
Quantitative data is
numeric data, right,
[31]
that you can do arithmetic with.
[34]
The next property
of a histogram
[40]
is that it doesn't
have any gaps between the bars.
[48]
If you see gaps, if
you see several gaps,
[51]
in this particular graph,
[56]
then you will most likely
be looking at a bar graph.
[62]
A bar graph is used when you
have categorical data, right.
[66]
Categorical data tends
to be like names,
[72]
or things that are
in order, right.
[75]
But a histogram usually
doesn't have gaps.
[81]
If it does, it's because
there's missing data.
[85]
Another thing that the histogram
has is the bar width, right,
[93]
the bar width is
constant in a histogram.
[96]
In books sometimes they'll
say the bin size is something,
[100]
or the class size is something.
[102]
In the case of our graph here
the bin size is going to be 10
[109]
and you can see that
it's consistent
[114]
with all of these bars.
[119]
Okay. And then the y axis
corresponds to the frequency.
[124]
Sometimes a count, in
this case it's counts.
[128]
Sometimes it's going
to be something
[133]
like the relative
frequency,
[135]
or some type of percent.
[138]
So, again, the vertical
represents a count
[144]
or some frequency
of the histogram.
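To make the difference between a count and a relative frequency concrete, here is a minimal sketch. The counts are hypothetical (they are not from the video), and `relative_frequencies` is only an illustrative helper name.

```python
# Hypothetical bin counts, made up purely for illustration.
counts = [2, 3, 5]

def relative_frequencies(counts):
    """Convert raw counts into fractions of the total (they sum to 1)."""
    total = sum(counts)
    return [c / total for c in counts]

rel = relative_frequencies(counts)
for c, r in zip(counts, rel):
    # Show each count next to its share of the total as a percent.
    print(f"count={c}  relative={r:.0%}")
```

Either version gives bars of the same shape; only the labels on the y axis change.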
[151]
Okay. So let's do a problem.
[156]
Let's say you gave
a test, right.
[160]
Let's say you gave a
midterm one or something.
[164]
And you have 15 students and you
grade these exams and you wanted
[170]
to look at the histogram
on these exams, scores.
[175]
So let's say you
had these scores.
[179]
Now you can always
pause the video
[181]
and you can record
all these 15 scores.
[184]
Again, we want to build a
histogram from the data.
[191]
And the first thing we need to
do is we need to somehow figure
[197]
out this bin size, or class
size, or the width of the bars.
[207]
That's very important.
[210]
How do we do that?
[211]
Well, the first thing is
that we find the lowest value
[216]
and the highest value
of our dataset.
[219]
And I think that's
pretty reasonable.
[221]
We can probably look at what
the lowest is and the highest.
[225]
Have you found the lowest yet?
[227]
Yes. The lowest is 48.
[228]
And did you find
the highest value?
[230]
Okay. Probably you did, 97.
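Finding the lowest and highest values can be sketched in a couple of lines. Note the assumption: the list below holds only the twelve scores that are read aloud later in the video. The full list of 15 is shown on screen and is not in this transcript.

```python
# Only the 12 scores spoken aloud in the video; the other 3 are not in the transcript.
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79]

lowest = min(scores)    # smallest value in the dataset
highest = max(scores)   # largest value in the dataset

print("lowest:", lowest)
print("highest:", highest)
```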
[234]
So knowing that this is an
exam, right, and it goes,
[245]
usually exams are from
zero to 100, right.
[250]
You can think of zero
percent to 100 percent.
[253]
Yes, there are no percents here.
[256]
And usually when you think
of grades you think of A,
[261]
B, and C, and so forth.
[264]
Well, you know, an A tends
to have a range of 10 points
[270]
so that might be
a good indicator
[273]
that because it's a test
that the bin size, you know,
[277]
might be 10, might
be a good size.
[280]
Now you could have chosen 12
or maybe 15, but sometimes
[286]
when you choose larger values
[289]
or smaller values your
graph will tend to change.
[292]
Do some experiments on that
[294]
and you'll see what
I'm talking about.
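One way to try that experiment is sketched below. It assumes only the twelve scores read aloud later in the video (the full 15 are not in this transcript), and it starts the first bin at the nearest multiple of the width below the lowest score, which is a choice made here rather than a rule from the video. `bin_counts` is a hypothetical helper name.

```python
# Only the 12 scores spoken aloud in the video; the other 3 are not in the transcript.
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79]

def bin_counts(scores, width):
    """Count scores in consecutive bins [low, low + width) of equal width."""
    start = (min(scores) // width) * width          # assumed starting point
    n_bins = (max(scores) - start) // width + 1     # enough bins to hold the max
    counts = [0] * n_bins
    for s in scores:
        counts[(s - start) // width] += 1
    return start, counts

for width in (10, 12, 15):
    start, counts = bin_counts(scores, width)
    print(f"width={width:>2}  first bin starts at {start}  counts={counts}")
```

The same scores produce a differently shaped picture for each width, which is the change being described.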
[298]
But in this case I think
it's a pretty fair bet
[303]
that the bin sizes could be 10.
[305]
So what does that mean?
[309]
It means that if we're going
to do a bin size of 10,
[315]
we should have multiples of 10.
[317]
And because our lowest
score is 48
[321]
and our highest is 97
we don't have to start
[325]
at zero score, you
know, zero score.
[328]
We can probably start at maybe
40 and then do increments of 10:
[335]
40 to 50, 50 to 60, and
so forth, up to 100.
[340]
The highest score was 97 so
we can go all the way to 100.
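The choice of 40 through 100 can be written as a small calculation: round the lowest score down and the highest score up to a multiple of the bin size. This is a sketch of that idea, with `bin_edges` as a hypothetical helper name.

```python
def bin_edges(lowest, highest, width):
    """Return bin boundaries that are multiples of `width` and cover the data."""
    start = (lowest // width) * width        # round the lowest score down
    stop = (highest // width + 1) * width    # go past the highest score
    return list(range(start, stop + 1, width))

edges = bin_edges(48, 97, 10)
print(edges)  # boundaries of the bins 40 to 50, 50 to 60, and so on
```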
[344]
Again, that's very flexible.
[346]
You could have started at 45
and done it by tens, you know,
[351]
45 to 55, and from 55
to 65, and so forth.
[359]
But it would be a little
more confusing that way.
[364]
Now before we draw the histogram
we need to create some type
[368]
of table, some type of
frequency count table.
[372]
This table is going to help
us to draw the histogram.
[378]
So in this case we
have scores and counts.
[380]
Scores of course
are the bins.
[385]
And the counts, well, we need
to find, there's going to be 15
[388]
of these scores so we have
to slot them into the bins.
[394]
So our first bin is going
to be from 40 to 50.
[402]
Now this looks a little weird
and it might not be the case
[406]
for your class, but I think it
should give you an explanation
[412]
of what we're trying to do.
[413]
Your teacher might not accept
this because it might not be
[418]
up to standard, but this
video is more to help you
[421]
out in understanding
what exactly it is that you have
[426]
to do to get this histogram.
[428]
Okay. So here we have
a bracket 40 comma 50
[435]
and then a parenthesis, written [40, 50).
[437]
Well, the bracket
means to include
[441]
and the parenthesis
means not included.
[446]
What does it mean
by not include?
[447]
Well, if you have a score of 40,
[449]
you put it in this
slot in this bin.
[452]
And I'll tell you what that kind
of means but if you had a value
[456]
in this case of 41,
you'd put it in here.
[459]
You count it in this bin.
[461]
If you have a score of 50,
you don't put it in this bin.
[467]
You don't count it here because
it says 50 is not included.
[473]
So where would you
put a 50 score?
[475]
Well, you would put
it on the next one.
[477]
So there's a bracket there,
[481]
which means include, right.
[482]
The bracket is to include, so
you include 50 in 50 comma 60.
[488]
So values between 50 and
60, right, is this bin.
[494]
If you have a value
of 56 or something,
[496]
you could put it in there.
[497]
But if you have a value of
60, you won't put it in there.
[501]
You would put it
in the next one.
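The include and not-include rule is a half-open interval, and it can be sketched as a simple comparison. The function names and the default start of 40 and width of 10 are assumptions chosen to match this example.

```python
def in_bin(score, low, high):
    """True when score is in [low, high): low is included, high is not."""
    return low <= score < high

def find_bin(score, start=40, width=10):
    """Return the (low, high) bin that a score falls into."""
    low = start + ((score - start) // width) * width
    return (low, low + width)

print(in_bin(40, 40, 50))  # a score of 40 belongs to the first bin
print(in_bin(50, 40, 50))  # a score of 50 does not
print(find_bin(50))        # it lands in the next bin instead
```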
[503]
And of course we
can write the rest.
[509]
Again, it's possible that
you won't have these brackets
[514]
or parentheses when
you do your homework
[518]
but the idea is,
you know,
[523]
where do the numbers
fit in these bins.
[528]
So let's look at our first
data point which is 88.
[531]
Where would 88 go?
[533]
Oh, yeah, between 80 and 90.
[536]
That sounds right.
[538]
48, where do you
think 48 goes in?
[541]
Where in these bins?
[543]
Yes, between 40 and 50.
[546]
Now we get to 60, you
know, where does it go?
[552]
Well, it can't go in
here because we know
[559]
that 60 is not included
in this bin.
[563]
But it is included
in the next bin.
[568]
So let's put it in there.
[570]
And then we have 51, well,
that's between 50 and 60.
[575]
And then we got 57 which
is between 50 and 60.
[579]
Now if I'm going a little
too fast, all you need
[582]
to do is rewind and then
just go ahead and listen.
[586]
Okay. 85, right.
[589]
So 85, where does 85 go?
[591]
Between 80 and 90.
[593]
Okay. I'm going to
put it in there.
[595]
69, where does 69 go?
[597]
69, that's below 70, okay,
it goes between 60 and 70.
[603]
Okay. 75, where does 75 go?
[606]
Yes. Between 70 and 80.
[608]
Let's put a mark there.
[609]
97, let's look.
[612]
Yeah. That's below 100, yeah,
that's between 90 and 100.
[617]
Okay. And then 72,
71, 79, and so forth.
[623]
Okay. Good.
[625]
You can pause the video there.
[628]
All right.
[629]
So now we have our counts.
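The tallying just described can be sketched as a loop that drops each score into its bin. This uses only the twelve scores that are read aloud above, so the totals add up to 12 rather than 15 and need not match the table on screen.

```python
# Only the 12 scores spoken aloud in the video; the other 3 are not in the transcript.
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79]

start, stop, width = 40, 100, 10

# One entry per bin [low, low + width), all starting at a count of zero.
table = {(low, low + width): 0 for low in range(start, stop, width)}

for s in scores:
    low = start + ((s - start) // width) * width  # lower edge of the score's bin
    table[(low, low + width)] += 1

for (low, high), count in table.items():
    print(f"[{low}, {high})  {count}")
```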
[631]
The next thing we want to do
is to create our y and x axes.
[637]
And we know that our counts
are going to be the y.
[642]
And you can see the counts, I
started at zero, 1, 2, 3, 4, 5.
[647]
I went up to 5 because if I look
at my table, my frequency table,
[652]
we don't seem to have counts
more than 5 it looks like.
[656]
Now for the x. Where
do we start?
[659]
Do we start at zero?
[662]
We could if we wanted
to but we don't have to.
[665]
As long as we mark it correctly
and we tell the teacher, "Hey,
[669]
I'm starting at 40," we're fine.
[672]
Some teachers don't like that.
[673]
Some teachers will accept that.
[675]
In this case we're
going to start at 40
[678]
and we're going to
end up at 100.
[680]
And we got to label it.
[681]
These are scores.
[682]
And you can see the
bin sizes are 10.
[687]
Okay. We decided that
at the beginning.
[690]
Now we're going to draw the bar.
[694]
Let's look at the first
bin, 40 through 50,
[699]
and it only has 1 count.
[701]
So the bar is drawn 1 unit.
[705]
Then 50 through 60, that's going
[709]
to be 2 units so
let's draw that.
[711]
And we got to draw it where it's
going to be together, right.
[717]
So we're going to draw
it this way right there.
[721]
So there's no gaps.
[724]
And then the next
bin, 60 through 70,
[728]
let's draw it, no gaps.
[730]
Okay. 70 through
80, again, no gaps.
[735]
That goes up to 5 by the way.
[737]
You know, it's like
how high am I going?
[741]
Well, those are the
counts, okay.
[743]
The counts tell me
how high I'm going.
[745]
70 through 80 had 5.
[748]
The next one, 80 through 90,
what is that, 80 through 90?
[753]
How high is the count?
[754]
2. So we draw 2.
[757]
And then 90 through 100,
how many counts is it?
[761]
It's going to be 1.
[763]
And we got to make sure
we give it a title.
[767]
And that's it.
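As a rough stand-in for the drawing, here is a text version of a histogram. It is only a sketch: the bars run sideways instead of upward, the counts come from the twelve scores read aloud in the video (not the full 15), and the title and the `text_histogram` name are hypothetical.

```python
bins = [(40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
# Counts for the 12 spoken scores only; the on-screen table covers 15 scores.
counts = [1, 2, 2, 4, 2, 1]

def text_histogram(bins, counts, title):
    """Build one line per bin; the bar length equals the bin's count."""
    lines = [title]
    for (low, high), count in zip(bins, counts):
        bar = "#" * count
        lines.append(f"[{low:>3}, {high:>3}) | {bar} {count}")
    return lines

lines = text_histogram(bins, counts, "Midterm scores")  # hypothetical title
print("\n".join(lines))
```

The rows sit directly on top of each other, which plays the role of having no gaps between the bars.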
[770]
So the important
thing is you got
[774]
to find the bin size
or class size.
[780]
When you find that, then you
can create your bins, you know,
[786]
your scores in this case, and
then you can start slotting
[789]
in all the values until
you have some type
[793]
of frequency table here.
[794]
And that's going to help
you create the graph.