Counting Carefully - The Base Rate Fallacy - YouTube
Channel: unknown
[0]
Hello Internet, I want to talk to you about
counting.
[2]
No, not that kind.
[5]
Not that kind either.
[6]
I want to talk to you about the kind of counting
that can easily fool you into believing something
[10]
is obviously true when it's neither obvious
nor true.
[14]
Here's Lambda.
[15]
Lambda just learned his country has a thousand
cases of some disease, let's say vampirism,
[20]
each year.
[21]
Being a little paranoid, he badgers his doctor
to test him for it.
[25]
The test has a 99.99% accuracy.
[27]
A few days later, Lambda gets a positive test
result.
[31]
Should he be worried?
[32]
What are the chances that he is sick?
[34]
If you guessed that Lambda actually has
less than a 1 in 10 chance of being sick,
[38]
congratulations!
[39]
Wait... what?
[41]
How can such a highly accurate test produce
such vague results?
[44]
Let's find out.
[45]
What do we know?
[47]
We know how many people usually get vampirism
in Lambda's country, how many people live
[51]
there otherwise and how good the test is.
[53]
But that's not what Lambda cares about.
[55]
The question he really cares about is "Am
I infected?"
[58]
or rather "What are the odds/likelihood that
I am infected?".
[62]
The answer to this comes down to counting
very carefully.
[64]
The procedure is this - we first gather everyone
relevant to our question.
[68]
Then, we split them up repeatedly into groups
based on all the information we have.
[73]
Finally, identify the group that we care about
and check how many members it is expected
[78]
to have - and perhaps, compare it with other
groups that we care about.
[81]
Remember, the odds of something happening
are simply the ratio between the various ways
[85]
in which it can happen and the ways it cannot.
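To make the "ways it can happen versus ways it cannot" definition concrete, here is a minimal sketch. The function names `odds` and `odds_to_probability` and the 1-versus-3 numbers are illustrative assumptions, not something from the video.

```python
from fractions import Fraction


def odds(ways_for: int, ways_against: int) -> Fraction:
    """Odds in favour: ways it can happen divided by ways it cannot."""
    return Fraction(ways_for, ways_against)


def odds_to_probability(o: Fraction) -> Fraction:
    """Convert odds in favour into a probability: o / (1 + o)."""
    return o / (1 + o)


# Illustrative numbers only: 1 way it can happen, 3 ways it cannot.
example = odds(1, 3)
print(example)                       # 1/3
print(odds_to_probability(example))  # 1/4
```

Odds of 1 to 3 and a probability of 1 in 4 describe the same situation; the counting arguments in this video work with either, as long as you are clear about which one you mean.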
[88]
So, here are the 100 million residents of
Lambda's country.
[92]
Just for comparison, the United States has
a little over 3 times as many people.
[96]
Of these, we know that usually about 1000
of them are infected at any given time.
[100]
Now, remember that our test has a 99.99% accuracy.
[104]
If we applied our test on everyone, we expect
the test to mess up once in every 10000 trials
[109]
more or less.
[110]
It is important to remember that these failures
occur randomly - the test doesn't care if
[115]
you are sick or healthy when messing up.
[117]
The result will look like this:
Amongst the 1000 who are infected, the test
[121]
would almost certainly identify every single
one of them as sick.
[124]
It's a really good test after all.
[127]
Amongst everyone remaining, the test would
correctly identify the vast majority of them
[130]
as healthy - well, 99.99% of them.
[134]
Unfortunately, the test will mistakenly declare
0.01% of the healthy people as infected - that's
[138]
10,000 of them.
[139]
That's what you get when dealing with such
large numbers.
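The split described here can be sketched as a small calculation. It assumes, as the video does, that "99.99% accuracy" means the test errs once per 10,000 trials regardless of whether the person is sick or healthy; the function name `expected_counts` is made up for illustration.

```python
from fractions import Fraction


def expected_counts(population: int, infected: int, error_rate: Fraction) -> dict:
    """Split a population into the four test-outcome groups (expected sizes)."""
    healthy = population - infected
    return {
        "true_positive": infected * (1 - error_rate),  # sick, flagged as sick
        "false_negative": infected * error_rate,       # sick, missed by the test
        "false_positive": healthy * error_rate,        # healthy, flagged as sick
        "true_negative": healthy * (1 - error_rate),   # healthy, cleared
    }


# Numbers from the video: 100 million residents, 1000 infected, 0.01% error rate.
counts = expected_counts(100_000_000, 1_000, Fraction(1, 10_000))
for group, size in counts.items():
    print(f"{group:>14}: {float(size):,.1f}")
```

The false-positive group comes out at just under 10,000 people, roughly ten times the size of the true-positive group of just under 1000.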
[140]
The question now is, if the test declared
you as infected, how likely is it that you
[146]
are actually infected?
[148]
Let's group everyone that the test declared
as infected - healthy people and vampires.
[152]
Lambda is somewhere in here.
[153]
Now, what do you think his chances of being
infected are?
[156]
A lot less than 99.99% for sure.
[159]
It's less than one in ten.
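The "less than one in ten" figure can be checked with a short sketch. The names `sensitivity` and `specificity` are standard testing terms rather than the video's own; setting both to 99.99% is an assumption that matches the video's statement that the test messes up equally often on sick and healthy people.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Fraction of positive results that belong to people who are truly infected."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# 1000 infected out of 100 million residents, 99.99% accuracy both ways.
prevalence = 1_000 / 100_000_000
chance = positive_predictive_value(prevalence, 0.9999, 0.9999)
print(f"{chance:.4f}")
```

This is the same count as before written as a ratio: about 1000 true positives out of about 11,000 total positives, or roughly 9%.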
[160]
Weird, right?
[161]
Now, you might be wondering why we had to
start with the entire population and whittle
[165]
our way down.
[166]
This is because when doing these counting
arguments, we form our groups based only
[170]
on the information we have.
[172]
Lambda was not showing any signs of infection.
[176]
For all we knew, he was just another resident
of his country.
[180]
What if Lambda was showing signs of infection
such as, oh I don't know, glittering or burning
[185]
under sunlight or looking like this?
[189]
We can add that to our diagram.
[191]
For the sake of argument, let's say that such
symptoms are universal amongst vampires and
[196]
uncommon amongst non-vampires, affecting only
1 in 1000 people.
[201]
If we didn't know this number, we could have
set up a survey and found out.
[205]
Then, let's test people, but only if they
are symptomatic.
[209]
The test will still catch a few glittery but
otherwise healthy people amongst all the vampires
[213]
because it still has its 0.01% error rate
but now there are far fewer of them.
[218]
Now, take a look at the set of people who
were symptomatic and were declared infected
[223]
by the test.
[224]
If Lambda had been symptomatic and had gotten
a positive test result, we could be almost
[228]
certain that he was infected.
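Here is the same counting argument with the symptom filter applied first, as a rough sketch. It uses the video's numbers (every vampire is symptomatic, 1 in 1000 non-vampires is symptomatic) and again assumes the 0.01% error rate applies equally to sick and healthy people; the variable names are illustrative.

```python
from fractions import Fraction

population = 100_000_000
infected = 1_000
error_rate = Fraction(1, 10_000)            # test messes up once per 10,000 trials
symptom_rate_healthy = Fraction(1, 1_000)   # glittery but otherwise healthy

# Step 1: keep only symptomatic people.
healthy = population - infected
symptomatic_healthy = healthy * symptom_rate_healthy
symptomatic_infected = infected             # symptoms are universal amongst vampires

# Step 2: test only the symptomatic group.
true_positives = symptomatic_infected * (1 - error_rate)
false_positives = symptomatic_healthy * error_rate

# Step 3: of everyone symptomatic AND declared infected, who is really infected?
share_infected = true_positives / (true_positives + false_positives)
print(f"symptomatic healthy people tested: {int(symptomatic_healthy):,}")
print(f"expected false positives: {float(false_positives):.1f}")
print(f"chance a symptomatic positive is infected: {float(share_infected):.3f}")
```

Filtering first shrinks the healthy group being tested from about 100 million to about 100,000, so the expected false positives drop from about 10,000 to about 10, and a positive result now means roughly a 99% chance of infection.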
[230]
This is why doctors don't order tests indiscriminately.
[233]
This curious result wasn't because the test
for vampirism was bad.
[238]
It was because the test was applied inappropriately;
the error rate of the test was comparable
[243]
to the fraction of the test subjects that
the test was trying to categorize.
[247]
Only a tiny fraction of the population was
infected and the test had a tendency to mess
[251]
up more often than that.
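To see how the comparison between error rate and infected fraction drives the result, here is a small sketch that holds the error rate at 0.01% and varies the fraction of test subjects who are infected. The function name `chance_given_positive` and the list of fractions are illustrative assumptions; only the first value (1 in 100,000) comes from the video.

```python
def chance_given_positive(infected_fraction: float, error_rate: float) -> float:
    """Chance that a positive result is a true infection, for a symmetric error rate."""
    true_positives = (1 - error_rate) * infected_fraction
    false_positives = error_rate * (1 - infected_fraction)
    return true_positives / (true_positives + false_positives)


error_rate = 1e-4  # the test's 0.01% error rate
for infected_fraction in (1e-5, 1e-4, 1e-3, 1e-2):
    chance = chance_given_positive(infected_fraction, error_rate)
    print(f"infected fraction {infected_fraction:.5f} -> chance {chance:.3f}")
```

When the infected fraction equals the error rate, a positive result is a coin flip; below that, false positives dominate, and above it the test quickly becomes trustworthy.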
[252]
It was akin to knitting lace while donning
oven-mitts.
[255]
If we instead manage to filter the test subjects
even a little bit, then the combined effect
[260]
becomes powerful enough to give us the certainty
that we want.
[263]
"Why would I care?", you cry.
[266]
Because this variety of "weird math" shows
up in everything from spam filtration and
[270]
quality control to medical testing and cyber
security.
[273]
It's important to recognize situations where
your own intuition is about to fool you, particularly
[278]
on issues that are important to you.
[279]
Here are real-world hot-button issues where
this knowledge of math is critical.
[280]
And remember, once you eliminate the impossible,
whatever remains, no matter how improbable,
[284]
must be the truth.