Alternative to SIR: Modelling coronavirus (COVID-19) with stochastic process [PART I]

Channel: Mathemaniac

A lot of channels have talked about the SIR model, including my own, but there is a very different way to model coronavirus, one that boils down to how diseases really spread from person to person. This is generally known as a branching process, because every individual here branches out to some random number of other individuals. Because this branching out is random, since an individual can produce 2 branches, 3 branches, or no branches at all, we say this is a kind of stochastic process, "stochastic" being a fancier name for random. This video will focus on the simplest type of branching process, known as the Bienaymé-Galton-Watson process, BGW for short. So what does this BGW process say about the spread of coronavirus?
Before we move on, we need to introduce a bit of terminology. All people on the same vertical line are in the same generation. We call this orange column the first generation, with the number of individuals denoted X_1. Similarly, we denote X_2, X_3 and so on. Next up, we call these two individuals on the same branch the offspring of the individual in the previous generation that spread the disease to them. The reason we use these terms is that this process was originally studied in a different context related to reproduction, but that's a story for another time.
Let's focus on the first branching here. This is a random process, so actually, it could stop branching right away, or branch to 1 individual, or 2, or 3 and so on. Because the number of branches is just X_1, the number of offspring of this first patient, we say that X_1 follows a distribution. The distribution can be more conveniently written as a table: the left column lists all the possible values of X_1, and the right column the corresponding probabilities. So the probability of X_1 being 0 is p_0, and so on. Because these are probabilities, they have to be nonnegative, and they must sum to 1.
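In code, that table is just a mapping from values to probabilities. A minimal sketch, with made-up numbers (the video does not specify the p_k's):

```python
# A hypothetical offspring distribution for X_1, written as a table
# {value: probability}. The numbers are illustrative, not from the video.
offspring = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}

# Probabilities must be nonnegative...
assert all(p >= 0 for p in offspring.values())
# ...and sum to 1.
assert abs(sum(offspring.values()) - 1.0) < 1e-12
```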
The reason this BGW process is the simplest branching process is that it assumes two things. First, all of the branchings you see here are the same, which means each follows the same distribution as the first branching; they are said to have the same offspring distribution. The second simplification is that each individual in the same generation spreads the disease independently. Independence has a specific meaning in probability theory, summarised in the formula P(A and B) = P(A)P(B). As a very quick detour, let's see why this is the formula for independence.
To see what this formula has to do with independence of two events, consider this rectangular box as the entire sample space, with the green and yellow ovals representing the events A and B respectively. The overlapping region then represents A and B happening together. For the intuitive sense of independence, the probability of B should be the same regardless of whether A happens. So if we focus just on the event A, the proportion taken up by the overlapping region should be the same as the proportion of the yellow oval within the rectangular box itself. This means the ratio of probabilities on the left side, P(A and B)/P(A), equals the probability of B. Just by rearranging, we get this definition of independence. More generally, if we have n independent events, the probability of them all happening together is just the product of the individual probabilities.
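The product rule is easy to verify by brute force on a small sample space. A sketch with two fair dice (my own example, not from the video), where A = "first die is even" and B = "second die shows 5 or 6":

```python
from itertools import product

# Sample space: two fair dice, every outcome equally likely.
space = list(product(range(1, 7), repeat=2))

A = {(a, b) for (a, b) in space if a % 2 == 0}  # first die even
B = {(a, b) for (a, b) in space if b > 4}       # second die shows 5 or 6

def P(event):
    """Probability = favourable outcomes / total outcomes."""
    return len(event) / len(space)

# Independence: P(A and B) equals P(A) * P(B).
assert abs(P(A & B) - P(A) * P(B)) < 1e-12   # 1/6 == 1/2 * 1/3
```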
Going back to the BGW model, there are two questions we want to ask. The first is: what is the number of individuals in the nth generation? Because this is a random process, that number can be 0, 1 and so on, so what we are really looking for is the distribution of X_n, detailing the probabilities of the different values of X_n. The second question is what's called the extinction probability: the probability that at some point, an entire generation of patients just stops spreading coronavirus entirely. But how do we get started on these two problems?
The main issue here is that this offspring distribution seems to need a lot of parameters to characterise the entire process: all of these probabilities can change. So we want to turn this table into just one single object. We do this via something called a generating function, which is a way to encode an infinite amount of data into just one thing. The encoding works like this: the final product will be a function of a variable, say z, written as a power series in z. The constant term is p_0, the coefficient of the z term is p_1, and so on, giving G(z) = p_0 + p_1 z + p_2 z^2 + ... As a very simple example of why this is useful, suppose the probabilities are just negative powers of 2; then using the formula for geometric series, the generating function collapses to a succinct closed formula.
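That collapse is easy to check numerically. A sketch assuming the distribution p_k = 1/2^(k+1) for k = 0, 1, 2, ... (the natural choice of negative powers of 2 summing to 1), whose geometric series sums to 1/(2 - z):

```python
# Offspring probabilities p_k = 1/2^(k+1), k = 0, 1, 2, ...
# Geometric series: G(z) = sum over k of z^k / 2^(k+1) = 1 / (2 - z), for |z| < 2.

def G_series(z, terms=200):
    # Truncated power series; 200 terms is plenty for |z| well below 2.
    return sum((z ** k) / 2 ** (k + 1) for k in range(terms))

def G_closed(z):
    return 1 / (2 - z)

# The series and the closed form agree at several test points.
for z in (0.0, 0.3, 0.9, -0.5):
    assert abs(G_series(z) - G_closed(z)) < 1e-10
```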
This hopefully explains why generating functions are actually very useful. Basically, given all the probabilities in the distribution, you can encode them pretty easily into this generating function. What's not obvious is the decoding, but it can be done. If you are a bit concerned about the rigour of all this, pause and read this.
Anyway, another useful way to see this generating function is as the weighted average of z^X_1. This is because p_0 of the time, X_1 is 0, so z^X_1 is just 1 and you get the constant term; p_1 of the time, X_1 is 1, so z^X_1 is just z and you get the z term with weight p_1; and so on. This weighted average is usually called the expected value, denoted E[z^X_1]. All of this discussion is also valid for a general X_n, the number of individuals in the nth generation, except that the probabilities, and hence the generating function, or the weighted average, are all different, and those are exactly what we are after. So basically, given the offspring distribution, we can encode it into the generating function, which can be written as a weighted average; then we will find some way to obtain the generating function for X_n, which we then decode to find the distribution of X_n, which is exactly what we want. This is a much easier approach than going directly from the offspring distribution.
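The "weighted average" reading of G(z) = E[z^X_1] can be sanity-checked by simulation: compute the weighted sum directly, then estimate the same expected value by sampling. The offspring distribution below is hypothetical:

```python
import random

# Hypothetical offspring distribution (illustrative numbers).
values = [0, 1, 2, 3]
probs = [0.3, 0.4, 0.2, 0.1]

def G(z):
    # Generating function as a weighted average: G(z) = E[z^X_1].
    return sum(p * z ** x for x, p in zip(values, probs))

random.seed(0)
z = 0.5
# Monte Carlo estimate of the same expected value.
samples = random.choices(values, weights=probs, k=200_000)
estimate = sum(z ** x for x in samples) / len(samples)

# The two agree to a couple of decimal places.
assert abs(estimate - G(z)) < 0.01
```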
In this BGW process, X_n is the number of individuals in the nth generation. Because the entire generation comes from the branchings of the previous generation, X_n is a sum of copies of X_1, since each branching is just a copy of X_1. The number of copies is X_(n-1), the number of people in the previous generation.
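This branching step is straightforward to simulate: each of the X_(n-1) individuals draws an independent copy of X_1, and X_n is the sum. A minimal sketch, again with a made-up offspring distribution:

```python
import random

random.seed(1)
values, probs = [0, 1, 2, 3], [0.3, 0.4, 0.2, 0.1]  # hypothetical

def next_generation(x_prev):
    # Each of the x_prev individuals branches independently, drawing
    # its offspring count from the same offspring distribution.
    if x_prev == 0:
        return 0
    return sum(random.choices(values, weights=probs, k=x_prev))

# One sample path: start from a single patient and follow the branchings.
path = [1]
for _ in range(10):
    path.append(next_generation(path[-1]))
print(path)
```

Note that once a generation hits 0, every later generation is 0 as well: that is the extinction event the second question asks about.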
So the generating function for X_n, written as a weighted average, can be expressed by rewriting X_n as a sum of copies of X_1. This is pretty complicated, because X_(n-1) is not a fixed value; it can vary. So let's first suppose X_(n-1) is a fixed number m, and see what we can do afterwards. Because we are calculating weighted averages, we have to sum up terms of this form, with the weight being the probability that the copies of X_1 take specific values. Because we have assumed all the spreading here is independent, we can apply the more general definition of independence, so this weight becomes a product of probabilities, as shown here. Similarly, we can rewrite the value here into a product of these powers.
Rearranging, we get a product of terms of the form of weighted values. If we first sum over all the values of a_1, only these two terms are affected, and summing terms of this form is precisely the weighted average of z^X_1, which is the generating function for X_1. The same generating function appears for the other pairs. Since there are m pairs, on the condition that X_(n-1) really is m, the weighted average is G(z) to the power m. However, X_(n-1) is really a variable, so this temporary result only holds some of the time; more precisely, G(z)^m occurs with exactly this probability. This is again in the form of weighted values, and so it can be expressed as a weighted average of these powers of G(z). But that is just the generating function for X_(n-1), applied to G(z) as its argument. In other words, G_n(z) is G_(n-1) applied to G(z). This is true for all n larger than 1, so we can apply the identity again to get G_(n-2) of G of G of z. Applying it repeatedly, the generating function for X_n is just the generating function for X_1 iterated n times.
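The iteration result G_n(z) = G(G(...G(z)...)) can be checked numerically: iterate G n times and compare against a Monte Carlo estimate of E[z^X_n] from simulated branching paths. The offspring distribution is hypothetical, as before:

```python
import random

random.seed(2)
values, probs = [0, 1, 2, 3], [0.3, 0.4, 0.2, 0.1]  # hypothetical

def G(z):
    # Generating function of X_1: the weighted average E[z^X_1].
    return sum(p * z ** x for x, p in zip(values, probs))

def G_n(z, n):
    # G iterated n times: G(G(...G(z)...)).
    for _ in range(n):
        z = G(z)
    return z

def sample_X(n):
    # Simulate X_n by branching generation by generation from X_0 = 1.
    x = 1
    for _ in range(n):
        x = sum(random.choices(values, weights=probs, k=x)) if x else 0
    return x

n, z, trials = 3, 0.5, 100_000
estimate = sum(z ** sample_X(n) for _ in range(trials)) / trials

# Simulated E[z^X_n] matches the n-fold iterate of G.
assert abs(estimate - G_n(z, n)) < 0.01
```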
This means we have finally cracked the first question, because given the offspring distribution, we can theoretically work out the distribution of X_n by decoding that generating function. As a sanity check that this is correct, let's suppose there will always be exactly 2 branches, which means the offspring distribution looks like this: the probability that X_1 equals 2 is 1, and all other probabilities are 0. The generating function is then z^2. Iterating it n times, the generating function for the nth generation is z^(2^n), meaning the number of individuals in the nth generation is 2^n, which is what we expected!
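This sanity check takes only a few lines: with G(z) = z^2, the n-fold iterate is z^(2^n), so the only possible population size in generation n is 2^n.

```python
# Deterministic case: every individual has exactly 2 offspring,
# so the offspring generating function is G(z) = z^2.
def G(z):
    return z ** 2

def G_n(z, n):
    # Iterate G n times; repeated squaring gives z^(2^n).
    for _ in range(n):
        z = G(z)
    return z

n, z = 5, 0.9
assert abs(G_n(z, n) - z ** (2 ** n)) < 1e-12

# The generating function z^(2^n) has all its weight on one value,
# so X_n is deterministically 2^n individuals.
print(2 ** n)  # 32
```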
However, the second question will be discussed in another video, which will be released in a few days. What about limitations? Or the original context that prompted the three mathematicians to study this process? Don't worry, I will cover all of these in the future.
If you enjoyed this video, be sure to give it a like and subscribe to the channel with notifications on! See you next time!