But what is a partial differential equation? | DE2 - YouTube

Channel: 3Blue1Brown

After seeing how we think about ordinary differential equations in chapter 1, we turn now to an example of a partial differential equation, the heat equation.

To set things up, imagine you have some object like a piece of metal, and you know how the heat is distributed across it at one moment; what the temperature of every individual point is. You might think of that temperature here as being graphed over the body. The question is, how will that distribution change over time, as heat flows from the warmer spots to the cooler ones?
The image on the left shows the temperature of an example plate with color, with the graph of that temperature being shown on the right, both changing with time.

To take a concrete 1d example, say you have two rods at different temperatures, where that temperature is uniform on each one. You know that when you bring them into contact, the temperature will tend towards being equal throughout the rod, but how exactly? What will the temperature distribution be at each point in time?
As is typical with differential equations, the idea is that it’s easier to describe how this setup changes from moment to moment than it is to jump to a description of the full evolution. We write this rule of change in the language of derivatives, though as you’ll see, we’ll need to expand our vocabulary a bit beyond ordinary derivatives. Don’t worry, we’ll learn how to read these equations in a minute.

Variations of the heat equation show up in many other parts of math and physics, like Brownian motion, the Black-Scholes equations from finance, and all sorts of diffusion, so there are many dividends to be had from a deep understanding of this one setup.
In the last video, we looked at ways of building understanding while acknowledging the truth that most differential equations are too difficult to actually solve. And indeed, PDEs tend to be even harder than ODEs, largely because they involve modeling infinitely many values changing in concert. But our main character now is an equation we actually can solve.

In fact, if you’ve ever heard of Fourier series, you may be interested to know that this is the physical problem which baby-faced Fourier over here was solving when he stumbled across the corner of math now so replete with his name. We’ll dig much more deeply into Fourier series in the next chapter, but I would like to give at least a little hint of the beautiful connection which is to come.
This animation is showing how lots of little rotating vectors, each rotating at some constant integer frequency, can trace out an arbitrary shape. To be clear, what’s happening is that these vectors are being added together, tip to tail, and you might imagine the last one as having a pencil at its tip, tracing some path as it goes. This tracing usually won’t be a perfect replica of the target shape, in this animation a lower case letter f, but the more circles you include, the closer it gets. This animation uses only 100 circles, and I think you’d agree the deviations from the real path are negligible. Tweaking the initial size and angle of each vector gives enough control to approximate any curve you want.
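For the curious, the rotating-vector idea can be sketched in a few lines of code. This is a minimal illustration, not the video's own implementation: each coefficient sets the initial size and angle of one vector, each integer n sets its rotation frequency, and the traced point is just their sum. The square-shaped target path here is a made-up stand-in for the letter f in the animation.

```python
import cmath

def fourier_coefficients(path, n_terms, samples=1000):
    """Approximate c_n = integral of path(t)*exp(-2*pi*i*n*t) dt
    with a simple Riemann sum, for integer frequencies -n_terms..n_terms."""
    ts = [k / samples for k in range(samples)]
    vals = [path(t) for t in ts]
    return {
        n: sum(v * cmath.exp(-2j * cmath.pi * n * t) for v, t in zip(vals, ts)) / samples
        for n in range(-n_terms, n_terms + 1)
    }

def trace(coeffs, t):
    """Tip of the last vector at time t: the sum of all rotating vectors."""
    return sum(c * cmath.exp(2j * cmath.pi * n * t) for n, c in coeffs.items())

def square_path(t):
    """A hypothetical target shape: a unit square traced once as t goes 0 to 1."""
    s = (t * 4) % 4
    if s < 1:
        return complex(-0.5 + s, -0.5)        # bottom edge
    if s < 2:
        return complex(0.5, -0.5 + (s - 1))   # right edge
    if s < 3:
        return complex(0.5 - (s - 2), 0.5)    # top edge
    return complex(-0.5, 0.5 - (s - 3))       # left edge

coeffs = fourier_coefficients(square_path, n_terms=50)
# With more terms, the traced point hugs the true square more closely.
err = max(abs(trace(coeffs, k / 200) - square_path(k / 200)) for k in range(200))
```

With 101 vectors the worst deviation from the square is already small; adding more shrinks it further, which is exactly the "more circles, closer fit" effect in the animation.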
At first, this might just seem like an idle curiosity; a neat art project but little more. In fact, the math underlying this is the same as the math describing the physics of heat flow, as you’ll see in due time.

But we’re getting ahead of ourselves. Step one is to build up to the heat equation, and for that, let’s be clear on what the function we’re analyzing is, exactly.
The heat equation
To be clear about what this graph represents, we have a rod in one dimension, and we’re thinking of it as sitting on an x-axis, so each point of the rod is labeled with a unique number, x. The temperature is some function of that position number, T(x), shown here as a graph above it. But really, since this value changes over time, we should think of this function as having one more input, t for time.

You could, if you wanted, think of the input space as a two-dimensional plane, representing space and time, with the temperature being graphed as a surface above it, each slice across time showing you what the distribution looks like at a given moment. Or you could simply think of the graph of the temperature changing over time. Both are equivalent.
This surface is not to be confused with what I was showing earlier, the temperature graph of a two-dimensional body. Be mindful of whether time is being represented with its own axis, or if it’s being represented with an animation showing literal changes over time.

Last chapter, we looked at some systems where just a handful of numbers changed over time, like the angle and angular velocity of a pendulum, describing that change in the language of derivatives. But when we have an entire function changing with time, the mathematical tools become slightly more intricate.
Because we’re thinking of this temperature as a function with multiple dimensions to its input space, in this case one for space and one for time, there are multiple different rates of change at play.

There’s the derivative with respect to x: how rapidly the temperature changes as you move along the rod. You might think of this as the slope of our surface when you slice it parallel to the x-axis; given a tiny step in the x-direction, and the tiny change to temperature caused by it, what’s the ratio? Then there’s the rate of change with time, which you might think of as the slope of this surface when we slice it in a direction parallel to the time axis.

Each one of these derivatives only tells part of the story for how the temperature function changes, so we call them “partial derivatives”.
To emphasize this point, the notation changes a little, replacing the letter d with this special curly d, sometimes called “del”. Personally, I think it’s a little silly to change the notation for this, since it’s essentially the same operation. I’d rather see notation which emphasizes that the del-T terms in these numerators refer to different changes. One refers to a small change to temperature after a small change in time; the other refers to the change in temperature after a small step in space.

To reiterate a point I made in the calculus series, I do think it’s healthy to initially read derivatives like this as a literal ratio between a small change to a function’s output and the small change to the input that caused it. Just keep in mind that what this notation is meant to convey is the limit of that ratio for smaller and smaller nudges to the input, rather than for some specific finitely small nudge. This goes for partial derivatives just as it does for ordinary derivatives.
The heat equation is written in terms of these partial derivatives. It tells us that the way this function changes with respect to time depends on how it changes with respect to space. More specifically, it’s proportional to the second partial derivative with respect to x. At a high level, the intuition is that at points where the temperature distribution curves, it tends to change in the direction of that curvature.
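Written out symbolically, that rule reads as follows, with alpha playing the role of the proportionality constant introduced a bit later in the discrete derivation:

```latex
\frac{\partial T}{\partial t}(x, t) = \alpha \, \frac{\partial^2 T}{\partial x^2}(x, t)
```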
Since a rule like this is written with partial derivatives, we call it a partial differential equation. This has the funny result that, to an outsider, the name sounds like a tamer version of ordinary differential equations, when to the contrary partial differential equations tend to tell a much richer story than ODEs.

The general heat equation applies to bodies in any number of dimensions, which would mean more inputs to our temperature function, but it’ll be easiest for us to stay focused on the one-dimensional case of a rod. As it is, graphing this in a way which gives time its own axis already pushes the visuals into three dimensions.
But where does an equation like this come from? How could you have thought this up yourself?

Well, for that, let’s simplify things by describing a discrete version of this setup, where you have only finitely many points x in a row. This is sort of like working in a pixelated universe, where instead of having a continuum of temperatures, we have a finite set of separate values. The intuition here is simple: for a particular point, if its two neighbors on either side are, on average, hotter than it is, it will heat up. If they are cooler on average, it will cool down.
Focus on three neighboring points, x1, x2, and x3, with corresponding temperatures T1, T2, and T3. What we want to compare is the average of T1 and T3 with the value of T2. When this difference is greater than 0, T2 will tend to heat up. And the bigger the difference, the faster it heats up. Likewise, if it’s negative, T2 will cool down, at a rate proportional to the difference.

More formally, the derivative of T2 with respect to time is proportional to this difference between the average value of its neighbors and its own value. Alpha, here, is simply a proportionality constant.
To write this in a way that will ultimately explain the second derivative in the heat equation, let me rearrange this right-hand side in terms of the difference between T3 and T2 and the difference between T2 and T1. You can quickly check that these two are the same. The top has half of T1, and in the bottom, there are two minuses in front of the T1, so it’s positive, and that half has been factored out. Likewise, both have half of T3. Then on the bottom, we have a negative T2 effectively written twice, so when you take half, it’s the same as the single -T2 up top.
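If you'd rather check that rearrangement numerically than by shuffling symbols, a few sample temperatures (arbitrary values, chosen just for illustration) confirm the two forms agree:

```python
# The claim: (T1 + T3)/2 - T2  equals  ((T3 - T2) - (T2 - T1)) / 2
for (t1, t2, t3) in [(10.0, 15.0, 40.0), (3.0, 3.0, 3.0), (-5.0, 2.0, 1.0)]:
    neighbor_average_form = (t1 + t3) / 2 - t2            # average of neighbors minus T2
    second_difference_form = ((t3 - t2) - (t2 - t1)) / 2  # half the difference of differences
    assert abs(neighbor_average_form - second_difference_form) < 1e-12
```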
As I said, the reason to rewrite it is that it takes a step closer to the language of derivatives. Let’s write these as delta-T1 and delta-T2. It’s the same number, but we’re adding a new perspective. Instead of comparing the average of the neighbors to T2, we’re thinking of the difference of the differences.
Here, take a moment to gut-check that this makes sense. If those two differences are the same, then the average of T1 and T3 is the same as T2, so T2 will not tend to change. If delta-T2 is bigger than delta-T1, meaning the difference of the differences is positive, notice how the average of T1 and T3 is bigger than T2, so T2 tends to increase. Likewise, if the difference of the differences is negative, meaning delta-T2 is smaller than delta-T1, it corresponds to the average of these neighbors being less than T2.
This is known in the lingo as a “second difference”. If it feels a little weird to think about, keep in mind that it’s essentially a compact way of writing this idea of how much T2 differs from the average of its neighbors, just with an extra factor of 1/2. That factor doesn’t really matter, because either way we’re writing our equation in terms of some proportionality constant. The upshot is that the rate of change for the temperature of a point is proportional to the second difference around it.
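That discrete rule is simple enough to simulate directly. The following is a minimal sketch, not the video's code: each interior point nudges toward the average of its two neighbors, at a rate set by alpha, and the two-rods-in-contact example from earlier smooths out into a gradient. Holding the endpoints at fixed temperatures is an assumption made here for simplicity.

```python
def step(temps, alpha=0.1):
    """Advance one time step: each interior point changes in proportion
    to its second difference, i.e. to the gap between it and the average
    of its two neighbors. (Endpoints are held fixed, an assumption here.)"""
    new = temps[:]
    for i in range(1, len(temps) - 1):
        second_difference = (temps[i - 1] + temps[i + 1]) / 2 - temps[i]
        new[i] = temps[i] + alpha * second_difference
    return new

# Two rods at uniform temperatures brought into contact:
temps = [90.0] * 5 + [10.0] * 5
for _ in range(2000):
    temps = step(temps)
# The sharp jump in the middle relaxes toward a smooth, monotone profile.
```

With these fixed-end boundary conditions the profile settles into a straight line between 90 and 10; with insulated ends it would instead flatten to the overall average, which is closer to the physical two-rod story.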
As we go from this finite context to the infinite, continuous case, the analog of a second difference is the second derivative. Instead of looking at the difference between temperature values at points some fixed distance apart, you consider what happens as you shrink the size of that step towards 0. And in calculus, instead of asking about absolute differences, which would approach 0, you think in terms of the rate of change; in this case, what’s the rate of change in temperature per unit distance?
Remember, there are two separate rates of change at play: how does the temperature change as time progresses, and how does the temperature change as you move along the rod. The core intuition remains the same as what we just looked at for the discrete case: to know how a point differs from its neighbors, look not just at how the function changes from one point to the next, but at how that rate of change itself changes.
This is written as del^2 T / del-x^2, the second partial derivative of our function with respect to x. Notice how this slope increases at points where the graph curves upwards, meaning the rate of change of the rate of change is positive. Similarly, that slope decreases at points where the graph curves downward, where the rate of change of the rate of change is negative.

Tuck that away as a meaningful intuition for problems well beyond the heat equation: second derivatives give a measure of how a value compares to the average of its neighbors.
Hopefully, that gives some satisfying added color to this equation. It’s pretty intuitive when reading it as saying that curved points tend to flatten out, but I think there’s something even more satisfying in seeing a partial differential equation arise, almost mechanistically, from thinking of each point as tending towards the average of its neighbors.
Take a moment to compare what this feels like to the case of ordinary differential equations. For example, if we have multiple bodies in space, tugging on each other with gravity, we have a handful of changing numbers: the coordinates for the position and velocity of each body. The rate of change for any one of these values depends on the values of the other numbers, which we write down as a system of equations. On the left, we have the derivatives of these values with respect to time, and on the right is some combination of all these values.

In our partial differential equation, we have infinitely many values from a continuum, all changing. And again, the way any one of these values changes depends on the other values. But helpfully, each one only depends on its immediate neighbors, in some limiting sense of the word neighbor. So here, the relation on the right-hand side is not some sum or product of the other numbers; it’s instead a kind of derivative, just a derivative with respect to space instead of time. In a sense, this one partial differential equation is like a system of infinitely many equations, one for each point on the rod.
When your object is spread out in more than one dimension, the equation looks quite similar, but you include the second derivative with respect to the other spatial directions as well. Adding up all of the second spatial derivatives like this is a common enough operation that it has its own special name, the “Laplacian”, often written as an upside-down triangle squared. It’s essentially a multivariable version of the second derivative, and the intuition for this equation is no different from the 1d case: this Laplacian can still be thought of as measuring how different a point is from the average of its neighbors, but now these neighbors aren’t just to the left and right, they’re all around.
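The neighbor-average reading of the Laplacian carries over directly to a grid. This is an illustrative sketch, not the general operator: on a 2d grid, the discrete Laplacian at a point is (up to a constant factor) the average of its four neighbors minus the point itself.

```python
def discrete_laplacian(grid, i, j):
    """Discrete Laplacian at interior point (i, j), up to a constant factor:
    how the average of the four grid neighbors compares to the point itself."""
    neighbor_avg = (grid[i - 1][j] + grid[i + 1][j] +
                    grid[i][j - 1] + grid[i][j + 1]) / 4
    return neighbor_avg - grid[i][j]

# A hot spot in the middle of a cold 3x3 plate (made-up numbers):
grid = [[0.0, 0.0, 0.0],
        [0.0, 8.0, 0.0],
        [0.0, 0.0, 0.0]]
# The center is hotter than its neighbors' average, so its Laplacian is
# negative, and the heat equation says that point will cool down.
center = discrete_laplacian(grid, 1, 1)
```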
I did a couple of simple videos during my time at Khan Academy on this operator, if you want to check them out.
For our purposes, let’s stay focused on one dimension. If you feel like you understand all this, pat yourself on the back. Being able to read a PDE is no joke, and it’s a powerful addition to your vocabulary for describing the world around you.

But after all this time spent interpreting the equations, I say it’s high time we start solving them, don’t you? And trust me, there are few pieces of math quite as satisfying as what poodle-haired Fourier over here developed to solve this problem. All this and more in the next chapter.
I was originally inspired to cover this particular topic when I got an early view of Steve Strogatz’s new book “Infinite Powers”. This isn’t a sponsored message or anything like that, but all cards on the table, I do have two selfish ulterior motives for mentioning it. The first is that Steve has been a really strong, perhaps even pivotal, advocate for the channel since its beginnings, and I’ve had the itch to repay the kindness for quite a while. The second is to make more people love math. That might not sound selfish, but think about it: when more people love math, the potential audience base for these videos gets bigger. And frankly, there are few better ways to get people loving the subject than to expose them to Strogatz’s writing.

If you have friends who you know would enjoy the ideas of calculus, but maybe have been intimidated by math in the past, this book really does an outstanding job communicating the heart of the subject both substantively and accessibly. Its core theme is the idea of constructing solutions to complex real-world problems from simple idealized building blocks, which as you’ll see is exactly what Fourier did here. And for those who already know and love the subject, you will still find no shortage of fresh insights and enlightening stories. Again, I know that sounds like an ad, but it’s not. I actually think you’ll enjoy the book.