Neural Networks from Scratch - P.1 Intro and Neuron Code
Channel: sentdex
What's going on everybody, welcome to the Neural Networks from Scratch series, where we will be creating neural networks end to end: the neurons, the layers, activation functions, optimization, backpropagation. All of this stuff we're going to be coding from scratch in Python. Now, everything we do, we are going to show first in truly raw Python, with no third-party libraries, and then we are going to use NumPy, for multiple reasons. NumPy just makes a lot of sense here: it's an extremely useful library, it'll cut the line count of a full application down a ton, it'll make things much faster, and NumPy is a great thing to learn. So we will show everything from scratch in Python first, and then we'll use NumPy.
Now, why would anybody want to do this to themselves? Well, first of all, it's very interesting. And even though we are effectively going to be programming our own neural network framework, that's not really the purpose here. The purpose is to actually learn how neural networks work at a very deep level, so that when we go back to whatever framework we actually use, be it PyTorch or TensorFlow's Keras, or maybe some library that doesn't even exist yet, we actually understand what we are doing.
As for myself, at least when I learned deep learning: yes, it was hard, but at the same time everything was kind of solved for me. How many layers, how many neurons per layer, what activation functions to use: all that stuff was just fed to me, and I just sort of memorized it, like "here's the activation function used for this sort of task." But I didn't understand why, and this became a problem when I tried to solve problems that had not yet been solved for me. Classifying images of handwritten digits? Pretty darn simple. Taking that a step further and classifying images of cats and dogs? Pretty simple. But then taking that just a tiny step further, classifying images that were instead frames from a video game and trying to map them to actions that I want to take in the video game? Suddenly, I'm lost.
And that's no good, because there's no way to really know where to go next. You can see that there is a problem, but actually solving more custom problems is going to require a deeper understanding of how things actually work. Unfortunately, though, a deeper understanding of deep learning can be very, very complicated.
Even just a simple forward pass of a neural network looks extremely daunting. Looking at the calculations for a neural network, it gets pretty confusing fairly quickly. You've got your input data, and every unique input to every unique neuron has a unique weight associated with it. Those get summed together per neuron, plus a bias, then run through an activation function, and we do that for every single layer, giving us the output information. From there we want to calculate a loss, which is a calculation of how wrong the neural network is, so that hopefully we can fix it. And at the end of all that, even though it was just a forward pass, it already looks extremely, extremely daunting.
Now let's take a look at the exact same formula, just in a code version. You have all your inputs times your weights. You don't have to follow along perfectly here; just look at each element and see if any of these elements look like they are over your head. They really shouldn't be. So: inputs times weights, which you can make even simpler by just doing a dot product, np.dot. Then y1 is just the maximum of zero and that output. We do this for all the layers, and then we have our softmax activation function at the very end. Then the loss turns out to be a negative log loss, due to the nature of neural networks. So this is the entire forward pass and calculation-of-loss formula.
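As a minimal sketch of what that forward pass looks like in NumPy (the layer sizes, variable names, and random weights here are my own illustrative assumptions, not the exact code shown on screen):

```python
import numpy as np

# Assumed shapes: 4 input features, one hidden layer of 3 neurons, 2 output classes
X = np.array([1.0, 2.0, 3.0, 2.5])       # input data
W1 = np.random.randn(3, 4) * 0.1         # hidden-layer weights (one per connection)
b1 = np.zeros(3)                         # hidden-layer biases (one per neuron)
W2 = np.random.randn(2, 3) * 0.1         # output-layer weights
b2 = np.zeros(2)                         # output-layer biases

# Inputs times weights, summed per neuron, plus bias: np.dot does the multiply-and-sum
y1 = np.maximum(0, np.dot(W1, X) + b1)   # y1 is the max of zero and the output

z = np.dot(W2, y1) + b2                  # output layer
exp_z = np.exp(z - np.max(z))            # softmax: exponentiate...
probs = exp_z / np.sum(exp_z)            # ...then normalize by the sum

target = 0                               # assumed index of the correct class
loss = -np.log(probs[target])            # negative log loss
print(probs, loss)
```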
I urge you to look at all of the functions and things that we are doing here and determine whether any of them are really over your head, because they shouldn't be, right? We've got very simple functions going on here. We are calculating a log and a sum; if you don't know what 'log' means, by the way, we're going to explain it. 'Sum', you should know what that means. 'Exponential': again, if you don't know what that means, we're going to explain it, but it's very simple. 'Dot product': again, if you don't know what that means, no worries, we will explain it; very simple. 'Maximum' is just the max of whatever two values you pass it; again, very simple. Some more dot products; 'transpose', again, very simple, and if you don't know it, we will explain it anyway. That's it! None of this is over your head, I promise you.
As far as prerequisites go, the only expectation I have of viewers is that you understand programming and object-oriented programming (OOP); otherwise, you're going to feel kind of lost. If you're coming from a different programming language, you will probably find Python simple, and you can probably even use the language you're already familiar with: everything is so low-level here that you should be able to follow along in any other language you want. So feel free to do that if you'd like. Otherwise, if you want to follow along in Python, make sure you know the basics and OOP; I'll put links in the description for both of those if you need to brush up.
Next is the version of Python. If you follow along in Python, we are going to do things like f-strings, so I'm going to be on Python 3.7.
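(For reference, an f-string is Python's inline string-formatting syntax, available from Python 3.6 onward; a trivial example:)

```python
output = 35.7
print(f"The neuron's output is {output}")  # f-strings require Python 3.6+
```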
As for the NumPy version, I can't think of any reason why it would matter, but I'll put that information in the description just in case any function does change. Like I said, everything is so low-level that this series should be good for like 10 years. Let's hope for that :)
So those are the prerequisites; it's really not much. I don't expect anybody to have any background knowledge of deep learning. If you do know things about deep learning: yes, we are going to cover the fundamentals, hopefully quickly, just so people understand what exactly we are aiming for here. And then the bulk of your understanding of how neural networks work is going to come from us just building these neural networks. If things feel a little fuzzy to you, that's probably normal, to be honest. I think once you build a neural network from scratch, that is all the understanding you are going to need. So you're not expected to know math or anything like that.
For me, in college, the only math class I took was 'Math Fundamentals', and I don't think we did any math calculation at all in that class. It was definitely a joke. So if I can do this, I know you can do this too. If you want to brush up on math, there's Khan Academy for the linear algebra and calculus stuff, but I wouldn't even suggest you go through the full series on either of those topics. You can use those more as a way to spot-check issues that you still find confusing.
With that in mind, this series is also provided in conjunction with the 'Neural Networks from Scratch' book. We are going to be covering the same material for the most part, though the book might be a little more verbose. The series is obviously free, but the book has various prices depending on what you want: e-book, softcover, or hardcover, and we ship everywhere in the world. Whichever version of the book you buy, you always have access to the e-book, and that gives you the opportunity to access the Google Docs draft. Currently it's in draft form; at some point it won't be a draft anymore, but you can highlight, post comments, and ask questions inline with the text. Also, if you're impatient, you can read ahead: even though it's a draft right now, it's basically complete end to end through training the neural network, and we're doing testing right now, so we are already quite a bit ahead of where the videos are. I would use the book and the videos as reviews of each other: maybe read the book before watching the video and then use the video as a review of what you learned in the book, or vice versa. This is a topic you're not going to blow through in a weekend; it's going to require multiple sittings, multiple environments, and ideally multiple mediums. If you're interested in the book, you can get it at 'nnfs.io'.
So, we call these neural networks because they look, visually, like a network. You've got your neurons, which in this case are the blue circles, connected via those orange lines. In this case, we have the input layer, two hidden layers of 4 neurons each, and then your output layer. Now, data is going to get passed forward through this. It starts at your input layer; in this case, we only have two pieces of data coming in. Those get passed forward to the first hidden layer, then that hidden layer passes data to the second hidden layer, and finally to the output layer, where we hope it will output something that we want. For example, based on some sensor data, maybe we want to predict failure or not-failure. You could have either one neuron or, in this case, two neurons: the top neuron might be a 'failure' neuron, and the bottom neuron a 'not failure' neuron. Depending on which one has the higher value, that's the prediction.
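A tiny sketch of that 'higher value wins' idea, with made-up output values:

```python
import numpy as np

# Hypothetical values of the two output neurons for one prediction
outputs = np.array([0.82, 0.18])
labels = ["failure", "not failure"]
print(labels[np.argmax(outputs)])  # the strongest neuron gives the predicted class
```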
Now, the end goal of neural networks, like most machine learning, is to take some input data and produce the desired output data. In this case, we've got images of cats and dogs. We hope that we can pass an image through, in pixel form, to our neural network, and if it's a dog, that final output neuron on top will be the strongest; if it's a cat, the final output neuron on the bottom will be the strongest. And we do this by tuning the weights and biases: all of those unique weights and biases get tuned, and that is the actual training process. It's tuning them in such a way that, hopefully, we can take data that this neural network has never seen, give it pictures of cats and dogs it has never seen, and have it accurately predict them.
So how and why do neural networks work? Well, if you just look at them and really consider what's going on: every neuron is connected to the subsequent layer of neurons. Each of those orange lines, each connection, is a unique weight, and every neuron has a unique bias. What this ends up giving us is a huge number of uniquely tunable parameters that go into this gigantic function. For example, with three hidden layers of 64 neurons each, we have 9,164 tunable parameters in this gigantic function, and each of those parameters impacts the outputs of the next neurons, and so on. So what we end up having are these complex relationships that can, in theory, be mapped.
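To make that parameter count concrete, here is a small helper that counts the weights and biases of a fully connected network. The input and output sizes below are my own assumptions for illustration; the 9,164 figure in the video is for its particular configuration:

```python
def count_parameters(layer_sizes):
    # One unique weight per connection, one unique bias per neuron
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Assumed sizes: 2 inputs, three hidden layers of 64 neurons, 2 outputs
print(count_parameters([2, 64, 64, 64, 2]))  # 8642 tunable parameters
```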
So, to me, the impressive thing about neural networks is not necessarily all of those connections; those aren't really complex to understand. The hard part of neural networks and deep learning is figuring out how to tune such a thing.
Alright! Since I think it would be lame not to post any code at all in this first video, we are going to begin to code a neuron. But first, I want to go over the version numbers real quick, because quite frankly I am going to forget to put them in the description. I'm using Python 3.7.7, NumPy 1.18.2, and Matplotlib 3.2.1. Again, all of this should work very far into the future, but just in case it doesn't, those are the exact versions if anybody needs them to follow along.
Now, let's go ahead and begin to code. Just to say, I am using Sublime Text; you can use whatever editor you want. Everyone's got a really strong opinion on editors. I'm going to be using Sublime Text, though there might be a time when we use a Jupyter Notebook. Who knows; I'm going to use whatever makes the most sense to me at the time, which may not make sense to you, or to me later.
Anyway, continuing on. Let's pretend we are coding a neuron that's somewhere in a densely connected, feed-forward, multilayer perceptron model, using all the big words. Don't worry about knowing what those mean; by the end of this, you will know all of those words. Part of the problem with learning deep learning is that people use 3 or 4 different words for the exact same thing, which can be very daunting. So for now, we are going to call this a neuron, and it's somewhere in our neural network. Now, in this fully connected neural network, every neuron has a unique connection to every single neuron in the previous layer. So, let's say there are 3 neurons feeding into the neuron that we are going to build. We don't know much about those neurons, but we know they are outputting some values, and their outputs become the inputs of the neuron we are coding. We're just going to make up some numbers and say the inputs are 1.2, 5.1, and 2.1; these are the outputs from the three neurons in the previous layer. Every unique input is also going to have a unique weight associated with it. So we are going to say weights, and you should know how many weights we are going to have: since we have three inputs, we know we are going to have three weights, say 3.1, 2.1, and 8.7. I'm just making up these numbers; this is just for beginning to code how neurons are going to work. So you've got your inputs and your weights, and then every unique neuron has a unique bias: so the bias equals 3.
So now, the first step for a neuron is to add up all the inputs times the weights, plus the bias. This is relatively simple in very raw Python; no loops are required at this stage, just some basic knowledge of Python lists. We're just going to say, basically, that the output so far of this neuron is:
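Reconstructing the code written on screen here from the values above and the 35.7 result it prints:

```python
inputs = [1.2, 5.1, 2.1]   # outputs of the three neurons in the previous layer
weights = [3.1, 2.1, 8.7]  # one unique weight per input
bias = 3                   # one unique bias for this neuron

# Add up each input times its weight, then add the bias
output = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + bias
print(output)  # 35.7
```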
That is basically it so far. There will be other things happening soon enough, but for now, that is the output of our neuron. So we'll just print the output and run it: 35.7.
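As a preview of the NumPy version promised earlier (my own sketch, not code from this video), the same neuron is a one-liner with a dot product:

```python
import numpy as np

inputs = [1.2, 5.1, 2.1]
weights = [3.1, 2.1, 8.7]
bias = 3

# np.dot multiplies elementwise and sums: inputs times weights, summed, plus the bias
print(np.dot(weights, inputs) + bias)  # 35.7
```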
If you're new to Sublime Text for some reason: Ctrl+B to run, though you'll have to set up the build system to run with Python first. I would expect that people know how to work with a programming language, but if not, feel free to either comment below or join us at discord.gg/sentdex, and we'll be happy to help.
Anyway, that's it for now. If you've got questions, comments, concerns, whatever, moving forward, feel free to ask them below. Otherwise, we're just going to keep slowly chipping away at Neural Networks from Scratch. As you can see, it's pretty darn simple so far; for the most part, it's just adding a bunch of things up. We are going to break this down to the point where every little additional step looks an awful lot like that. There are a couple of points that may be a little more challenging, but the goal is to break things down so much that it is painfully simple. So I will see you guys in the next video :)