Computer Networks: Crash Course Computer Science #28 - YouTube

Channel: CrashCourse

Hi, I’m Carrie Anne, and welcome to CrashCourse Computer Science!

The internet is amazing. In just a few keystrokes, we can stream videos on YouTube -- Hello! -- read articles on Wikipedia, order supplies on Amazon, video chat with friends, and tweet about the weather. Without a doubt, the ability for computers, and their users, to send and receive information over a global telecommunications network forever changed the world.
150 years ago, sending a letter from London to California would have taken two to three weeks, and that’s if you paid for express mail. Today, an email makes that trip in a fraction of a second. This million-fold improvement in latency -- that’s the time it takes for a message to transfer -- juiced up the global economy, helping the modern world move at the speed of light on fiber optic cables spanning the globe.
You might think that computers and networks always went hand in hand, but actually most computers pre-1970 were humming away all alone. However, as big computers began popping up everywhere, and low-cost machines started to show up on people’s desks, it became increasingly useful to share data and resources, and the first networks of computers appeared. Today, we’re going to start a three-episode arc on how computer networks came into being, and the fundamental principles and techniques that power them.
INTRO
The first computer networks appeared in the 1950s and 60s. They were generally used within an organization -- like a company or research lab -- to facilitate the exchange of information between different people and computers. This was faster and more reliable than the previous method of having someone walk a pile of punch cards, or a reel of magnetic tape, to a computer on the other side of a building -- a method later dubbed a “sneakernet”.
A second benefit of networks was the ability to share physical resources. For example, instead of each computer having its own printer, everyone could share one attached to the network. It was also common on early networks to have large, shared storage drives -- ones too expensive to attach to every machine. These relatively small networks of close-by computers are called Local Area Networks, or LANs. A LAN could be as small as two machines in the same room, or as large as a university campus with thousands of computers.
Although many LAN technologies were developed and deployed, the most famous and successful was Ethernet, developed in the early 1970s at Xerox PARC, and still widely used today.
In its simplest form, a series of computers are connected to a single, common Ethernet cable. When a computer wants to transmit data to another computer, it writes the data, as an electrical signal, onto the cable. Of course, because the cable is shared, every computer plugged into the network sees the transmission, but doesn’t know whether the data is intended for it or for another computer. To solve this problem, Ethernet requires that each computer have a unique Media Access Control address, or MAC address. This unique address is put into a header that prefixes any data sent over the network. So, computers simply listen to the Ethernet cable, and only process data when they see their own address in the header. This works really well; every computer made today comes with its own unique MAC address for both Ethernet and WiFi.
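[Editor’s note: the listen-and-filter behavior described above can be sketched in a few lines of Python. This is an illustrative model only -- the frame layout and helper names are hypothetical, not real Ethernet internals.]

```python
# Toy model of MAC-address filtering on a shared carrier.
# Every station sees every frame; each station only processes frames
# whose destination MAC matches its own (or the broadcast address).

BROADCAST = "ff:ff:ff:ff:ff:ff"

def should_process(frame, my_mac):
    """Return True if this station should hand the frame up the stack."""
    dest = frame["dest_mac"]
    return dest == my_mac or dest == BROADCAST

frame = {"dest_mac": "aa:bb:cc:dd:ee:01", "payload": b"hello"}
print(should_process(frame, "aa:bb:cc:dd:ee:01"))  # True: addressed to us
print(should_process(frame, "aa:bb:cc:dd:ee:02"))  # False: for someone else
```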
The general term for this approach is Carrier Sense Multiple Access, or CSMA for short. The “carrier”, in this case, is any shared transmission medium that carries data -- copper wire in the case of Ethernet, and the air carrying radio waves for WiFi. Many computers can simultaneously sense the carrier, hence the “Sense” and “Multiple Access”, and the rate at which a carrier can transmit data is called its Bandwidth.
Unfortunately, using a shared carrier has one big drawback. When network traffic is light, computers can simply wait for silence on the carrier, and then transmit their data. But as network traffic increases, the probability that two computers will attempt to write data at the same time also increases. This is called a collision, and the data gets all garbled up, like two people trying to talk on the phone at the same time. Fortunately, computers can detect these collisions by listening to the signal on the wire. The most obvious solution is for computers to stop transmitting, wait for silence, then try again. Problem is, the other computer is going to try that too, and other computers on the network that have been waiting for the carrier to go silent will try to jump in during any pause. This just leads to more and more collisions. Soon, everyone is talking over one another and has a backlog of things they need to say, like breaking up with a boyfriend over a family holiday dinner. Terrible idea!
Ethernet had a surprisingly simple and effective fix. When transmitting computers detect a collision, they wait for a brief period before attempting to re-transmit. As an example, let’s say 1 second. Of course, this doesn’t work if all the computers use the same wait duration -- they’ll just collide again one second later. So, a random period is added: one computer might wait 1.3 seconds, while another waits 1.5 seconds. With any luck, the computer that waited 1.3 seconds will wake up, find the carrier to be silent, and start transmitting. When the 1.5-second computer wakes up a moment later, it’ll see the carrier is in use, and will wait for the other computer to finish. This definitely helps, but doesn’t totally solve the problem, so an extra trick is used.
As I just explained, if a computer detects a collision while transmitting, it will wait 1 second, plus some random extra time. However, if it collides again, which suggests network congestion, instead of waiting another 1 second, this time it will wait 2 seconds. If it collides again, it’ll wait 4 seconds, then 8, then 16, and so on, until it’s successful. With computers backing off, the rate of collisions goes down, and data starts moving again, freeing up the network. Family dinner saved! This “backing off” behavior using an exponentially growing wait time is called Exponential Backoff. Both Ethernet and WiFi use it, and so do many other transmission protocols.
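[Editor’s note: the 1, 2, 4, 8, 16 second scheme with random jitter can be sketched directly. This follows the video’s seconds-based illustration -- real Ethernet hardware uses tiny slot times and random slot counts, not whole seconds.]

```python
import random

def backoff_delay(attempt, base=1.0):
    """Exponential backoff with jitter: after collision number `attempt`
    (counting from 0), wait base * 2**attempt seconds, plus up to one
    extra random second so colliding stations desynchronize."""
    return base * (2 ** attempt) + random.random()

# collision 1 -> wait ~1-2 s, collision 2 -> ~2-3 s, collision 3 -> ~4-5 s ...
for attempt in range(4):
    print(f"collision #{attempt + 1}: wait {backoff_delay(attempt):.2f} s")
```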
But even with clever tricks like Exponential Backoff, you could never have an entire university’s worth of computers on one shared Ethernet cable. To reduce collisions and improve efficiency, we need to shrink the number of devices on any given shared carrier -- what’s called the Collision Domain. Let’s go back to our earlier Ethernet example, where we had six computers on one shared cable, a.k.a. one collision domain.
To reduce the likelihood of collisions, we can break this network into two collision domains by using a Network Switch. It sits between our two smaller networks, and only passes data between them if necessary. It does this by keeping a list of which MAC addresses are on which side of the network. So if A wants to transmit to C, the switch doesn’t forward the data to the other network -- there’s no need. This means that if E wants to transmit to F at the same time, the network is wide open, and the two transmissions can happen at once. But if F wants to send data to A, then the switch passes it through, and the two networks are both briefly occupied.
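[Editor’s note: the switch’s forwarding decision can be modeled as a simple table lookup. This is a minimal sketch assuming the six stations A-F from the example, split into “left” and “right” segments -- the names and layout are illustrative.]

```python
# Toy model of a two-segment network switch: a table records which
# MAC address sits on which side, and a frame only crosses the switch
# when source and destination are on different sides.

mac_table = {
    "A": "left",  "B": "left",  "C": "left",
    "D": "right", "E": "right", "F": "right",
}

def should_forward(src, dest):
    """Forward across the switch only if the destination is on the
    other side from the sender."""
    return mac_table[src] != mac_table[dest]

print(should_forward("A", "C"))  # False: traffic stays on the left segment
print(should_forward("F", "A"))  # True: frame must cross the switch
```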
This is essentially how big computer networks are constructed, including the biggest one of all -- The Internet -- which literally inter-connects a bunch of smaller networks, allowing inter-network communication. What’s interesting about these big networks is that there are often multiple paths to get data from one location to another. And this brings us to another fundamental networking topic: routing. The simplest way to connect two distant computers, or networks, is by allocating a communication line for their exclusive use. This is how early telephone systems worked.
For example, there might be 5 telephone lines running between Indianapolis and Missoula. If John picked up the phone wanting to call Hank in the 1910s, John would tell a human operator where he wanted to call, and they’d physically connect John’s phone line into an unused line running to Missoula. For the length of the call, that line was occupied, and if all 5 lines were already in use, John would have to wait for one to become free. This approach is called Circuit Switching, because you’re literally switching whole circuits to route traffic to the correct destination. It works fine, but it’s relatively inflexible and expensive, because there’s often unused capacity. On the upside, once you have a line to yourself -- or if you have the money to buy one for your private use -- you can use it to its full capacity, without having to share. For this reason, the military, banks, and other high-importance operations still buy dedicated circuits to connect their data centers.
Another approach for getting data from one place to another is Message Switching, which is sort of like how the postal system works. Instead of a dedicated route from A to B, messages are passed through several stops. So if John writes a letter to Hank, it might go from Indianapolis to Chicago, then hop to Minneapolis, then Billings, and then finally make it to Missoula. Each stop knows where to send it next, because it keeps a table of where to pass letters given a destination address. What’s neat about Message Switching is that it can use different routes, making communication more reliable and fault-tolerant. Sticking with our mail example, if there’s a blizzard in Minneapolis grinding things to a halt, the Chicago mail hub can decide to route the letter through Omaha instead. In our example, the cities are acting like network routers.
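[Editor’s note: the “table of where to pass letters” can be sketched as a next-hop lookup. The route entries below are illustrative, loosely matching the mail example -- real routers build such tables automatically rather than by hand.]

```python
# Toy next-hop routing table: each hub maps a destination to the next
# stop on the way there, and a hub can swap in an alternate route
# when a link goes down.

routes = {
    "Chicago":     {"Missoula": "Minneapolis"},
    "Minneapolis": {"Missoula": "Billings"},
    "Billings":    {"Missoula": "Missoula"},  # final leg: deliver directly
}

def next_hop(here, dest):
    """Look up where this hub should send a message bound for `dest`."""
    return routes[here][dest]

print(next_hop("Chicago", "Missoula"))     # Minneapolis
routes["Chicago"]["Missoula"] = "Omaha"    # blizzard in Minneapolis: reroute
print(next_hop("Chicago", "Missoula"))     # Omaha
```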
The number of hops a message takes along a route is called the hop count. Keeping track of the hop count is useful, because it can help identify routing problems. For example, let’s say Chicago thinks the fastest route to Missoula is through Omaha, but Omaha thinks the fastest route is through Chicago. That’s bad, because both cities are going to look at the destination address, Missoula, and end up passing the message back and forth between them, endlessly. Not only does this waste bandwidth, it’s a routing error that needs to get fixed! This kind of error can be detected because the hop count is stored with the message and updated along its journey. If you start seeing messages with high hop counts, you can bet something has gone awry in the routing! The threshold at which a message gets discarded is called the Hop Limit.
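[Editor’s note: hop-count bookkeeping can be sketched in a few lines. The field names and the limit of 10 are illustrative -- IP does this with a Time to Live field that routers decrement.]

```python
# Sketch of a hop limit breaking a routing loop: each router bumps
# the message's hop count and drops the message once the count
# exceeds the limit, instead of forwarding it forever.

HOP_LIMIT = 10  # illustrative threshold

def forward(message):
    """Increment the hop count; return False (drop) past the hop limit."""
    message["hops"] += 1
    return message["hops"] <= HOP_LIMIT

msg = {"dest": "Missoula", "hops": 0}
# Simulate Chicago and Omaha bouncing the message back and forth:
bounces = 0
while forward(msg):
    bounces += 1
print(f"dropped after {bounces} hops")  # the loop is broken at the limit
```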
A downside to Message Switching is that messages are sometimes big. So they can clog up the network, because the whole message has to be transmitted from one stop to the next before continuing on its way. While a big file is transferring, that whole link is tied up. Even if you have a tiny, one-kilobyte email trying to get through, it either has to wait for the big file transfer to finish, or take a less efficient route. That’s bad.
The solution is to chop up big transmissions into many small pieces, called packets. Just like with Message Switching, each packet contains a destination address on the network, so routers know where to forward them. This format is defined by the “Internet Protocol”, or IP for short, a standard created in the 1970s. Every computer connected to a network gets an IP Address. You’ve probably seen these written as four 8-bit numbers with dots in between. For example, 172.217.7.238 is an IP address for one of Google’s servers.
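[Editor’s note: the chopping-up step can be sketched as follows. The packet layout, sequence numbers, and helper name are hypothetical, not the real IP packet format.]

```python
# Illustrative packetization: split a message into small packets, each
# stamped with the destination IP (so routers know where to forward it)
# and a sequence number (so the receiver can put them back in order).

def packetize(data, dest_ip, size=4):
    """Split `data` into packets of up to `size` bytes, addressed to `dest_ip`."""
    return [
        {"dest": dest_ip, "seq": i, "payload": data[off:off + size]}
        for i, off in enumerate(range(0, len(data), size))
    ]

for packet in packetize(b"hello, internet!", "172.217.7.238"):
    print(packet)
```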
With millions of computers online, all exchanging data, bottlenecks can appear and disappear in milliseconds. Network routers are constantly trying to balance the load across whatever routes they know, to ensure speedy and reliable delivery -- this is called congestion control. Sometimes different packets from the same message take different routes through a network. This opens the possibility of packets arriving at their destination out of order, which is a problem for some applications. Fortunately, there are protocols that run on top of IP, like TCP/IP, that handle this issue. We’ll talk more about that next week.
Chopping up data into small packets, and passing these along flexible routes with spare capacity, is so efficient and fault-tolerant that it’s what the whole internet runs on today. This routing approach is called Packet Switching. It also has the nice property of being decentralized, with no central authority or single point of failure. In fact, the threat of nuclear attack is why packet switching was developed during the Cold War! Today, routers all over the globe work cooperatively to find efficient routings, exchanging information with each other using special protocols, like the Internet Control Message Protocol (ICMP) and the Border Gateway Protocol (BGP).
The world’s first packet-switched network, and the ancestor of the modern internet, was the ARPANET, named after the US agency that funded it: the Advanced Research Projects Agency. Here’s what the entire ARPANET looked like in 1974. Each smaller circle is a location, like a university or research lab, that operated a router. They also plugged in one or more computers -- you can see PDP-1s, IBM System/360s, and even an ATLAS in London connected over a satellite link. Obviously, the internet has grown by leaps and bounds in the decades since. Today, instead of a few dozen computers online, it’s estimated to be nearing 10 billion. And it continues to grow rapidly, especially with the advent of WiFi-connected refrigerators and other smart appliances, forming an “Internet of Things”.
So that’s part one -- an overview of computer networks. Is it a series of tubes? Well, sort of. Next week we’ll tackle some higher-level transmission protocols, slowly working our way up to the World Wide Web. I’ll see you then!