đ
Data Structures: Crash Course Computer Science #14 - YouTube
Channel: CrashCourse
[3]
Hi, I'm Carrie Anne, and welcome to Crash
Course Computer Science!
[5]
Last episode, we discussed a few example classic
algorithms, like sorting a list of numbers
[9]
and finding the shortest path in a graph.
[11]
What we didnât talk much about, is how the
data the algorithms ran on was stored in computer
[16]
memory.
[16]
You donât want your data to be like John
Greenâs college dorm room, with food, clothing
[19]
and papers strewn everywhere.
[21]
Instead, we want our data to be structured,
so that itâs organized, allowing things
[25]
to be easily retrieved and read.
[27]
For this, computer scientists use Data Structures!
[29]
INTRO
[39]
We already introduced one basic data structure
last episode, Arrays, also called lists or
[43]
Vectors in some languages.
[45]
These are a series of values stored in memory.
[47]
So instead of just a single value being saved
into a variable, like âj equals 5â, we
[51]
can define a whole series of numbers, and
save that into an array variable.
[55]
To be able to find a particular value in this
array, we have to specify an index.
[59]
Almost all programing languages start arrays
at index 0, and use a square bracket syntax
[64]
to denote array access.
[65]
So, for example, if we want to add the values
in the first and third spots of our array
[70]
âjâ, and save that into a variable âaâ,
we would write a line of code like this.
[74]
How an array is stored in memory is pretty
straightforward.
[76]
For simplicity, letâs say that the compiler
chose to store ours at memory location 1,000.
[81]
The array contains 7 numbers, and these are stored one after another in memory, as seen here.
[86]
So when we write âj index of 0â, the computer
goes to memory location 1,000, with an offset
[91]
of 0, and we get the value 5.
[93]
If we wanted to retrieve âj index of 5â,
our program goes to memory location 1000,
[98]
plus an offset of 5, which in this case, holds
a value of 4.
[102]
Itâs easy to confuse the fifth number in
the array with the number at index 5.
[105]
They are not the same.
[106]
Remember, the number at index 5 is the 6th
number in the array because the first number
[110]
is at index 0.
[111]
Arrays are extremely versatile data structures,
used all the time, and so there are many functions
[115]
that can handle them to do useful things.
[117]
For example, pretty much every programming
language comes with a built-in sort function,
[121]
where you just pass in your array, and it
comes back sorted.
[124]
So thereâs no need to write that algorithm
from scratch.
[126]
Very closely related are Strings, which are
just arrays of characters, like letters, numbers,
[131]
punctuation and other written symbols.
[133]
We talked about how computers store characters
way back in Episode 4.
[136]
Most often, to save a string into memory,
you just put it in quotes, like so.
[140]
Although it doesnât look like an array,
it is.
[143]
Behind the scenes, the memory looks like this.
[144]
Note that the string ends with a zero in memory.
[147]
Itâs not the character zero, but the binary
value 0.
[151]
This is called the null character, and denotes
the end of the string in memory.
[154]
This is important because if I call a function
like âprint quoteâ, which writes the string
[158]
to the screen, it prints out each character
in turn starting at the first memory location,
[162]
but it needs to know when to stop!
[164]
Otherwise, it would print out every single
thing in memory as text.
[167]
The zero tells string functions when to stop.
[170]
Because computers work with text so often,
there are many functions that specifically
[174]
handle strings.
[175]
For example, many programming languages have
a string concatenation function, or âstrcatâ,
[179]
which takes in two strings, and copies the
second one to the end of the first.
[183]
We can use arrays for making one dimensional
lists, but sometimes you want to manipulate
[187]
data that is two dimensional, like a grid
of numbers in a spreadsheet, or the pixels
[191]
on your computer screen.
[192]
For this, we need a Matrix.
[194]
You can think of a Matrix as an array of arrays!
[197]
So a 3 by 3 matrix is really 2 an array of
size 3, with each index storing an array of
[201]
size 3.
[203]
We can initialize a matrix like so.
[204]
In memory, this is packed together in order
like this.
[207]
To access a value, you need to specify two
indexes, like âJ index of 2, then index
[212]
of 1â - this tells the computer youâre
looking for the item in subarray 2 at position 1.
[217]
And this would give us the value 12.
[219]
The cool thing about matrices is weâre not
limited to 3 by 3 -- we can make them any
[222]
size we want -- and we can also make them
any number of dimensions we want.
[226]
For example, we can create a five dimensional
matrix and access it like this.
[230]
Thatâs right, you now know how to access
a five dimensional matrix -- tell your friends!
[234]
So far, weâve been storing individual numbers
or letters into our arrays or matrices.
[239]
But often itâs useful to store a block of
related variables together.
[242]
Like, you might want to store a bank account
number along with its balance.
[245]
Groups of variables like these can be bundled
together into a Struct.
[249]
Now we can create variables that arenât
just single numbers, but are compound data
[252]
structures, able to store several pieces of
data at once.
[256]
We can even make arrays of structs that we
define, which are automatically bundled together
[260]
in memory.
[261]
If we access, for example, J index of 0, we
get back the whole struct stored there, and
[265]
we can pull the specific account number and
balance data we want.
[268]
This array of structs, like any other array,
gets created at a fixed size that canât
[272]
be enlarged to add more items.
[274]
Also, arrays must be stored in order in memory,
making it hard to add a new item to the middle.
[278]
But, the struct data structure can be used
for building more complicated data structures
[282]
that avoid these restrictions.
[284]
Letâs take a look at this struct thatâs
called a ânodeâ.
[286]
It stores a variable, like a number, and also
a pointer.
[289]
A pointer is a special variable that points,
hence the name, to a location in memory.
[293]
Using this struct, we can create a linked
list, which is a flexible data structure that
[297]
can store many nodes.
[299]
It does this by having each node point to
the next node in the list.
[302]
Letâs imagine we have three node structs
saved in memory, at locations 1000, 1002 and 1008.
[309]
They might be spaced apart, because they were
created at different times, and other data
[313]
can sit between them.
[314]
So, you see that the first node contains the
value 7, and the location 1008 in its ânextâ
[319]
pointer.
[320]
This means that the next node in the linked
list is located at memory location 1008.
[324]
Looking down the linked list, to the next
node, we see it stores the value 112 and points
[328]
to another node at location 1002.
[331]
If we follow that, we find a node that contains
the value 14 and points back to the first
[336]
node at location 1000.
[337]
So this linked list happened to be circular,
but it could also have been terminated by
[341]
using a next pointer value of 0 -- the null
value -- which would indicate weâve reached
[345]
the end of the list.
[347]
When programmers use linked lists, they rarely
look at the memory values stored in the next
[350]
pointers.
[351]
Instead, they can use an abstraction of a
linked list, that looks like this, which is
[355]
much easier to conceptualize.
[356]
Unlike an array, whose size has to be pre-defined,
linked lists can be dynamically extended or
[361]
shortened.
[362]
For example, we can allocate a new node in
memory, and insert it into this list, just
[366]
by changing the next pointers.
[367]
Linked Lists can also easily be re-ordered,
trimmed, split, reversed, and so on.
[372]
Which is pretty nifty!
[373]
And pretty useful for algorithms like sorting,
which we talked about last week.
[376]
Owing to this flexibility, many more-complex data structures are built on top of linked lists
[381]
The most famous and universal are queues and
stacks.
[383]
A queue â like the line at your post office
â goes in order of arrival.
[387]
The person who has been waiting the longest,
gets served first.
[390]
No matter how frustrating it is that all you
want to do is buy stamps and the person in
[393]
front of you seems to be mailing 23 packages.
[396]
But, regardless, this behavior is called First-In
First-Out, or FIFO.
[399]
Thatâs the first part.
[401]
Not the 23 packages thing.
[402]
Imagine we have a pointer, named âpost office
queueâ, that points to the first node in
[406]
our linked list.
[407]
Once weâre done serving Hank, we can read
Hankâs next pointer, and update our âpost
[411]
office queueâ pointer to the next person
in the line.
[414]
Weâve successfully dequeued Hank -- heâs
gone, done, finished.
[418]
If we want to enqueue someone, that is, add
them to the line, we have to traverse down
[422]
the linked list until we hit the end, and
then change that next pointer to point to
[425]
the new person.
[426]
With just a small change, we can use linked
lists as stacks, which are LIFOâŠ
[430]
Last-In First-Out.
[431]
You can think of this like a stack of pancakes...
as you make them, you add them to the top
[435]
of stack.
[436]
And when you want to eat one, you take them
from the top of the stack.
[439]
Delicious!
[440]
Instead of enqueueing and dequeuing, data
is pushed onto the stack and popped from the stacks.
[445]
Yep, those are the official terms!
[447]
If we update our node struct to contain not
just one, but two pointers, we can build trees,
[452]
another data structure thatâs used in many
algorithms.
[454]
Again, programmers rarely look at the values
of these pointers, and instead conceptualize
[458]
trees like this:
The top most node is called the root.
[462]
And any nodes that hang from other nodes are
called children nodes.
[465]
As you might expect, nodes above children
are called parent nodes.
[468]
Does this example imply that Thomas Jefferson
is the parent of Aaron Burr?
[471]
Iâll leave that to your fanfiction to decide.
[474]
And finally, any nodes that have no children -- where the tree ends -- are called Leaf Nodes.
[478]
In our example, nodes can have up to two children,
and for that reason, this particular data
[483]
structure is called a binary tree.
[484]
But you could just as easily have trees with
three, four or any number of children by modifying
[489]
the data structure accordingly.
[490]
You can even have tree nodes that use linked
lists to store all the nodes they point to.
[495]
An important property of trees â both in
real life and in data structures â is that
[498]
thereâs a one-way path from roots to leaves.
[500]
Itâd be weird if roots connected to leaves,
that connected to roots.
[503]
For data that links arbitrarily, that include
things like loops, we can use a graph data
[508]
structure instead.
[509]
Remember our graph from last episode of cities
connected by roads?
[512]
This can be stored as nodes with many pointers,
very much like a tree, but there is no notion
[516]
of roots and leaves, and children and parentsâŠ
[518]
Anything can point to anything!
[520]
So thatâs a whirlwind overview of pretty
much all of the fundamental data structures
[523]
used in computer science.
[525]
On top of these basic building blocks, programmers
have built all sorts of clever variants, with
[529]
slightly different properties -- data structures
like red-black trees and heaps, which we donât
[533]
have time to cover.
[534]
These different data structures have properties
that are useful for particular computations.
[539]
The right choice of data structure can make
your job a lot easier, so it pays off to think
[542]
about how you want to structure your data
before you jump in.
[545]
Fortunately, most programming languages come with libraries packed full of ready-made data structures.
[550]
For example, C++ has its Standard Template
Library, and Java has the Java Class Library.
[555]
These mean programmers donât have to waste
time implementing things from scratch, and
[559]
can instead wield the power of data structures
to do more interesting things, once again
[564]
allowing us to operate at a new level of abstraction!
[567]
Iâll see you next week.
You can go back to the homepage right here: Homepage





