🔍

Data Structures: Crash Course Computer Science #14 - YouTube

Channel: CrashCourse

[3]

Hi, I'm Carrie Anne, and welcome to Crash Course Computer Science!

[5]

Last episode, we discussed a few example classic algorithms, like sorting a list of numbers

[9]

and finding the shortest path in a graph.

[11]

What we didn’t talk much about, is how the data the algorithms ran on was stored in computer

[16]

memory.

[16]

You don’t want your data to be like John Green’s college dorm room, with food, clothing

[19]

and papers strewn everywhere.

[21]

Instead, we want our data to be structured, so that it’s organized, allowing things

[25]

to be easily retrieved and read.

[27]

For this, computer scientists use Data Structures!

[29]

INTRO

[39]

We already introduced one basic data structure last episode, Arrays, also called lists or

[43]

Vectors in some languages.

[45]

These are a series of values stored in memory.

[47]

So instead of just a single value being saved into a variable, like ‘j equals 5’, we

[51]

can define a whole series of numbers, and save that into an array variable.

[55]

To be able to find a particular value in this array, we have to specify an index.

[59]

Almost all programing languages start arrays at index 0, and use a square bracket syntax

[64]

to denote array access.

[65]

So, for example, if we want to add the values in the first and third spots of our array

[70]

‘j’, and save that into a variable ‘a’, we would write a line of code like this.

[74]

How an array is stored in memory is pretty straightforward.

[76]

For simplicity, let’s say that the compiler chose to store ours at memory location 1,000.

[81]

The array contains 7 numbers, and these are stored one after another in memory, as seen here.

[86]

So when we write “j index of 0”, the computer goes to memory location 1,000, with an offset

[91]

of 0, and we get the value 5.

[93]

If we wanted to retrieve “j index of 5”, our program goes to memory location 1000,

[98]

plus an offset of 5, which in this case, holds a value of 4.

[102]

It’s easy to confuse the fifth number in the array with the number at index 5.

[105]

They are not the same.

[106]

Remember, the number at index 5 is the 6th number in the array because the first number

[110]

is at index 0.

[111]

Arrays are extremely versatile data structures, used all the time, and so there are many functions

[115]

that can handle them to do useful things.

[117]

For example, pretty much every programming language comes with a built-in sort function,

[121]

where you just pass in your array, and it comes back sorted.

[124]

So there’s no need to write that algorithm from scratch.

[126]

Very closely related are Strings, which are just arrays of characters, like letters, numbers,

[131]

punctuation and other written symbols.

[133]

We talked about how computers store characters way back in Episode 4.

[136]

Most often, to save a string into memory, you just put it in quotes, like so.

[140]

Although it doesn’t look like an array, it is.

[143]

Behind the scenes, the memory looks like this.

[144]

Note that the string ends with a zero in memory.

[147]

It’s not the character zero, but the binary value 0.

[151]

This is called the null character, and denotes the end of the string in memory.

[154]

This is important because if I call a function like “print quote”, which writes the string

[158]

to the screen, it prints out each character in turn starting at the first memory location,

[162]

but it needs to know when to stop!

[164]

Otherwise, it would print out every single thing in memory as text.

[167]

The zero tells string functions when to stop.

[170]

Because computers work with text so often, there are many functions that specifically

[174]

handle strings.

[175]

For example, many programming languages have a string concatenation function, or “strcat”,

[179]

which takes in two strings, and copies the second one to the end of the first.

[183]

We can use arrays for making one dimensional lists, but sometimes you want to manipulate

[187]

data that is two dimensional, like a grid of numbers in a spreadsheet, or the pixels

[191]

on your computer screen.

[192]

For this, we need a Matrix.

[194]

You can think of a Matrix as an array of arrays!

[197]

So a 3 by 3 matrix is really 2 an array of size 3, with each index storing an array of

[201]

size 3.

[203]

We can initialize a matrix like so.

[204]

In memory, this is packed together in order like this.

[207]

To access a value, you need to specify two indexes, like “J index of 2, then index

[212]

of 1” - this tells the computer you’re looking for the item in subarray 2 at position 1.

[217]

And this would give us the value 12.

[219]

The cool thing about matrices is we’re not limited to 3 by 3 -- we can make them any

[222]

size we want -- and we can also make them any number of dimensions we want.

[226]

For example, we can create a five dimensional matrix and access it like this.

[230]

That’s right, you now know how to access a five dimensional matrix -- tell your friends!

[234]

So far, we’ve been storing individual numbers or letters into our arrays or matrices.

[239]

But often it’s useful to store a block of related variables together.

[242]

Like, you might want to store a bank account number along with its balance.

[245]

Groups of variables like these can be bundled together into a Struct.

[249]

Now we can create variables that aren’t just single numbers, but are compound data

[252]

structures, able to store several pieces of data at once.

[256]

We can even make arrays of structs that we define, which are automatically bundled together

[260]

in memory.

[261]

If we access, for example, J index of 0, we get back the whole struct stored there, and

[265]

we can pull the specific account number and balance data we want.

[268]

This array of structs, like any other array, gets created at a fixed size that can’t

[272]

be enlarged to add more items.

[274]

Also, arrays must be stored in order in memory, making it hard to add a new item to the middle.

[278]

But, the struct data structure can be used for building more complicated data structures

[282]

that avoid these restrictions.

[284]

Let’s take a look at this struct that’s called a “node”.

[286]

It stores a variable, like a number, and also a pointer.

[289]

A pointer is a special variable that points, hence the name, to a location in memory.

[293]

Using this struct, we can create a linked list, which is a flexible data structure that

[297]

can store many nodes.

[299]

It does this by having each node point to the next node in the list.

[302]

Let’s imagine we have three node structs saved in memory, at locations 1000, 1002 and 1008.

[309]

They might be spaced apart, because they were created at different times, and other data

[313]

can sit between them.

[314]

So, you see that the first node contains the value 7, and the location 1008 in its “next”

[319]

pointer.

[320]

This means that the next node in the linked list is located at memory location 1008.

[324]

Looking down the linked list, to the next node, we see it stores the value 112 and points

[328]

to another node at location 1002.

[331]

If we follow that, we find a node that contains the value 14 and points back to the first

[336]

node at location 1000.

[337]

So this linked list happened to be circular, but it could also have been terminated by

[341]

using a next pointer value of 0 -- the null value -- which would indicate we’ve reached

[345]

the end of the list.

[347]

When programmers use linked lists, they rarely look at the memory values stored in the next

[350]

pointers.

[351]

Instead, they can use an abstraction of a linked list, that looks like this, which is

[355]

much easier to conceptualize.

[356]

Unlike an array, whose size has to be pre-defined, linked lists can be dynamically extended or

[361]

shortened.

[362]

For example, we can allocate a new node in memory, and insert it into this list, just

[366]

by changing the next pointers.

[367]

Linked Lists can also easily be re-ordered, trimmed, split, reversed, and so on.

[372]

Which is pretty nifty!

[373]

And pretty useful for algorithms like sorting, which we talked about last week.

[376]

Owing to this flexibility, many more-complex data structures are built on top of linked lists

[381]

The most famous and universal are queues and stacks.

[383]

A queue – like the line at your post office – goes in order of arrival.

[387]

The person who has been waiting the longest, gets served first.

[390]

No matter how frustrating it is that all you want to do is buy stamps and the person in

[393]

front of you seems to be mailing 23 packages.

[396]

But, regardless, this behavior is called First-In First-Out, or FIFO.

[399]

That’s the first part.

[401]

Not the 23 packages thing.

[402]

Imagine we have a pointer, named “post office queue”, that points to the first node in

[406]

our linked list.

[407]

Once we’re done serving Hank, we can read Hank’s next pointer, and update our “post

[411]

office queue” pointer to the next person in the line.

[414]

We’ve successfully dequeued Hank -- he’s gone, done, finished.

[418]

If we want to enqueue someone, that is, add them to the line, we have to traverse down

[422]

the linked list until we hit the end, and then change that next pointer to point to

[425]

the new person.

[426]

With just a small change, we can use linked lists as stacks, which are LIFO…

[430]

Last-In First-Out.

[431]

You can think of this like a stack of pancakes... as you make them, you add them to the top

[435]

of stack.

[436]

And when you want to eat one, you take them from the top of the stack.

[439]

Delicious!

[440]

Instead of enqueueing and dequeuing, data is pushed onto the stack and popped from the stacks.

[445]

Yep, those are the official terms!

[447]

If we update our node struct to contain not just one, but two pointers, we can build trees,

[452]

another data structure that’s used in many algorithms.

[454]

Again, programmers rarely look at the values of these pointers, and instead conceptualize

[458]

trees like this: The top most node is called the root.

[462]

And any nodes that hang from other nodes are called children nodes.

[465]

As you might expect, nodes above children are called parent nodes.

[468]

Does this example imply that Thomas Jefferson is the parent of Aaron Burr?

[471]

I’ll leave that to your fanfiction to decide.

[474]

And finally, any nodes that have no children -- where the tree ends -- are called Leaf Nodes.

[478]

In our example, nodes can have up to two children, and for that reason, this particular data

[483]

structure is called a binary tree.

[484]

But you could just as easily have trees with three, four or any number of children by modifying

[489]

the data structure accordingly.

[490]

You can even have tree nodes that use linked lists to store all the nodes they point to.

[495]

An important property of trees – both in real life and in data structures – is that

[498]

there’s a one-way path from roots to leaves.

[500]

It’d be weird if roots connected to leaves, that connected to roots.

[503]

For data that links arbitrarily, that include things like loops, we can use a graph data

[508]

structure instead.

[509]

Remember our graph from last episode of cities connected by roads?

[512]

This can be stored as nodes with many pointers, very much like a tree, but there is no notion

[516]

of roots and leaves, and children and parents…

[518]

Anything can point to anything!

[520]

So that’s a whirlwind overview of pretty much all of the fundamental data structures

[523]

used in computer science.

[525]

On top of these basic building blocks, programmers have built all sorts of clever variants, with

[529]

slightly different properties -- data structures like red-black trees and heaps, which we don’t

[533]

have time to cover.

[534]

These different data structures have properties that are useful for particular computations.

[539]

The right choice of data structure can make your job a lot easier, so it pays off to think

[542]

about how you want to structure your data before you jump in.

[545]

Fortunately, most programming languages come with libraries packed full of ready-made data structures.

[550]

For example, C++ has its Standard Template Library, and Java has the Java Class Library.

[555]

These mean programmers don’t have to waste time implementing things from scratch, and

[559]

can instead wield the power of data structures to do more interesting things, once again

[564]

allowing us to operate at a new level of abstraction!

[567]

I’ll see you next week.

Most Recent Videos:

WE KILLED 6 HEROIC BOSSES! - YouTube

¿Quién inventó el dinero? - YouTube

Cuándo se inventó el dinero y cómo el dólar se convirtió en la principal moneda del mundo - YouTube

This Citizenship Program is Failing - YouTube

Candida Treatment Protocol w/ Dr. DiNezza - YouTube

$500M investor reacts to Real Estate Tik Toks 2 - YouTube

You can go back to the homepage right here: Homepage