Our data is GONE... Again - Petabyte Project Recovery Part 1 - YouTube
[0]
- When I say we store
and handle a lotta data
[2]
for a YouTube channel, I mean it.
[5]
I mean, we've built some sick
[6]
hundred plus terabyte servers
[8]
for some of our fellow YouTubers,
[10]
but those are nothing
[11]
compared to the two plus
petabytes of archival storage
[16]
that we currently have in
production in our server room
[18]
that is storing all the footage
[20]
for every video we have
ever made, at full quality.
[25]
For the uninitiated,
[26]
that is over 11,000 Warzone
installs' worth of data.
[31]
But with great power comes
great responsibility,
[34]
and we weren't responsible.
[38]
Despite our super dope hardware,
we made a little oopsie
[43]
that resulted in us
permanently losing data
[45]
that we don't have any backup for.
[48]
We still don't know how much,
[49]
but what we do know is what went wrong
[52]
and we've got a plan
to recover what we can,
[54]
but it is going to take
some work, and some money,
[58]
thanks to our sponsor, Hetzner.
[60]
Hetzner offers
high-performance Cloud Servers
[62]
for an amazing price.
[64]
With their new US location
in Ashburn, Virginia,
[66]
you can deploy cloud servers
in four different locations
[69]
and benefit from features
like Load Balancers,
[70]
Block Storage and more.
[72]
Use code LTT22 at the
link below for $20 off.
[79]
(upbeat music)
[85]
Let's start with a bit of
background on our servers.
[87]
Our archival storage is composed
[89]
of two discrete GlusterFS clusters.
[92]
Both of them spread across two
45Drives Storinator servers,
[96]
each with 60 hard drives.
[99]
The original petabyte project,
[101]
is made up of the Delta
1 and Delta 2 servers,
[104]
and goes by the moniker Old Vault.
[107]
Petabyte project two, or the New Vault
[109]
is Delta 3 and Delta 4.
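For reference, a two-server GlusterFS cluster like this is normally stitched together from one "brick" per machine; the volume name and brick paths below are made-up placeholders rather than our real layout.

    # Hypothetical sketch of how a cluster like Old Vault gets assembled,
    # assuming each Delta server exposes its local storage as one brick.
    gluster peer probe delta2
    gluster volume create oldvault delta1:/tank/brick delta2:/tank/brick
    gluster volume start oldvault
    # Clients then mount one combined namespace:
    mount -t glusterfs delta1:/oldvault /mnt/oldvault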
[112]
Now, because of the nature of our content,
[113]
most of our employees
are pretty tech literate
[117]
with many of them even falling
[118]
into the tech wizard category.
[120]
So, we've always had
substantially lower need
[122]
for tech support than the average company.
[125]
And as a result, we have never
hired a full-time IT person,
[129]
despite the handful of times,
perhaps including this one,
[132]
that it probably would have been helpful.
[134]
So, in the early days, I
managed the infrastructure,
[138]
and since then I've had some
help from both outside sources,
[144]
and other members of the writing team.
[149]
We all have different strengths,
[151]
but what we all have in common
[153]
is that we have other jobs to do,
[155]
meaning that it's never really been clear
[158]
who exactly is supposed to be accountable
[161]
when something slips through the cracks.
[163]
And unfortunately, while obvious issues
[166]
like, a replacement power cable
[168]
and a handful of failed
drives over the years
[171]
were handled by Anthony,
[172]
we never really tasked anyone
[174]
with performing preventative maintenance
[176]
on our precious petabyte servers.
[179]
A quick point of clarification
[180]
before we get into the rest of this.
[182]
Nothing that happened was
the result of anything
[184]
other than us messing up.
[186]
The hardware, both from
45Drives and from Seagate
[190]
who provided the bulk of what makes up
[191]
our petabyte project servers,
[193]
has performed beyond our expectations
[196]
and we would recommend
checking out both of them,
[198]
if you or your business has
serious data storage needs.
[201]
We're gonna have links to them down below.
[203]
But even the best hardware
in the world can be let down
[207]
by misconfigured software.
[209]
And Jake, who tasked himself
[211]
with auditing our current infrastructure,
[213]
found just such a thing.
[215]
Everything was actually going pretty well.
[217]
He was setting up monitoring and alerts,
[219]
verifying that every machine
would gracefully shut down
[222]
when the power goes out,
[223]
which happens a lot here for some reason,
[225]
but he eventually worked his way around
[227]
to the petabyte project servers
[228]
and checked the status of
the ZFS pools or Zpools
[232]
on each of them.
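Checking that status is just a couple of read-only commands; this is a minimal sketch and makes no assumptions about our actual pool names.

    # Overall pool health, any faulted drives, and files with unrecoverable errors
    zpool status -v
    # One-line summary per pool
    zpool list -o name,size,health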
[233]
And this is where the caca hit the fan.
[236]
Right off the bat, Delta 1 had
two of its 60 drives faulted
[240]
in the same Vdev.
[242]
And you can think of a Vdev,
[243]
kind of like its own mini RAID array
[246]
within a larger pool of
multiple RAID arrays.
[250]
So, in our configuration
where we're running RAID-Z2,
[253]
if another disk out of our 15-drive Vdev
[256]
was to have any kind of problem,
[258]
we would incur irrecoverable data loss.
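To make that concrete, here is a minimal sketch of a pool built from 15-drive RAID-Z2 vdevs like the ones described; the pool and device names are invented.

    # Each raidz2 group below is one vdev and can survive losing any 2 of its 15 drives.
    # A third failure inside the SAME vdev means unrecoverable data loss for the whole pool.
    zpool create vault \
      raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm sdn sdo \
      raidz2 sdp sdq sdr sds sdt sdu sdv sdw sdx sdy sdz sdaa sdab sdac sdad
    # ...two more 15-drive raidz2 vdevs would round this out to 60 drives.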
[261]
Upon further inspection,
both of the drives
[263]
were completely dead,
[265]
which does happen with mechanical devices
[267]
and had dropped from the system.
[269]
So, we replaced them and let
the array start rebuilding.
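Swapping a dead disk is a one-liner per drive, and ZFS calls the rebuild a resilver; the pool and device names below are placeholders.

    # Rebuild onto the new drive; resilvering starts automatically
    zpool replace vault sdf /dev/disk/by-id/NEW-DRIVE
    # Watch resilver progress and remaining errors
    zpool status vault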
[272]
That's pretty scary, but not
in and of itself a lost cause.
[277]
More on that later though.
[278]
Far scarier was when Delta 3,
[280]
which is part of the New Vault cluster
[283]
had five drives in a faulted state
[286]
with two of the Vdevs
having two drives down.
[289]
That's very dangerous.
[292]
Interestingly, these drives
weren't actually dead,
[296]
instead, they had just faulted
[297]
due to having too many errors.
[300]
So, read and write errors like this
[302]
are usually caused by a
faulty cable or a connection,
[305]
but they can also be the
sign of a dying drive.
[308]
In our case, these errors
probably cropped up
[310]
due to a sudden power loss
[311]
or due to naturally occurring bit rot,
[314]
as they were never configured
[315]
to shut down nicely while on backup power,
[317]
in the case of an outage.
[318]
And we've had quite a few
of those over the years.
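The usual fix is a UPS daemon that triggers a clean shutdown on battery. Here is a minimal sketch using apcupsd as an example, which may or may not match the gear in our rack, with illustrative values only.

    # /etc/apcupsd/apcupsd.conf (illustrative values only)
    UPSCABLE usb
    UPSTYPE usb
    BATTERYLEVEL 20   # shut down once 20% battery remains
    MINUTES 5         # ...or once ~5 minutes of runtime are left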
[321]
Now, storage systems are usually designed
[324]
to be able to recover from such an event,
[326]
especially ZFS, which is known for being
[328]
one of the most resilient ones out there.
[330]
After booting back up from a power loss,
[332]
ZFS pools and most other RAID
or RAID-like storage arrays,
[336]
should do something called
a scrub or a re-sync,
[339]
which in the case of ZFS means
[341]
that every block of data gets checked
[343]
to ensure that there are no errors.
[345]
And if there are any errors,
[346]
these errors are automatically fixed
[348]
with the parity data that
is stored in the array.
[351]
On most NAS operating systems,
[353]
like TrueNAS, Unraid or any pre-built NAS,
[357]
this process should just
happen automatically.
[359]
And even if nothing goes wrong,
[361]
they should also run a scheduled
scrub every month or so.
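On a hand-rolled setup, both of those have to be wired up manually; a minimal sketch, with an invented pool name, looks something like this.

    # Check every block in the pool against its checksums and repair from parity
    zpool scrub vault
    # See progress plus repaired and unrepairable error counts
    zpool status vault
    # And a monthly schedule, e.g. in /etc/cron.d/zfs-scrub:
    #   0 3 1 * * root /usr/sbin/zpool scrub vault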
[364]
But our servers were set up by
us a long time ago on CentOS
[370]
and never updated.
[371]
So, neither a scheduled nor
a power on recovery scrub
[376]
was ever configured.
[377]
Meaning the only time data integrity
[379]
would have been checked on these arrays,
[381]
is when a block of data got read.
[384]
This function should theoretically
protect against bit rot,
[387]
but since we have thousands of old videos,
[390]
of which a very, very small portion
[393]
ever actually gets accessed,
[395]
the rest were essentially
left to slowly rot
[398]
and power-loss themselves
into an unrecoverable mess.
[402]
When we found the drive issues,
[404]
we weren't even aware of all this yet.
[405]
And even though the five drives
weren't technically dead,
[408]
we erred on the side of caution
[410]
and started a replacement
operation on all of them.
[413]
It was while we were
rebuilding the array on Delta 3
[415]
with the new disks,
[416]
that we started to uncover the
absolute mess of data errors.
[421]
ZFS has reported around 169 million errors
[425]
at the time of recording this.
[427]
And no, it's not nice.
[429]
In fact, there are so
many errors on Delta 3
[432]
that with two faulted drives
in both of the first two Vdevs,
[435]
there is not enough parity
data to fix the errors.
[439]
And this caused the
array to offline itself
[441]
to protect against further degradation.
[444]
And unfortunately, much
further along in the process,
[447]
the same thing happened on Delta 1.
[450]
That means that both the original
and new petabyte projects,
[454]
Old and New Vault, have suffered
nonrecoverable data loss.
[461]
So, now what do we do?
[463]
In regards to the corrupted and
lost data, honestly nothing.
[467]
I mean, it's very likely
[468]
that even with 169 million data errors,
[471]
we still have virtually
[472]
all of the original bits
in the right places.
[476]
But as far as we know,
[477]
there's no way to just tell ZFS,
[480]
"Yo dawg! Ignore those errors, you know,
[482]
"Pretend like they never happened,
[484]
"tow easy ZFS" or something.
[486]
Instead then, the plan is to build a new
[489]
properly configured 1.2 petabyte server,
[492]
featuring Seagate's shiny
new 20 terabyte drives,
[495]
which we're really excited about like,
[496]
these things are almost as shiny
[498]
as our reflective hard
drive shirt, lttstore.com.
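The 1.2 petabyte figure lines up with a 60-bay chassis full of 20 terabyte drives, though that exact drive count is an assumption here.

    # 60 drives x 20 TB = 1,200 TB raw, i.e. ~1.2 PB before RAID-Z2 parity overhead
    echo $((60 * 20))   # 1200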
[502]
And once that's complete,
[503]
we intend to move all of the data
[505]
from the New Vault cluster
onto this New, New Vault.
[508]
- [Jake] All three.
[509]
- New New Vault.
[511]
Then we'll reset up New Vault,
[514]
ensure all the drives are good
[515]
and repeat the process to
move Old Vault's data onto it.
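The copy tool isn't named on camera, but a migration like this is commonly just rsync between the two mounted volumes, since it can be stopped and resumed over weeks of copying; the mount points below are placeholders.

    # Copy everything from New Vault to the new server, preserving metadata,
    # and skip anything that already made it across on a previous run
    rsync -aHAX --partial --info=progress2 /mnt/newvault/ /mnt/newnewvault/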
[519]
Then we can reformat Old
Vault, probably upgrade it a bit
[523]
and use it for new data.
[524]
Maybe we'll rename it
to New, New, New Vault.
[527]
Get subscribed, so, you
don't miss any of that.
[529]
We'll hopefully be building
that new server this week.
[532]
Now, if everything were set up properly
[534]
with regularly scheduled
and post power loss scrubs,
[537]
this entire problem would
probably have never happened.
[541]
And if we had a backup of that data,
[543]
we would be able to
simply restore from that.
[545]
But here's the thing, backing
up over a petabyte of data
[549]
is really expensive.
[551]
Either we would need to build
a duplicate server array
[554]
to backup to, or we could
back up to the cloud.
[557]
But even using the economical
option, Backblaze B2,
[560]
it would cost us somewhere between
[561]
5,000 and 10,000 US dollars per month,
[566]
to store that kind of data.
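That range checks out against Backblaze B2's list price of roughly $5 per terabyte per month at the time; treat the exact rate as an assumption and check current pricing.

    # ~$5/TB/month x 1,000 TB (1 PB) = ~$5,000/month
    # ~$5/TB/month x 2,000 TB (2 PB) = ~$10,000/month
    echo $((5 * 1000)) $((5 * 2000))   # 5000 10000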
[567]
Now, if it was mission critical,
[569]
then by all means it
should have been backed up
[571]
in both of those ways,
[573]
but having all of our archival footage
[575]
from day one of the channel
[576]
has always been a nice to have
[579]
and an excuse for us to
explore really cool tech
[582]
that we otherwise wouldn't
have any reason to play with.
[584]
I mean, it takes a little bit more effort
[586]
and it yields lower quality results,
[588]
but we have a backup of
all of our old videos.
[591]
It's called downloading
them off of YouTube
[593]
or Floatplane, if we wanted
a higher quality copy.
[596]
So, the good news is that
our production Whonnock server
[599]
is running great
[601]
With proper backups configured,
[602]
and this isn't gonna have
any kind of lasting effect
[604]
on our business,
[606]
but I am still hopeful that
[607]
if all goes well with
the recovery efforts,
[609]
we'll be able to get back
the majority of the data,
[611]
mostly error free.
[613]
But only time will tell, a lot of time
[616]
because transferring all
those petabytes of data
[618]
off of hard drives to other hard drives,
[620]
is gonna take weeks or even months.
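Back-of-the-envelope, assuming a sustained ~1 GB/s end to end, which is optimistic for spinning disks over a network:

    # 1 PB ~= 1,000,000 GB; at 1 GB/s that's ~1,000,000 seconds
    # 1,000,000 / 86,400 seconds per day ~= 11.6 days per petabyte, best case
    # At a few hundred MB/s sustained, that stretches to a month or more per petabyte
    echo "scale=1; 1000000 / 86400" | bc   # ~11.5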
[623]
So, let this be a lesson,
[624]
follow proper storage
practices, have a backup
[628]
and probably hire someone
to take care of your data
[631]
if you don't have the time.
[632]
Especially if you measure
it in anything other
[633]
than tens of terabytes,
[636]
or you might lose all of it.
[638]
But you won't lose our sponsor, Lambda.
[641]
Are you training deep learning models
[642]
for the next big breakthrough
in artificial intelligence?
[644]
Then you should know about Lambda,
[646]
the deep learning company.
[647]
Founded by machine learning engineers,
[649]
Lambda builds GPU workstations, servers,
[651]
and cloud infrastructure for
creating deep learning models.
[654]
They've helped all five
of the big tech companies
[656]
and 47 of the top 50 research universities
[659]
accelerate their machine
learning workflows.
[661]
Lambda's easy to use
configurators let you spec out
[663]
exactly the hardware you need
[665]
from GPU laptops and workstations
[667]
all the way up to custom server clusters
[669]
and all Lambda machines come pre-installed
[671]
with Lambda Stack,
[672]
keeping your Linux machine
learning environment up to date
[675]
and out of dependency hell.
[676]
And with Lambda Cloud,
[677]
you can spin up a virtual
machine in minutes,
[679]
train models with 4 NVIDIA A6000s,
[682]
at just a fraction of the cost
of the big cloud providers.
[685]
So, go to Lambdalabs.com/linus
[687]
to configure your own workstation
[688]
or try out Lambda Cloud today.
[690]
If you liked this video,
[691]
maybe check out the time I almost lost
[693]
all of our active projects
when the OG Whonnock server failed.
[698]
That was a far more stressful situation.
[701]
I'm actually pretty relaxed right now
[705]
for someone with this
much data on the line.
[707]
- [Jake] Yeah, must be nice.
[708]
- Yeah, I'm doing okay, thanks for asking.
[712]
I mean, I'd prefer to get
it back, you know.(chuckles)