Gentoo Forums
Data Archival Options
Gentoo Forums Forum Index: Off the Wall
R0b0t1
Apprentice
Apprentice


Joined: 05 Jun 2008
Posts: 255

PostPosted: Tue Apr 24, 2018 6:00 pm    Post subject: Data Archival Options Reply with quote

Hello,

What would be the most economical option for data archival? I wanted to use tape, but the better option, LTO, seems to require a $5k investment in a drive, though thereafter ~10TB of capacity is $20-30. There is another brand where the drive is a few hundred dollars, but so are the tapes.

The other option I had considered is Blu-ray discs, which will do 100GB (for triple-layer discs; not many people sell quadruple-layer discs yet) for about $5/ea. This is not nearly as good, but only requires a $60 drive.
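
To make the trade-off between a high up-front drive cost and cheap media concrete, here is a rough break-even sketch. It assumes total cost is simply drive price plus a linear per-TB media price, takes $25 per 10TB as the midpoint of the quoted $20-30, and ignores redundancy copies, shipping and media failure. The function name `break_even_tb` is made up for illustration.

```python
def break_even_tb(fixed_a, per_tb_a, fixed_b, per_tb_b):
    """Return the capacity in TB at which option A's total cost equals option B's.

    Total cost is modelled as fixed + per_tb * capacity. Returns None when
    the per-TB costs are equal, since the two cost lines never cross.
    """
    if per_tb_a == per_tb_b:
        return None
    return (fixed_a - fixed_b) / (per_tb_b - per_tb_a)


# Figures from the post; the $25 midpoint for a ~10TB tape is an assumption.
LTO_DRIVE = 5000.0
LTO_PER_TB = 25.0 / 10.0
BD_DRIVE = 60.0
BD_PER_TB = 5.0 * 10  # ten 100GB discs per TB at $5 each

capacity = break_even_tb(LTO_DRIVE, LTO_PER_TB, BD_DRIVE, BD_PER_TB)
print(f"LTO becomes cheaper than Blu-ray above roughly {capacity:.0f} TB")
```

With these inputs the crossover lands at about 104 TB, so under this simple model the expensive drive only pays for itself for fairly large archives.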

Other options? Any recommended hardware?
Naib
Watchman
Watchman


Joined: 21 May 2004
Posts: 5401
Location: Removed by Neddy

PostPosted: Tue Apr 24, 2018 6:00 pm    Post subject: Reply with quote

NSA or Facebook
_________________
The best argument against democracy is a five-minute conversation with the average voter
Great Britain is a republic, with a hereditary president, while the United States is a monarchy with an elective king
Jaglover
Watchman
Watchman


Joined: 29 May 2005
Posts: 6395
Location: Saint Amant, Acadiana

PostPosted: Tue Apr 24, 2018 6:13 pm    Post subject: Reply with quote

I didn't even know there were archive-grade Blu-ray discs.
_________________
Please learn how to denote units correctly!
R0b0t1
Apprentice
Apprentice


Joined: 05 Jun 2008
Posts: 255

PostPosted: Tue Apr 24, 2018 6:33 pm    Post subject: Reply with quote

Jaglover wrote:
I didn't even know there were archive-grade Blu-ray discs.
Well, you're supposed to store them in a low-moisture and low-oxygen environment, but other archival media come with similar recommendations.

The problem with Blu-ray is that it's not cost effective. It is cheaper to buy >1TB drives and migrate data between them as fast as they die than it is to export 1TB to Blu-ray discs. I'm hoping there is something else, like cheaper clone LTO drives that accept the cheap LTO tapes.

The goal is for this to be local. Paying someone else to archive my data can get expensive very quickly.
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Tue Apr 24, 2018 7:22 pm    Post subject: Reply with quote

I've been running a five-disk btrfs array for quite a long time now. Since btrfs supports snapshots, it kind of works as a backup too. That said, I still have separate hard drive(s) for real backups, although I haven't pulled anything from those backups for years. Hard drives fail from time to time; I have just replaced them with new (usually bigger) ones.
But this LTO thing... Really nice.
I think I would use LTO as my final backup if the dang drives were cheaper. The cheapest here (a 5.25" half-height) costs around 1700€. Bummer. :(

I think the best removable backup is an HDD dock and some rubber cases for storing the hard drives.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...


Last edited by Zucca on Wed Apr 25, 2018 3:40 pm; edited 1 time in total
Bones McCracker
Veteran
Veteran


Joined: 14 Mar 2006
Posts: 1605
Location: U.S.A.

PostPosted: Tue Apr 24, 2018 10:28 pm    Post subject: Reply with quote

He means archive, as in aaarrr-cchhhiiiveee, not your garden variety backup good for a couple years.

The best option is to use file steganography (e.g. steghide) to embed your data into the frames of video clips and upload them to pornhub. The files will be stored in massive parallel and re-copied in hyper-redundancy for at least a decade, and no one but you will be the wiser.
_________________
patrix_neo wrote:
The human thought: I cannot win.
The ratbrain in me : I can only go forward and that's it.
steveL
Watchman
Watchman


Joined: 13 Sep 2006
Posts: 5099
Location: The Peanut Gallery

PostPosted: Wed Apr 25, 2018 12:58 pm    Post subject: Re: Data Archival Options Reply with quote

R0b0t1 wrote:
What would be the most economical option for data archival? I wanted to use tape, but the better option, LTO, seems to require a $5k investment in a drive, though thereafter ~10TB of capacity is $20-30. There is another brand where the drive is a few hundred dollars, but so are the tapes.

The other option I had considered is Blu-ray discs, which will do 100GB (for triple-layer discs; not many people sell quadruple-layer discs yet) for about $5/ea. This is not nearly as good, but only requires a $60 drive.

Other options? Any recommended hardware?
I'd use a combination of optical disks and hard-drives.

Whatever you use, you must keep rereading and checking the content, and preferably keep 5 copies.

For hard-drives, expect to change every 5 years, and have spares ready upfront; you also want to keep rewriting those on a periodic basis, along with the usual checks.
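
The "keep rereading and checking" step could be done with a checksum manifest, roughly as in the sketch below. This is only one way to do it, assuming SHA-256 is an acceptable digest and that the manifest itself is stored safely alongside each copy; the helper names `build_manifest` and `verify` are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once when the archive is written, then run `verify` against every copy on each periodic check; any path it returns should be restored from one of the other copies.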

There was a distributed git backup thing, that looked really nice.

Sorry no options on hardware (wait for a sysadmin) and really sorry you ended up in OTW. ;)
cokey
Advocate
Advocate


Joined: 23 Apr 2004
Posts: 3343

PostPosted: Wed Apr 25, 2018 2:49 pm    Post subject: Reply with quote

Try Amazon Glacier
_________________
"Sex: breakfast of champions" - James Hunt
John-Boy
Guru
Guru


Joined: 23 Jun 2004
Posts: 439
Location: Desperately seeking moksha in all the wrong places

PostPosted: Wed Apr 25, 2018 3:57 pm    Post subject: Reply with quote

Low density floppies
_________________
Like the Roman, I seem to see "the River Tiber foaming with much blood"
szatox
Veteran
Veteran


Joined: 27 Aug 2013
Posts: 1682

PostPosted: Wed Apr 25, 2018 8:10 pm    Post subject: Reply with quote

cokey wrote:
Try Amazon Glacier

Disks are cheaper. Particularly when it comes down to keeping that data for a long time, which is what people usually mean by "archiving". Or if you actually want to restore that data.
I'd also favour drives over tapes, because tapes stretch and tear, and they suck at random IO, so you _can't_ really make a Redundant Array of Tape Drives, which in turn means you need more copies to provide the same reliability.
A pretty decent setup for disk backup is 2 servers with raid6 and a hot spare. You recover from random failures by replacing drives, and in case of a common-cause failure you have another copy. In a low-duty scenario, you may want to power off those machines when not in use.

Weight is a pretty serious downside of hard drives. However, random access enables incremental backups (and strong compression), so you will probably need significantly less raw capacity than with tapes, bringing down the price of effective capacity.
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Wed Apr 25, 2018 8:46 pm    Post subject: Reply with quote

I see old SCSI LTO drives are very cheap used. The maximum tape capacity is probably around 100-500GB. :(
Those would still beat BluRay.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Wed Apr 25, 2018 8:47 pm    Post subject: Reply with quote

John-Boy wrote:
Low density floppies
Go play with your mercury delay lines, boy.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
Bones McCracker
Veteran
Veteran


Joined: 14 Mar 2006
Posts: 1605
Location: U.S.A.

PostPosted: Wed Apr 25, 2018 10:31 pm    Post subject: Reply with quote

cokey wrote:
Try Amazon Glacier

So, with that reliability rate, most interpretations would equate to guaranteed data corruption in less than a year. Sounds like a bullshit made-up number. Also, who would archive their data in the US, where the next Clinton or Obama could send the tax Nazis to nit-pick through your laundry?
_________________
patrix_neo wrote:
The human thought: I cannot win.
The ratbrain in me : I can only go forward and that's it.
Akkara
Administrator
Administrator


Joined: 28 Mar 2006
Posts: 6494
Location: &akkara

PostPosted: Wed Apr 25, 2018 11:10 pm    Post subject: Reply with quote

How much data volume are you looking at?

If it is more than a TB or two, make sure the server(s) used have ECC memory. Previously, using regular consumer motherboards, I'd see an unexplained bit-flip every 10TB or so. The corruptions were not caught by the redundant storage, so must have happened in transit, somewhere between reading it from A and writing it to B. These problems went away after moving to server-grade hardware with ECC.
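
A copy step that catches this kind of in-transit corruption might look like the sketch below: copy the file, re-read both sides, and compare digests. It is an illustration, not a replacement for ECC. Without ECC the hashing itself passes through the same memory, and the re-read of the destination may be served from the page cache rather than from the disk. The names `file_digest` and `copy_and_verify` are hypothetical.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read both files and compare their digests.

    Raises OSError when the digests differ, so a corrupted copy is never
    silently accepted.
    """
    shutil.copyfile(src, dst)
    src_hash = file_digest(src)
    dst_hash = file_digest(dst)
    if src_hash != dst_hash:
        raise OSError(f"copy of {src} to {dst} does not match the source")
    return dst_hash
```

Keeping the returned digest next to the backup also gives later integrity checks a reference value.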

I use regular hard-drives and a hot-swap rack. As the main arrays get upgraded, the previous drives are re-purposed for backup use. Eventually the oldest of those get retired.

General Request: please keep the politics and the snark to a minimum on this thread. It is a legitimate question and I want to see it get the well-considered thought it deserves. There are plenty of other threads here where you all can air out that nonsense.
_________________
Humility means having to eat less crow when you are shown to be wrong.
Bones McCracker
Veteran
Veteran


Joined: 14 Mar 2006
Posts: 1605
Location: U.S.A.

PostPosted: Wed Apr 25, 2018 11:55 pm    Post subject: Reply with quote

ECC itself offers a range of possibilities, some of which might make more sense than usual in such a setup (such as rank sparing, mirroring, multibit, etc.).
_________________
patrix_neo wrote:
The human thought: I cannot win.
The ratbrain in me : I can only go forward and that's it.
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Thu Apr 26, 2018 9:11 am    Post subject: Reply with quote

Akkara wrote:
If it is more than a TB or two, make sure the server(s) used have ECC memory. Previously, using regular consumer motherboards, I'd see an unexplained bit-flip every 10TB or so. The corruptions were not caught by the redundant storage, so must have happened in transit, somewhere between reading it from A and writing it to B. These problems went away after moving to server-grade hardware with ECC.
Don't all Ryzen platforms support ECC? If so, you don't necessarily need to invest in server-grade hardware. Used server hardware tends to be cheap, though, but also noisy and power hungry.
I've forgotten what's the difference between different types of ECC RAM: (un)buffered, (un)registered... and whether Ryzen supports any/all of those types.
If we go even further (i.e. extreme), some special hardware has redundant CPU cores. Basically every instruction is calculated on both cores and the results need to match. Not sure if this can be achieved in software (kernel patch)... Anyway, those kinds of setups are mainly made for banks etc... ;) Probably not anything people here might need.

Akkara wrote:
I use regular hard-drives and a hot-swap rack. As the main arrays get upgraded, the previous drives are re-purposed for backup use. Eventually the oldest of those get retired.
I have the exact same setup and methods. For example, one of my drives was otherwise ok but the head parking count was quite high. Not too high for a backup disk which would spin up once a week or even once a day. So it was time for it to retire and become a backup disk.
All in all, the method is a pretty inexpensive way of keeping backups.

Then there are also these LTO backup tapes. I looked around a bit more. The HP StorageWorks Ultrium 448 seems to be the cheapest and most common used LTO tape drive: $20-$30 without shipping. It can store a maximum of 400GB on a single tape compressed, 200GB uncompressed. I've not yet dug into what the downside of the uncompressed format is... no redundancy?


Akkara wrote:
General Request: please keep the politics and the snark to a minimum on this thread. It is a legitimate question and I want to see it get the well-considered thought it deserves. There are plenty of other threads here where you all can air out that nonsense.
Thank you.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
Akkara
Administrator
Administrator


Joined: 28 Mar 2006
Posts: 6494
Location: &akkara

PostPosted: Thu Apr 26, 2018 9:11 pm    Post subject: Reply with quote

Zucca wrote:
Don't all Ryzen platforms support ECC? If so, you don't necessarily need to invest in server-grade hardware.

Do they? I had been meaning to check out the Ryzen family, but haven't gotten there yet. If so, that is good news.

Quote:
I've forgotten what's the difference between different types of ECC RAM: (un)buffered, (un)registered...

Registered just means there's a hardware address latch between the address input to the module, and the chip address lines. It reduces loading on the memory interface and allows more memory to be attached without slowing the memory clock. But it introduces an additional cycle of latency in every mem-op.

Buffered memory has a "buffer" (essentially, an amplifier) between the memory interface and the chips. It also reduces loading and since it isn't a latch it has the potential to operate without that additional cycle of latency. However it still introduces some delay so might not be able to operate as fast as unbuffered.

Unbuffered/unregistered is just a straight connection to the chip pins, so is free of these delays. But it loads the memory interface more. And the loading slows things down since it needs longer for the signals to stabilize.

(More or less... I think the larger "unbuffered" modules might have something of an amplifier anyway; and I think there's also some overlap, some use the term "buffered" when it is actually a latch in there.)

Depending how much memory is in the system, one of these three options is the best. And "a lot of" memory isn't that hard to reach: I've seen motherboards that can't run the fastest speeds if both RAM slots are filled.
_________________
Humility means having to eat less crow when you are shown to be wrong.
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Thu Apr 26, 2018 9:47 pm    Post subject: Reply with quote

Akkara wrote:
Zucca wrote:
Don't all Ryzen platforms support ECC? If so, you don't necessarily need to invest in server-grade hardware.

Do they? I had been meaning to check out the Ryzen family, but haven't gotten there yet. If so, that is good news.
It seems it's yes and no. BTW... if you're adventurous you can try to mount an EPYC on a Threadripper motherboard. I've heard it works on some motherboards with reflashed BIOS/UEFI. But I still don't see much reason to do so, unless you specifically need a Threadripper motherboard: the EPYC will cost you a lot, so going for an EPYC motherboard at the same time doesn't raise the price much in comparison...

Oh well. I'll dig more into ECC memory when I'm about to buy it.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
notageek
Tux's lil' helper
Tux's lil' helper


Joined: 05 Jun 2008
Posts: 131
Location: India

PostPosted: Thu Apr 26, 2018 10:00 pm    Post subject: Reply with quote

I posted something about JBODs, which probably isn't what you're looking for. So, here's a plug: https://aws.amazon.com/glacier/

If you're willing to pay money, it's probably better to let somebody else do it for you.
_________________
"Defeat is a state of mind. No one is ever defeated, until defeat has been accepted as a reality." -- Bruce Lee
Bones McCracker
Veteran
Veteran


Joined: 14 Mar 2006
Posts: 1605
Location: U.S.A.

PostPosted: Fri Apr 27, 2018 12:06 am    Post subject: Reply with quote

Not true. The one time I needed to recover from a backup service (a top brand you all know), they couldn't deliver. Behind the scenes, they were actually reselling somebody else's backup service, and behind that scene, those people were using somebody else's storage service. By the time I got down to the bone, I'm talking to some clowns in India who are pretending they can't speak English very well, and the various layers are pointing the finger at each other. The data was corrupted, and they were going to cover their legal asses by showing "good faith" and hire some data recovery people (probably more Indians pretending they know what they're doing), but that's going to take weeks.

They refunded years worth of fees, but that doesn't come close to the damage that was done. The silver lining was that it forced me to try something else, which worked. But, I'll never believe the marketing bullshit for storage again. You're better off buying liability insurance and keeping a good lawyer on retainer.
_________________
patrix_neo wrote:
The human thought: I cannot win.
The ratbrain in me : I can only go forward and that's it.
notageek
Tux's lil' helper
Tux's lil' helper


Joined: 05 Jun 2008
Posts: 131
Location: India

PostPosted: Fri Apr 27, 2018 12:09 am    Post subject: Reply with quote

Yeah, I know. It sucks to be you.

But, if you really must get a job done well, you must do it yourself.
_________________
"Defeat is a state of mind. No one is ever defeated, until defeat has been accepted as a reality." -- Bruce Lee
Bones McCracker
Veteran
Veteran


Joined: 14 Mar 2006
Posts: 1605
Location: U.S.A.

PostPosted: Fri Apr 27, 2018 1:13 am    Post subject: Reply with quote

Or at least get it done in a civilized country.
_________________
patrix_neo wrote:
The human thought: I cannot win.
The ratbrain in me : I can only go forward and that's it.
R0b0t1
Apprentice
Apprentice


Joined: 05 Jun 2008
Posts: 255

PostPosted: Fri Apr 27, 2018 7:55 am    Post subject: Reply with quote

Zucca wrote:
Then there are also these LTO backup tapes. I looked around a bit more. The HP StorageWorks Ultrium 448 seems to be the cheapest and most common used LTO tape drive: $20-$30 without shipping. It can store a maximum of 400GB on a single tape compressed, 200GB uncompressed. I've not yet dug into what the downside of the uncompressed format is... no redundancy?
The compression seems to be done by the drive firmware itself (on the very high-end drives), or perhaps by tape authoring software. I'm only going to refer to uncompressed size, as I would be writing encrypted data. By performing compression before encryption it might be possible to meet the advertised capacities, likely with a similar benefit on hard drives or Blu-ray disks.
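
The reason encrypted data should be counted at uncompressed size can be shown with a small experiment. In this sketch, random bytes from `os.urandom` stand in for ciphertext (an assumption: good ciphertext is statistically close to random), and zlib stands in for whatever compressor the drive or software actually uses.

```python
import os
import zlib

SIZE = 1 << 16  # 64 KiB sample

# Highly repetitive data as a stand-in for compressible plaintext.
plaintext = (b"archive me " * (SIZE // 11 + 1))[:SIZE]
# Random bytes as a stand-in for already-encrypted data.
ciphertext_like = os.urandom(SIZE)


def ratio(data):
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


print(f"plaintext ratio:       {ratio(plaintext):.3f}")
print(f"ciphertext-like ratio: {ratio(ciphertext_like):.3f}")
```

The repetitive sample shrinks to a small fraction of its size, while the random sample does not shrink at all, which is why any compression has to happen before encryption.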

A quick comparison chart:
Code:
1TB 2.5" for $ 45: 0.045  US$ / GB
2TB 2.5" for $ 85: 0.0425 US$ / GB
3TB 2.5" for $135: 0.045  US$ / GB
4TB 2.5" for $130: 0.0325 US$ / GB
5TB 2.5" for $188: 0.0376 US$ / GB
Code:
 1TB 3.5" for $ 45: 0.045    US$ / GB
 2TB 3.5" for $ 60: 0.03     US$ / GB
 3TB 3.5" for $ 75: 0.025    US$ / GB
 4TB 3.5" for $ 94: 0.0235   US$ / GB
 6TB 3.5" for $160: 0.026667 US$ / GB
 8TB 3.5" for $193: 0.024125 US$ / GB
10TB 3.5" for $350: 0.035    US$ / GB *
12TB 3.5" for $440: 0.036667 US$ / GB *
Code:
LTO-1,  100GB for $ 15: 0.15     US$ / GB
LTO-2,  200GB for $ 12: 0.06     US$ / GB
LTO-3,  400GB for $ 15: 0.0375   US$ / GB
LTO-4,  800GB for $ 19: 0.02375  US$ / GB
LTO-5,  1.5TB for $ 26: 0.017333 US$ / GB
LTO-6,  2.5TB for $ 27: 0.0108   US$ / GB
LTO-7,  6.0TB for $ 80: 0.013333 US$ / GB
LTO-8, 12.0TB for $180: 0.015    US$ / GB
Code:
BD-R,     25GB for $0.42: 0.0168 US$ / GB
BD-R DL,  50GB for $1.58: 0.0316 US$ / GB
BD-R XL, 100GB for $4.80: 0.048  US$ / GB

Unless you can get an LTO-6 drive, it looks like Blu-ray is cost effective. However, tape prices might improve, and you can reuse both hard drives and tapes. The marked capacities (with asterisks) are only available with manufacturer provided data recovery insurance.
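
The per-GB figures in the chart can be recomputed with a few lines. This sketch uses a handful of rows copied from the chart, treats 1TB as 1000GB as the chart does, and considers media price only (no drive cost); `usd_per_gb` is just an illustrative helper name.

```python
def usd_per_gb(price_usd, capacity_gb):
    """Media price divided by raw capacity, in US$ per GB."""
    return price_usd / capacity_gb


# (label, price in US$, capacity in GB) taken from the chart above.
media = [
    ('2.5" HDD 4TB', 130.00, 4000),
    ('3.5" HDD 4TB', 94.00, 4000),
    ("LTO-6 2.5TB", 27.00, 2500),
    ("BD-R 25GB", 0.42, 25),
    ("BD-R XL 100GB", 4.80, 100),
]

# Print the cheapest medium first.
ranked = sorted(media, key=lambda m: usd_per_gb(m[1], m[2]))
for name, price, gb in ranked:
    print(f"{name:<14} {usd_per_gb(price, gb):.4f} US$/GB")
```

Adding the drive price spread over the amount of data you expect to store would change the ranking for small archives.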


ECC is very important. Are there any small form factor components that support ECC?
Zucca
Veteran
Veteran


Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

PostPosted: Fri Apr 27, 2018 9:06 am    Post subject: Reply with quote

If you're lucky you can get an LTO-5 drive for around $300-$400. However, LTO-4 drives can go for $50. One or two LTO-4 tapes could cover my most valuable data. The ability to rewrite is a big bonus, for me at least.
That said... sticking a hard drive into my eSATA dock still seems the easiest way to go, since I already have useless, but working, hard drives lying around.
If I had a spare 5.25" slot on my server, I'd buy an LTO-4 drive just to test out its usage and whether it's at all convenient.

R0b0t1 wrote:
ECC is very important. Are there any small form factor components that support ECC?
Small form factor? Motherboards? There are some µATX AM4 boards that do support ECC.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
cokey
Advocate
Advocate


Joined: 23 Apr 2004
Posts: 3343

PostPosted: Fri Apr 27, 2018 5:43 pm    Post subject: Reply with quote

R0b0t1 wrote:
Zucca wrote:
Then there are also these LTO backup tapes. I looked around a bit more. The HP StorageWorks Ultrium 448 seems to be the cheapest and most common used LTO tape drive: $20-$30 without shipping. It can store a maximum of 400GB on a single tape compressed, 200GB uncompressed. I've not yet dug into what the downside of the uncompressed format is... no redundancy?
The compression seems to be done by the drive firmware itself (on the very high-end drives), or perhaps by tape authoring software. I'm only going to refer to uncompressed size, as I would be writing encrypted data. By performing compression before encryption it might be possible to meet the advertised capacities, likely with a similar benefit on hard drives or Blu-ray disks.

A quick comparison chart:
Code:
1TB 2.5" for $ 45: 0.045  US$ / GB
2TB 2.5" for $ 85: 0.0425 US$ / GB
3TB 2.5" for $135: 0.045  US$ / GB
4TB 2.5" for $130: 0.0325 US$ / GB
5TB 2.5" for $188: 0.0376 US$ / GB
Code:
 1TB 3.5" for $ 45: 0.045    US$ / GB
 2TB 3.5" for $ 60: 0.03     US$ / GB
 3TB 3.5" for $ 75: 0.025    US$ / GB
 4TB 3.5" for $ 94: 0.0235   US$ / GB
 6TB 3.5" for $160: 0.026667 US$ / GB
 8TB 3.5" for $193: 0.024125 US$ / GB
10TB 3.5" for $350: 0.035    US$ / GB *
12TB 3.5" for $440: 0.036667 US$ / GB *
Code:
LTO-1,  100GB for $ 15: 0.15     US$ / GB
LTO-2,  200GB for $ 12: 0.06     US$ / GB
LTO-3,  400GB for $ 15: 0.0375   US$ / GB
LTO-4,  800GB for $ 19: 0.02375  US$ / GB
LTO-5,  1.5TB for $ 26: 0.017333 US$ / GB
LTO-6,  2.5TB for $ 27: 0.0108   US$ / GB
LTO-7,  6.0TB for $ 80: 0.013333 US$ / GB
LTO-8, 12.0TB for $180: 0.015    US$ / GB
Code:
BD-R,     25GB for $0.42: 0.0168 US$ / GB
BD-R DL,  50GB for $1.58: 0.0316 US$ / GB
BD-R XL, 100GB for $4.80: 0.048  US$ / GB

Unless you can get an LTO-6 drive, it looks like Blu-ray is cost effective. However, tape prices might improve, and you can reuse both hard drives and tapes. The marked capacities (with asterisks) are only available with manufacturer provided data recovery insurance.


ECC is very important. Are there any small form factor components that support ECC?
Code:
Amazon Glacier Pricing
Pay only for what you use. There is no minimum fee.
Storage Pricing

$0.004 per GB / Month

AWS Free Usage Tier
As part of the AWS Free Usage Tier, you can retrieve up to 10 GB of your Amazon Glacier data per month for free. The free tier allowance can be used at any time during the month and applies to Standard retrievals.
It's cheaper to use AWS; it's also hassle-free and guaranteed.
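
Since the Glacier price is per month while the media prices in the chart are one-time, a rough way to compare them is to ask after how many months the cumulative storage fee reaches the one-time media cost. The sketch below assumes the $0.004 per GB per month figure quoted above stays constant and ignores retrieval fees, drive purchase, extra copies, power and media replacement; `months_to_match` is a made-up name.

```python
GLACIER_USD_PER_GB_MONTH = 0.004  # storage price quoted above


def months_to_match(one_time_usd_per_gb, monthly_usd_per_gb=GLACIER_USD_PER_GB_MONTH):
    """Months of storage fees needed to equal a one-time media cost per GB."""
    return one_time_usd_per_gb / monthly_usd_per_gb


# One-time media prices per GB copied from the comparison chart.
for name, cost in [("LTO-6 tape", 0.0108), ("BD-R 25GB", 0.0168), ('3.5" HDD 4TB', 0.0235)]:
    print(f"{name:<13} matched after {months_to_match(cost):.1f} months")
```

Under these assumptions a single copy on a 4TB hard drive is matched after about six months of fees, so which option is cheaper depends mostly on how long the data is kept and how many local copies and drives are needed.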
_________________
"Sex: breakfast of champions" - James Hunt
All times are GMT
Page 1 of 2

 