
How I Store My 1’s and 0’s: ZFS + Bargain HP Microserver = JOY – mockyblog

I have a large collection of 0’s and 1’s and am eager not to lose them. It’s a very large collection – storing them in the cloud would cost a few hundred dollars a month and even if I paid that, the pipe between my house and the cloud isn’t fat enough to access them efficiently.

So instead – like most nerds – I’ve had a motley progression of servers in my house for the last 15 years. They were built out of spare parts, held together with gaffer tape and blind faith.

Things are better these days and gaffer tape is usually not required. A few friends recently asked how to do home servers the right way. While 2TB of data copies onto my new baby, I’ll explain how I’m doing it.



So what’s it for? I’m not building one of my kickass web hosting platforms here. It’s for storing those 1’s and 0’s, serving them back up at a reasonable speed and taking reasonable precautions not to lose the data. The emphasis here is on reasonable – we want moderate reliability but it’s built down to a price and the occasional day’s downtime won’t ruin anyone’s business.

As well as storing the 1’s and the 0’s we might also wish to run a handful of services and a couple of VM’s.

It is desirable for the system not to need much maintenance. Anything more than one day a year is too much. But if we’re going to run regular services we need a fairly normal OS so let’s stick with an LTS Ubuntu release.



What about the hardware?

HP do a lovely little machine called the Proliant Microserver. It’s dirt cheap. Usually about £250 but buy in the UK right now and they’ll give you a £110 rebate cheque. Even without the discount it’s good value; buy one now. Mine’s a year old and still going strong.

Mine is dustier

Nothing amazing but it isn’t meant to be. Comes with 2GB RAM (plenty for file serving, bump it up to 8 if you want to run a few VM’s) and an uninteresting 250GB disk. Dual-core low power x86 64-bit CPU. But what really makes it interesting is…

The thing is just covered in ports. Mine has so many devices connected up it looks like an octopus.

Mine looks like this (from Wikimedia Commons)

HP have taken a lot of trouble to get the microserver right – inside the door you’ll even find a torx key for working on it and as many extra screws as you could ever need. The only serious limitation is memory: it won’t take more than 8GB. Probably they are doing this to keep hardcore business users buying more expensive machines and subsidize giving these things away to me.

Edit: a lot of commenters have pointed out that you really should be using ECC RAM. Here’s how to find out if an existing server uses it.
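
One way to check, as a rough sketch: on most Linux boxes `dmidecode` reports an "Error Correction Type" field for the memory array. The helper name `ecc_type` is my own invention, and I'm assuming `dmidecode` is installed and that your BIOS fills in the DMI tables honestly (not all do).

```shell
#!/bin/sh
# Print the "Error Correction Type" value(s) found in dmidecode memory
# output supplied on stdin. ecc_type is a hypothetical helper name.
ecc_type() {
    sed -n 's/^[[:space:]]*Error Correction Type:[[:space:]]*//p'
}

# Typical use (reading the DMI tables needs root):
#   sudo dmidecode --type memory | ecc_type
```

A value such as "None" suggests plain non-ECC RAM; anything mentioning ECC suggests the board at least thinks it has the good stuff.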



What OS?

Keep it vanilla. The latest Ubuntu LTS Server you can get your hands on. LTS because you want this thing to sit behind your sofa for aeons before you upgrade, Ubuntu because you want a normal Linux with all the trimmings. Great as they are, if you go for some cut-down fileserver distro I guarantee you’ll miss Ubuntu’s sprawling package repositories.

Why not Debian or CentOS? Cool, go that way if you prefer them. But personally I am in luuuuuurve with the Ubuntu ZFS PPA. More on that later…



OS Disk

This will offend the purists but I’ve come to think of booting Linux from a software RAID as a pain. Maybe it’s me but something always seems to go wrong eventually. Instead I take a more pragmatic approach: a single, cheap, sacrificial OS disk (use a small SSD or an old 2.5″ laptop drive to save power?) and a RAID for the stuff that matters. Do it this way and whatever happens to the RAID you’ll almost always have an OS you can boot up to fix it. Hell, if your OS is so screwed it won’t boot you can trash the disk and make a new one in an hour. Just don’t keep anything you care about on there.

Here’s a tip: buy a pair of cheap drive mounting rails and stick your main OS disk inside the space meant for the DVD drive. You’ll find an extra SATA port for it on the motherboard. Jamming it in there now will save you a total rebuild later when you want to add that fourth data drive.



Data Tank

So you have a simple Ubuntu server running. Great. Ain’t gonna fit a bazillion 1’s and 0’s in that though, are we.

At the time of writing the sweet spot for disk pricing is 2-3TB. I’ve gone for a pair of 3’s since they’ll last a little longer (before obsolescence I mean, not MTBF) and leave more space to expand. Remember this thing’s going to run 24×7: you’re footing the power bill for it so go for low-power disks. I bought a pair of WD Caviar Green 3TB’s.

Not my WD Caviar Green. They don’t work with the lid off.

Only a pair of disks?

Normally we fear expanding a home server because resizing a RAID always means copying terabytes of data off somewhere else, wiping the lot and constructing a new one including the original disks. This officially sucks – who has the same amount of storage again lying around? The result of this suckage is that we built our home servers to last for years and spent most of our time staring at acres of expensive, unused space which would have been half the price if we could only have put off buying until we needed it.

But it’s 2012 and we don’t need to do that shit anymore. Enter Sun Microsystems’ parting gift to the world: ZFS.

The Linux ports of ZFS have been around a while now but every time I’ve considered it before, RAID-Z – what we need to make a redundant, expandable volume – wasn’t yet working. Now it is. It’s 2012 (I said that twice) and finally we can have redundant, expandable storage that doesn’t need a painful rebuild whenever it grows.

On the Ubuntu 10.04 I’m running this was trivial to set up. Follow the instructions to get the ZFS PPA (or compile it if you’re a masochist) and you’re mostly done. Be sure to read the FAQ’s notes on adding your disks to the pool by ID rather than /dev/sdX notation – this means your pool will still work if you ever shuffle your disks around.
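
To find those IDs, something like the sketch below should do. It lists the stable names under /dev/disk/by-id alongside the /dev/sdX device each one currently points at, skipping the per-partition links. The function name `list_whole_disks` is made up for illustration, and `readlink -f` assumes GNU coreutils (which Ubuntu has).

```shell
#!/bin/sh
# List whole-disk entries in a by-id directory together with the device
# each one resolves to, skipping partition links (names with "-partN").
# list_whole_disks is a hypothetical helper; the directory argument
# defaults to /dev/disk/by-id.
list_whole_disks() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -e "$link" ] || continue          # empty dir or dangling link
        case "$link" in
            *-part[0-9]*) continue ;;       # partition, not a whole disk
        esac
        printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
    done
}

# Typical use:
#   list_whole_disks
```

The names on the left are what you feed to `zpool create`; the ones on the right are just there so you can tell which physical disk is which today.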

zpool create tank raidz /dev/disk/by-id/abcdef /dev/disk/by-id/ghijkl
# as if by magic /tank comes into being

zfs create tank/photos 
# /tank/photos appears.  create more to suit.

# Then...
# 1) set up accounts & network shares
# 2) set up other volumes if required
# 3) set up monitoring to mail you when a disk fails

…and that’s it. You can now relax safe in the knowledge that your 1’s and 0’s are being held by a pre-1.0 version of a filesystem invented by a dead company. Check out this excellent guide when you need to know more of the ZFS administration commands.
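
Step 3 above (monitoring) can be as simple as a cron job. Here's a minimal sketch, not a polished tool: it leans on `zpool status -x`, which prints "all pools are healthy" when there is nothing to report. The function names and the `ADMIN` variable are my own placeholders, and I'm assuming a working `mail` command on the box.

```shell
#!/bin/sh
# Succeeds (exit 0) when the `zpool status -x` output on stdin reports
# anything other than the all-clear message.
pool_needs_attention() {
    ! grep -q 'all pools are healthy'
}

# Hypothetical cron job body. ADMIN is a placeholder address and
# defaults to the local root mailbox.
check_pools() {
    ADMIN=${ADMIN:-root}
    status=$(zpool status -x 2>&1)
    if printf '%s\n' "$status" | pool_needs_attention; then
        printf '%s\n' "$status" | mail -s "ZFS problem on $(hostname)" "$ADMIN"
    fi
}

# Typical use, from a script run hourly by cron:
#   check_pools
```

Note that this also mails you if the pool has vanished entirely, which is probably what you want.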



So Why is This Awesome?

  1. You’re not dependent on expensive hardware or crummy faux-hardware RAID
  2. OMFG deduplication! But remember the dedupe table costs you performance or RAM.
  3. OMFG snapshotting! Rotate a few nightly snapshots and it’s easy to retrieve mistakenly deleted stuff.
  4. Since the costs of adding space are low there’s no need to build for the amount of data you’ll have in two years’ time. No more empty disks spinning away wasting your money.
  5. Disks in a RAID-Z need not be the same size, though ZFS treats every disk in the group as if it were only as big as the smallest one. Want to build a RAID-Z out of 4x old 500GB disks and 1x new 2TB one? Knock yourself out, just expect the 2TB disk to count as 500GB until its small siblings are replaced.
  6. Corollary to this: you can grow your tank by throwing out the smallest disk and replacing it with whatever’s the new sweet spot for drive pricing. And doing this does not require copying all your data out then making a new array. What you can’t do yet is add an extra disk to an existing RAID-Z: “It is not possible to add a disk as a column to a RAID-Z, RAID-Z2, or RAID-Z3 vdev. This feature depends on the block pointer rewrite functionality due to be added soon.”
  7. You can stick eyes on the front and ears made out of folded post-its and it looks like a cat.
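
The nightly snapshot rotation from point 3 might look something like this. It's a sketch under assumptions: the `nightly-` naming scheme, the function names and the keep count are all mine, not anything ZFS prescribes. The `zfs snapshot`, `zfs list` and `zfs destroy` commands themselves are the standard ones.

```shell
#!/bin/sh
# Read snapshot names (oldest first) on stdin and print all but the
# newest $1 of them, i.e. the candidates for deletion.
snapshots_to_prune() {
    keep=$1
    awk -v keep="$keep" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}

# Hypothetical nightly job: snapshot one dataset, then destroy all but
# the newest $2 snapshots that follow the nightly-YYYY-MM-DD scheme.
rotate_snapshots() {
    dataset=$1
    keep=$2
    zfs snapshot "$dataset@nightly-$(date +%Y-%m-%d)"
    zfs list -H -t snapshot -o name -s creation -r "$dataset" \
        | grep "^$dataset@nightly-" \
        | snapshots_to_prune "$keep" \
        | while read -r snap; do zfs destroy "$snap"; done
}

# Typical use, from a nightly cron job:
#   rotate_snapshots tank/photos 7
```

Deleted something by mistake? The snapshots are browsable under the hidden `.zfs/snapshot` directory at the root of each dataset, e.g. /tank/photos/.zfs/snapshot.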


Awesome. Srsly. And if any of the people who worked to bring the joy of ZFS to Linux read this – dude, I owe you a beer.
