Future-Proof Backup Strategy

Mechanical spinning disks and modern SSDs are fallible components. Over the past 10 years, 3 of the dozen hard disks I have owned have failed. That is a far cry from the failure rates announced by manufacturers.

So obviously you have backups. But are they up to date? Are you sure your backups work? What if your house burns or someone steals your equipment? What if your primary disk gets corrupted?

It does not take much for things to go wrong. After each hard drive failure, I either barely avoided catastrophe or lost a little bit of data. Here are a few things I have learned in the process:

  1. One copy is not enough. You may think you will have time to duplicate the data after one of the disks crashes, but think again. Maybe the backup has been corrupted, and a secondary backup would be useful. Maybe your backup drive will die too. As unlikely as it sounds, it happened to me on my old PowerMac G4. I had two disks in software RAID-1. One disk failed, and a few days later the second disk died while mirroring the data onto the brand new replacement I had inserted.

  2. You need off-site backups. In case your house burns. In case someone breaks in and steals your computer and your backup drive. In case a power surge fries all the equipment plugged into the same socket.

  3. Encrypt your data. Not because the government wants to snoop on you. They can use a $5 wrench or the law. But because someone might steal your gear or, if you use online storage, the provider's security could be breached. And even if you have nothing to hide, the data contained on your computer can be used for a lot of nefarious purposes, such as identity theft.

  4. Use incremental backups instead of mirroring. Not just to save time, but also to avoid propagating errors from your primary disk. Most file systems do not include checksums of the data. So when you copy a file whose data has been corrupted, you copy the same errors to your backup disk. On a checksummed filesystem, you would normally get an I/O error instead. I have over 100GB of pictures, and one day I discovered that a handful (10 files or so) had been corrupted. One of my backups (the mirrored one) was also corrupted but, fortunately, the incremental backups were not.
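
On a filesystem without checksums, the detection described in point 4 can be approximated by hand. The following is a minimal sketch, not a tool from my actual setup: it records a SHA-256 digest for every file in a directory, then later reports files that are missing or whose content has changed. The function names (`sha256_of`, `build_manifest`, `find_damaged`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos or videos do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose content no longer matches."""
    damaged = []
    for rel_path, expected in sorted(manifest.items()):
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel_path)
    return damaged
```

This suits files that are never supposed to change, such as a photo archive: build the manifest once, store it outside the archive, and run `find_damaged` before each backup. A file that was edited on purpose would also be reported, so the manifest has to be rebuilt after legitimate changes.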

Here is what I currently use. This is Mac OS X specific.

  1. I use FileVault 2 on all my drives. FileVault 2 is real full disk encryption, and has nothing to do with its predecessor of the same name, which was an abomination. It’s very secure even though Apple has not published formal specifications.

  2. I use Time Machine. This is a no-brainer. It just works and there is no better way to have up-to-date backups.

  3. I use SuperDuper! with the smart update option instead of the regular cloning, to avoid copying mistakes from the primary. I have a Newer Tech external Voyager and a set of 2 internal hard drives. I always keep one at home and one off-site, at the office. I typically do a backup once a month, or before every Mac OS X update. If things go wrong, I can directly boot from the backup.

  4. I use Arq and Amazon Glacier. This is a little bit like insurance, a solution of last resort in case everything else fails. I have about 500GB of data backed up there. To limit the number of requests and pay less, I make only one backup every month. I end up paying about $5 per month.

  5. I have a local Linux miniserver. This is my NAS. I use it to back up super critical data like photos and family videos, as I do not have sufficient storage to keep more than a couple of years on the iMac. I use SnapRAID to provide redundancy on my 4-disk array. SnapRAID would allow me to reconstruct the data if a single disk in the array were to die. I monitor the SMART status of all the disks with smartmontools, and I have a spare disk I can swap in at any moment. A lot of the critical data is also backed up to Amazon Glacier, before being uploaded to the NAS.
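
The SMART monitoring in point 5 can be scripted. The sketch below is an illustration only, not the configuration running on the NAS: it calls `smartctl -H` (the overall health check from smartmontools) and looks for the phrases that smartctl is understood to print for a healthy disk. Those phrases, the function names and any device path passed in are assumptions to verify against the installed smartmontools version.

```python
import subprocess

# Assumed healthy-disk phrases in `smartctl -H` output; check them
# against the output of your own smartmontools version.
HEALTHY_MARKERS = (
    "self-assessment test result: PASSED",  # ATA / SATA disks
    "SMART Health Status: OK",  # SCSI / SAS disks
)


def smart_health_ok(report: str) -> bool:
    """Return True if the report text contains a known healthy marker."""
    return any(marker in report for marker in HEALTHY_MARKERS)


def check_disk(device: str) -> bool:
    """Run `smartctl -H` on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # the report text is parsed instead of the exit status
    )
    return smart_health_ok(result.stdout)
```

Anything that is not explicitly reported as healthy, including an empty or unreadable report, counts as a failure here, which errs on the side of swapping in the spare disk too early rather than too late.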

With that kind of crazy setup, even if North Korea were to nuke Oxfordshire tomorrow, my data would be safe… but nobody would be left around to decipher it.