…in five easy steps.
So basically, I royally screwed my computer this weekend. For those of you who didn’t know, Ubuntu 7.10 (Gutsy Gibbon) came out last Thursday, and I decided that I just had to try it. I had just written an impassioned review of why Linux isn’t the superior OS for developers, but I figured that maybe this time it would be different. A lot of my headaches with Linux stem from the fact that you have to spend so much time mucking about in its uncharted guts just to get your mouse (insert peripheral here) working. In short, it’s not a very user-friendly OS. And even as a developer, I want an OS that stays out of my way and does what I ask, when I ask it to.
Naturally, I remembered how much pain I experienced the last time I tried to wipe my OS and replace it with Linux. I was back on XP then, and it took a full three days to get everything back to productive levels. Even then, things were never the same. Since I try to never repeat the same mistake twice (leaving more time for new and exciting mistakes) I decided to take some precautions first.
Linux Drive Imaging 101
So for those of you who didn’t know, it’s possible to save the data on a hard drive bit-by-bit into a file on another machine or hard drive. Once this file is in place, it’s also possible to restore the data on the drive bit-by-bit (otherwise the process wouldn’t be very useful) and achieve exactly the same state as immediately prior to sucking the image off of the drive. Colleges use this technique often for lab machines, allowing them to be altered at will without affecting the next student’s work. There are other applications, but it’s a long and dull list which would really detract from the focus of this post.
Being an enterprising computer user savvy in various forms of Linux voodoo, I decided to attack this problem myself. Now, I’ve actually built software to accomplish just this task in the past, so I figured it would be a cakewalk. After about 5 seconds of deliberation, I settled on the GNU/Linux utility dd as the foundation for my scheme. dd is naturally cheaper than commercial offerings like Ghost or Altiris, and it also has the significant advantage of being installed on just about every Linux system, including live CDs such as Knoppix or the Ubuntu installer. This will work to our advantage.
dd is just about the simplest command in the GNU suite. All it does is read one file bit-by-bit and spit the output into another file. If no input file is specified, stdin is used; if no output file is specified, stdout is used. Thus, dd can be useful when chained in an array of pipes. In its simplest form, dd just passes bits from stdin to stdout, functioning as a pipe:
cat file.txt | dd > copy.txt # how useless can we get? (a separate output name, so the redirect doesn't truncate file.txt before cat reads it)
Of course, what we’re trying to do here is actually get data off of a hard drive and squirt it into a file. Does this actually help us? Linux 101: everything is a file. And when I say everything, I mean disks, peripherals, PCI devices, network devices, everything. This means that if we really wanted to, we could use dd to copy the entire contents of our hard drive and send it to our printer! (though I’m not entirely sure why we would want to) More usefully, this means that it’s trivial to copy data from a drive into a file since the drive is really a file to begin with. And what is dd but a fancy way to copy bits from one file to another?
dd if=/dev/sda of=my-drive.img # image the first sata/scsi drive into "my-drive.img"
Grand! The problem here is that when the smoke clears several hours later (dd is unfortunately quite slow, at least with its default 512-byte block size), we will have an image file exactly the size of the hard drive itself, no matter how little of the drive is actually in use. (Oh, caveat time: you should not try to do this while the computer is booted from the drive in question. Such a command should be run from a live CD or a dual-boot OS, writing onto an external drive.) The reason this file is so huge is that data is represented exactly the same both on the disk and in the image file. dd knows nothing about file systems, so it can’t tell used blocks from free ones. An entire file system, empty space and all, is being packaged within a file, bit-by-bit, and then itself being stored on another, possibly completely dissimilar file system. Naturally there’s going to be quite a bit of inefficiency.
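It’s easy to check how big a raw image comes out before committing hours to a real drive. The following is only a sketch: a scratch file (the made-up name fake-drive) stands in for /dev/sda so nothing real gets touched, and the bs=1M block size is my own assumption about a sensible value for GNU dd, not a tuned one.

```shell
#!/bin/sh
# Sketch only: "fake-drive" is a hypothetical scratch file standing in for /dev/sda.
set -eu
work=$(mktemp -d)

# Build an 8 MiB stand-in "drive" that is almost entirely empty (zeros).
dd if=/dev/zero of="$work/fake-drive" bs=1M count=8 2>/dev/null
# Write a few bytes at the start without truncating the rest of the file.
printf 'some real data' | dd of="$work/fake-drive" conv=notrunc 2>/dev/null

# Image it, using a larger block size than dd's 512-byte default.
dd if="$work/fake-drive" of="$work/my-drive.img" bs=1M 2>/dev/null

# Compare the sizes of the source and the image.
src_size=$(( $(wc -c < "$work/fake-drive") ))
img_size=$(( $(wc -c < "$work/my-drive.img") ))
echo "source: $src_size bytes, image: $img_size bytes"
```

On a real run you would swap the scratch file for the device node and run the whole thing from a live CD. Recent versions of GNU dd also accept status=progress, which is handy when the copy takes hours.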
The nice thing about most file systems is that when you read the data bit-by-bit from the drive, you will notice large blocks of repeated values or patterns. Anyone who knows anything about compression will tell you that this is a sure sign of high compressibility. Sure enough, if you pipe dd through bzip2, the resultant file will be almost exactly the size of the data on the disk. So even if you have a 250 GB hard drive, if you’ve only used 30 GB, the resulting .img.bz2 will only be 30 GB (more or less, assuming the unused space is mostly zeros rather than leftovers from deleted files). This is really nice, and while not as efficient as some systems like Ghost, this image size should do nicely for our needs. The problem here is that the bzip2 compression algorithm is insanely slow. Its use would extend the imaging process for a 60 GB drive from roughly 4 hours to well over 12, and decompression isn’t exactly quick either.
A good compromise in this situation would be to use gzip instead. Gzip is much faster than bzip2, both at compression and at decompression. Its problem is that it’s nowhere near as efficient a compression algorithm. 30 GB of data on a 60 GB disk will compress down to a roughly 45 GB image file using gzip. That’s 15 GB more than bzip2, but well worth it in terms of compression speed in my book.
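The exact trade-off depends heavily on what’s on the drive, so it’s worth measuring on a sample of your own data. Here is a rough sketch: the file sample.img is a made-up stand-in (1 MiB of random bytes playing the part of real data, followed by 7 MiB of zeros playing the part of free space), and the bzip2 step is skipped if the tool isn’t installed.

```shell
#!/bin/sh
# Sketch only: "sample.img" is a hypothetical stand-in for a slice of a drive image.
set -eu
work=$(mktemp -d)
sample="$work/sample.img"

# 1 MiB of incompressible bytes ("data") followed by 7 MiB of zeros ("free space").
head -c 1048576 /dev/urandom > "$sample"
dd if=/dev/zero bs=1M count=7 2>/dev/null >> "$sample"

# Measure the raw size and the gzip-compressed size.
raw_size=$(( $(wc -c < "$sample") ))
gz_size=$(( $(gzip -c "$sample" | wc -c) ))
echo "raw:   $raw_size bytes"
echo "gzip:  $gz_size bytes"

# bzip2 is not installed everywhere, so only measure it when it is available.
if command -v bzip2 >/dev/null 2>&1; then
    bz_size=$(( $(bzip2 -c "$sample" | wc -c) ))
    echo "bzip2: $bz_size bytes"
fi
```

This only compares sizes. To compare speed as well, run each compressor under your shell’s time command against a larger sample.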
Using the magic of bash command piping, we can accomplish the drive imaging in a single command:
dd if=/dev/sda | gzip > image-`date +%m%d%y`.img.gz
This will produce a file with a name like “image-101907.img.gz”, depending on the current date. To push the image back onto the drive, we use a little more bash piping:
zcat image-*.img.gz | dd of=/dev/sda
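Before trusting this pair of commands with a real drive, the whole round trip can be rehearsed on scratch files. This is a sketch under assumptions: fake-drive and restored-drive are made-up file names standing in for /dev/sda, and gzip -dc is used in place of zcat (the two are equivalent on GNU systems).

```shell
#!/bin/sh
# Sketch only: scratch files stand in for /dev/sda in both directions.
set -eu
work=$(mktemp -d)

# Stand-in "drive": 4 MiB of zeros with a little data at the front.
dd if=/dev/zero of="$work/fake-drive" bs=1M count=4 2>/dev/null
printf 'pretend this is a partition table' |
    dd of="$work/fake-drive" conv=notrunc 2>/dev/null

# Image: the same dd | gzip pipeline, pointed at the scratch file.
dd if="$work/fake-drive" bs=1M 2>/dev/null | gzip > "$work/image.img.gz"

# Restore: decompress and push the bits onto a second scratch "drive".
gzip -dc "$work/image.img.gz" | dd of="$work/restored-drive" bs=1M 2>/dev/null

# cmp -s is silent and exits 0 only if the two files match byte for byte.
if cmp -s "$work/fake-drive" "$work/restored-drive"; then
    result="identical"
else
    result="DIFFERENT"
fi
echo "restore check: $result"
```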
All I needed to do was use these two commands, imaging onto an NFS share on my server, and I could back up my full OS image at will. It’s incredibly simple, and simple plans always work…right?
Take Warning Dear Friends
Unfortunately (for me), I didn’t account for the fact that hard drives inevitably fail, usually at the worst possible moment. I imaged my Windows Vista installation, then proceeded to install Ubuntu 7.10. Everything went smoothly in the install, but when it came to the tasks I perform on a daily basis, the stability just wasn’t up to snuff. After about six hours of fiddling with settings, kernel recompilations and undocumented package requirements (hint: if Compiz doesn’t work with your ATI card out of the box, manually apt-get install the xserver-xgl package), I decided to revert to Windows. The frustration just wasn’t worth it (more about this in a future post).
So I booted my computer back into the live CD, NFS mounted the server, imaged the Linux install (just in case) and started the reimaging process only to discover that the server’s 1 TB hard drive had corrupted the image file. I was quite literally horrified. Fortunately I had anticipated something like this when creating the image, so I had cp’d the image file over onto a separate RAID array prior to overwriting the drive. I juggled the NFS mounts and tried imaging from that file only to discover that it was incomplete! It seems that the image file had been corrupted on the disk as it was created, meaning that it wasn’t a valid image file when I copied it to the separate drive.
Needless to say, I was quite upset. I don’t even have a DVD copy of Windows Vista (it was preinstalled on my computer), so I have to shell out $15 and wait a week for Microsoft to ship me a replacement disc. In the meantime, I can’t just do without my computer, so I fired up dd again and pushed the Linux image back onto my drive. Of course, not being the OS I really wanted, it worked perfectly…
All of my data was backed up, so I haven’t really lost anything but the time and frustration of having to do a clean reinstall of Windows on my system (a significantly more challenging task than an install of Linux). The moral of the story is: if you have data which you’re really going to need, even if it’s only a few hours down the road, make absolutely sure that it’s on a stable drive. What’s more, if you do store it redundantly (as I attempted to), compare checksums to make sure the redundant copy actually worked. If you catch yourself spending only a few minutes verifying the correctness and integrity of a critical backup, little red flags should go up. In short, don’t try this at home. I wish that I hadn’t.
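For what it’s worth, here is roughly the check I should have run before wiping anything. It’s a sketch with assumptions: the tiny gzip file and the raid directory are made-up stand-ins for the real image and the RAID array, and sha256sum is assumed to be available (it ships with GNU coreutils). Note that matching checksums only prove the copy equals the original; it’s gzip -t that notices an image which was truncated or mangled as it was written.

```shell
#!/bin/sh
# Sketch only: a tiny gzip file stands in for a multi-gigabyte drive image,
# and the "raid" directory is a hypothetical stand-in for the second array.
set -eu
work=$(mktemp -d)

# Stand-in for a freshly written image file.
printf 'pretend this is a drive image' | gzip > "$work/image.img.gz"

# Redundant copy onto the "other" storage.
mkdir "$work/raid"
cp "$work/image.img.gz" "$work/raid/image.img.gz"

# Check 1: gzip -t reads the whole archive and fails on truncation or corruption.
if gzip -t "$work/raid/image.img.gz" 2>/dev/null; then
    archive_ok="yes"
else
    archive_ok="no"
fi

# Check 2: the copy must hash to the same value as the original.
sum_orig=$(sha256sum < "$work/image.img.gz" | cut -d' ' -f1)
sum_copy=$(sha256sum < "$work/raid/image.img.gz" | cut -d' ' -f1)

if [ "$archive_ok" = "yes" ] && [ "$sum_orig" = "$sum_copy" ]; then
    verdict="copy verified"
else
    verdict="DO NOT WIPE THE DRIVE"
fi
echo "$verdict"
```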