We’ve covered the physical aspects of a hard disk drive; tonight we’ll touch on the way data is organized on the drive by covering those two most important topics: keeping secrets and ferreting other people’s out.
We’ll start by deleting files. Let’s say that I’ve got a backlog of old and worn-out memes to purge. That’s no problem; you just move them from the exquisitely detailed and organized archive of these things into the trash can. But that doesn’t actually erase anything. Bill Gates, knowing that we mere mortals are flawed and prone to regret, keeps your trashed files around in case your stale jokes are, someday in the future, called for again. But we’re stronger than that. So we empty the trash folder. (Or, pro tip: on a Windows box, if you hold down ‘Shift’ as you delete a file, it doesn’t go to the trash at all; it’s deleted right away.)
In fact, I’ll do you one better. Let’s say that I’ve got an eight-year brony habit that I have to purge before my friends and relatives discover it. Nothing for it but to format the drive. That’ll make sure those meddlesome snoops can’t read my deepest, darkest, pinkest and fluffiest secrets, right? Well, not if they’re determined enough. Let’s take a bit of time to talk about how data gets organized on hard drives.
We’ve seen how you write individual bits on the platter and whatnot, but we lightly stepped over a critical question. How does your machine know where to get its bits? To answer that, let me step into the realm of analogy.
Organizing Your Files for Fun and Profit
Let’s say your hard drive is your local public library, or better yet, my local public library; it’s better than yours because I’m familiar with it. Now let’s say you’ve got a particular file you want to get out of it; you’re going to give the Federalist Papers another go. So you go through the library, one book at a time, and read it until you find the Federalist Papers. Hmm… maybe not the quickest way to go about it. Okay, you scan the titles one by one to find the book. (Green Eggs and Ham. Goodnight Moon… alphabetically, we’re getting closer but still going to take you a while.) Eventually, you work up the nerve to talk to the stupendously sexy librarian (hey, my analogy, we do things my way), and she directs you to the card catalog. The card catalog! Why didn’t I think of that?
Back to the hard drives. As you’ve probably guessed by now, the library is the drive and the books are the files on it. The card catalog tells you where the books are. Better to think of it in terms of one of the old-time cabinets with actual cards in it; if you’re thinking of a digital catalog you could slip into recursion. On a drive, you have a file system to act as your card catalog. It divides the one-and-zero soup of your drive into individual volumes and keeps track of where those volumes are. (These are, loosely, the ‘sectors’ of your drive; the exact size of the unit varies by file system type, which I’m going to ignore entirely.)
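For the code-minded, here is a minimal sketch of the card catalog idea in Python. It is a toy, not how any real file system lays things out: the eight-byte sector size, the `store` and `read` helpers, and the file names are all made up for illustration.

```python
# A toy "card catalog": file name -> list of sector numbers holding its data.
# SECTOR_SIZE and the layout are invented for illustration only.
SECTOR_SIZE = 8  # bytes per sector; real drives commonly use 512 or 4096

def store(disk, catalog, name, data):
    """Split data into sector-sized pieces and record where each piece went."""
    pieces = [data[i:i + SECTOR_SIZE] for i in range(0, len(data), SECTOR_SIZE)]
    used = {s for sectors in catalog.values() for s in sectors}
    free = [n for n in range(len(disk)) if n not in used]
    if len(free) < len(pieces):
        raise OSError("library is full")
    catalog[name] = free[:len(pieces)]
    for sector, piece in zip(catalog[name], pieces):
        disk[sector] = piece

def read(disk, catalog, name):
    """Look the file up in the catalog, then fetch its sectors in order."""
    return b"".join(disk[s] for s in catalog[name])

disk = [b""] * 16   # sixteen empty shelves
catalog = {}
store(disk, catalog, "scarne.txt", b"Scarne on Dice")
store(disk, catalog, "gibbon.txt", b"Decline and Fall, vols I and II")
```

Without `catalog`, `disk` is just a list of anonymous chunks; with it, `read(disk, catalog, "gibbon.txt")` can go straight to the right shelves and stitch the book back together.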
Let’s say you buy a brand new flash drive (which is solid-state storage, like an SSD; this discussion of file systems applies to them too). Great! You’ve got a whole empty library to fill up. My groaning bookshelves are envious. Happily, you start copying files onto it. (Scarne on Dice: highly recommended. The Essential Calvin and Hobbes: even more so. Vector Analysis: eh, you’ve got the space. Don Quixote: it’s a classic, right?) Sometimes you’ll get a book that’s too long for one volume (Gibbon I and Gibbon II), and it gets split into multiple sectors. Heck, sometimes you get a whole encyclopedia set and you’ve got three dozen volumes to shelve. Pretty soon your drive is filling up.
Now, if you were to go down and ‘browse the shelves’ so to speak, you could read the titles off of a shelf, and it’d go like this: Scarne on Dice. An Encyclopedia volume (Chicago to Death). Calvin and Hobbes. Gibbon II. Don Quixote. Another encyclopedia volume (Pope to Reformation) and so on. No logical order that you can see, but so long as the catalog knows where the books are it can retrieve them for you and you don’t have to understand that.
Now, let’s say you want to read the encyclopedia cover-to-cover-to-cover. Okay, probably not, but let’s say you’re watching a cat video. If the individual sectors containing cat pixels are stored willy-nilly on your drive then it’s going to take longer to load, right? I mean, you watch the video in sequence; wouldn’t you want the file pieces lined up so that it’s quicker to go from one to the next in order? Well, yeah. That scattered encyclopedia I mentioned up there is a badly fragmented file. When you defrag your hard drive, you line your data up in sequential sectors so that the suspension arm doesn’t have to hunt around to find the next piece of your file. (And note that the argument is all about moving the arm around; defragging doesn’t do anything for you on SSDs.)
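To put rough numbers on the hunting around, here is a small Python sketch. It assumes a made-up cost model where the arm's travel is simply the difference between consecutive sector numbers, and the `defrag` helper only redoes the bookkeeping; a real defragmenter also has to move the data itself. All names and sector numbers are hypothetical.

```python
def head_travel(sectors):
    """Total distance the arm moves when reading sectors in file order."""
    return sum(abs(b - a) for a, b in zip(sectors, sectors[1:]))

def defrag(catalog):
    """Give every file a contiguous run of sectors, in catalog order.

    Bookkeeping only: the actual copying of data is left out of this sketch.
    """
    new_catalog, next_free = {}, 0
    for name, sectors in catalog.items():
        new_catalog[name] = list(range(next_free, next_free + len(sectors)))
        next_free += len(sectors)
    return new_catalog

catalog = {
    "encyclopedia": [1, 9, 4, 14],   # volumes scattered across the shelves
    "calvin": [2],
    "gibbon": [3, 0],
}
before = head_travel(catalog["encyclopedia"])           # 8 + 5 + 10 = 23
after = head_travel(defrag(catalog)["encyclopedia"])    # 1 + 1 + 1 = 3
```

On an SSD there is no arm to move, so a cost model like `head_travel` doesn't apply, which is the whole reason defragging buys you nothing there.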
That’s all well and good, but I’ve got secrets to hide.
How do I hide my secrets? We’ll go back over the library analogy, this time with a view to deleting books. I know Cervantes wrote a classic, but I don’t need to keep a copy of it; I can always get it at my local library (gah! recursion! abort!). When you erase the file (permanent delete, not just trash-can it), your file system doesn’t go to the shelf and toss the book out the window. It 1) forgets that it was there, and 2) lists that space as ready for allocation. Next time you find a book to store (G. Gordon Liddy’s autobiography? How can I say no!) it says “I’ve got a space on the shelf to store that.” It goes to the empty spot on the shelf and it stores it there. No matter if there’s a Spaniard occupying the space already; the hard drive doesn’t care if you’re writing or overwriting data.
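The forget-it-and-free-the-shelf behavior can be sketched in a few lines of Python. This is a toy with one shelf per file; the `delete` and `store` helpers and the lowest-free-shelf-first policy are assumptions for illustration, not how any particular file system allocates space.

```python
disk = [b"Quixote", b"Calvin", b""]              # three shelves
catalog = {"quixote.txt": 0, "calvin.txt": 1}    # name -> shelf number
free = {2}                                       # shelves ready for allocation

def delete(name):
    """'Permanent' delete: forget the entry and mark its shelf as free."""
    free.add(catalog.pop(name))   # note that disk[...] is never touched

def store(name, data):
    """Take the lowest-numbered free shelf and write over whatever is there."""
    shelf = min(free)
    free.remove(shelf)
    disk[shelf] = data
    catalog[name] = shelf

delete("quixote.txt")
leftover = disk[0]            # still b"Quixote": the book never left the shelf
store("liddy.txt", b"Liddy")  # lands on shelf 0, right on top of the Spaniard
```

Between the `delete` and the `store`, the catalog swears the book is gone while the shelf still holds every page.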
And if you’re deliberately trying to lose data? That book is still there. If you were to do the equivalent of an end-of-year audit at the library, going to each sector and figuring out what it holds, you’d find it. “What’s this ‘Fifty Shades of Grey’?” “That’s not mine; I was holding it for a friend!” It’s more difficult if you format the drive; that’s the metaphorical equivalent of burning your card catalog. All the books are still there but you don’t know where to find any of them. A patient investigator could read through them and piece things together.
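Here is a rough Python sketch of what that patient investigator does when the catalog is gone: walk the raw bytes shelf by shelf and look for the tell-tale opening bytes of common file types. The three signatures are the standard ones for JPEG, PNG, and PDF; the `carve` function, the 512-byte sector size, and the fake drive image are assumptions made up for this example, and real carving tools do a great deal more (finding where a file ends, for a start).

```python
# Well-known "magic numbers" that open these file types.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF": "pdf",
}

def carve(raw, sector_size=512):
    """Report (sector number, file type) for sectors that start with a known signature."""
    hits = []
    for offset in range(0, len(raw), sector_size):
        chunk = raw[offset:offset + 8]
        for magic, kind in SIGNATURES.items():
            if chunk.startswith(magic):
                hits.append((offset // sector_size, kind))
    return hits

# A fake four-sector drive image: blank, a PDF header, blank, a JPEG header.
blank = bytes(512)
raw = (blank
       + b"%PDF-1.4".ljust(512, b"\0")
       + blank
       + b"\xff\xd8\xff\xe0".ljust(512, b"\0"))
found = carve(raw)   # [(1, "pdf"), (3, "jpeg")]
```

No catalog was consulted at any point; the books gave themselves away by their first lines.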
That’s why you get something like BleachBit. You wouldn’t want to trouble the FBI with your yoga routines, would you? Programs like BleachBit are designed to not just delete the files, but to then overwrite them with meaningless data in order to foil gumshoes. If the drive has nothing but zeros on it you can’t read any information off of it, right? This, for any never-gonna-be-presidents in my audience, is what it means to wipe a drive.
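The overwrite-then-delete idea looks something like this in Python. This is a minimal sketch and not a description of how BleachBit itself works; `zero_fill` and `wipe` are hypothetical names. It also assumes the file system rewrites data in place, which SSDs and copy-on-write or journaling file systems may not do, so treat it as an illustration rather than a guarantee.

```python
import os

def zero_fill(path, chunk=1024 * 1024):
    """Overwrite a file's contents with zeros, in place, without changing its size."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(bytes(n))      # bytes(n) is n zero bytes
            remaining -= n
        f.flush()
        os.fsync(f.fileno())       # ask the OS to push the zeros out to the device

def wipe(path):
    """Zero the file first, then do the ordinary delete."""
    zero_fill(path)
    os.remove(path)
```

Calling `wipe("yoga_routines.txt")` (a made-up file name) would leave the freed space full of zeros instead of yoga routines for the next audit to find.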
There’s a wrinkle to that. It used to be (at least) that you could pull out a disk platter, put it under a scanning electron microscope, and read what the data was before it was overwritten by observing the areas around where the bit used to be. (Reading each one or zero manually; painfully long and expensive, but in theory possible.) This gives rise to things like the DoD wipe (as it’s usually described: overwrite every spot on the drive with zeros, then with ones, then with random ones-and-zeros, with a seven-pass variant for the extra thorough). These days the individual spaces are so small that you’re safe writing once with just zeros. Probably; if you’re a former Secretary of State with super-secret wedding planning to obscure you should probably take the time to be paranoid.
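A multi-pass wipe is the same overwrite repeated with different patterns. The Python sketch below works on an in-memory `bytearray` standing in for a platter; the `overwrite_passes` helper and the particular pass list are illustrative assumptions, not any official procedure.

```python
import os

def overwrite_passes(buf, patterns):
    """Overwrite buf once per pattern; None means a pass of random bytes."""
    for pattern in patterns:
        if pattern is None:
            buf[:] = os.urandom(len(buf))
        else:
            buf[:] = bytes([pattern]) * len(buf)
    return buf

platter = bytearray(b"super-secret wedding plans")
overwrite_passes(platter, [0xFF, 0x00, None])   # all ones, all zeros, then random
overwrite_passes(platter, [0x00])               # the modern single pass of zeros
```

Whatever the pass list, only the last pass decides what is left on the platter; the earlier ones exist to scrub the faint traces an old-school microscope hunt would look for.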
Lastly, the fun stuff. Physical destruction of the data. Drill a hole through your drive. Take it out back and hit it a couple times with an axe. Claw hammers are handy too. I recommend taking it to the range and using it for target practice; target practice is always fun and useful. Don’t try wiping it with a honking big magnet; the drive casing is designed to shield the stuff inside (and outside) from magnetic interference, and you’ve got a VIP (very VIP)’s name to hide. Don’t want to chance that.
Let’s say that you just put a drill through your drive, without deleting any data. Could someone read that? The sections underneath the drill bit are going to be pretty well thrashed, and the platters are often made of glass, so expect to shatter all of ’em. You ain’t gonna spin that drive up again. But a sufficiently motivated organization could, in theory, piece the shards together and read the physical ones and zeros on the drive with an electron microscope. So wipe your disk. Then shoot it. Then grind it into dust, and use the dust to cut your cocaine stockpile. Wait, not that last thing. Before you go all out on your paranoia though, remember that the cost to recover that data is higher than the value of anything short of the Colonel’s eleven herbs and spices.
If you want to learn more about the subject, I highly recommend this half-hour video about data carving. That’s the term for going through a library where someone’s lost the catalog. I’m a big fan of Paul’s Security Weekly; entertaining and interesting stuff. Doug White’s Secure Digital Life series is pretty great too; I’m behind and need to catch up. In the meantime, join us fortnight next for “A Practical Lesson in Applied Deconstruction” or “Therapeutic Uses for Duct Tape.”
This is part twenty-seven of my ongoing series on building a computer, the Lizard People way. You may find previous parts under the tag How to Build a Computer. This week’s post has been brought to you by the Illuminati! The Illuminati would like to remind you that there’s no such thing as them. Fnord!