Basic bad hard drive recovery

I get asked about this all the time. People who are considering paying for expensive data recovery ask me how I’m able to pull the majority of their data from a drive that sounds like a prop from Short Circuit. A lot of the time you hear something along the lines of ‘I don’t care what happens, I just need all the Word documents and picture files’, and a lot of the time I can provide more than they expected.

So, what’s the deal? Firstly, you should always pull any drive that is making noises. Secondly, before you do, confirm that the noises aren’t coming from your CD/DVD drive (yes, that’s happened). Once you have the drive you must keep it cool: a cheap house fan on its highest setting, pointed at a drive that sits on a solid surface or is otherwise secured against vibration, will do the trick. I’ve even directed the output from an air-conditioner onto a drive to keep it cool. I don’t have an expert analysis of why keeping it cool helps, but I believe the cold helps with the very tight internal tolerances, which can stop the moving parts from functioning once they heat up. In summary, do whatever you need to keep the drive cold and make sure it doesn’t vibrate when you start running it.

You’ve pulled the drive as soon as it started clicking, or a friend gave it to you and made strange shapes with their face to explain the noise it was entertaining their dog with. You know something is wrong because the system doesn’t boot from it; it does start, but then crashes and blue screens or kernel panics. Now we need the data from it, as much as the drive is able to give you; for this we use GNU ddrescue.

Getting the most data possible with GNU ddrescue

Plug your drive into your computer while keeping it cold and vibration free, and start up a console. Firstly we’ll pull all the data from it in sectors without retrying; this means that each time it hits an error from the drive it’ll carry on. GNU ddrescue has an excellent additional feature here: it logs the sectors where it gets an error. Once you have the sectors with easy-to-get data you can concentrate on the harder ones and pull what you can from those. Reading the damaged areas sometimes seems to trigger the failure of the hard drive, so it’s sensible to get the data that you can get from the drive as quickly as possible. To a certain extent with drive recovery you’re always fighting time and the statistical chance of another failure compounding the problem into an irrecoverable mess. To recap: we’ll grab the data we can get easily, then return for the harder data using different trimming and access modes. This is not a quick operation; the data recovery time-lines I usually give are measured in days, not hours.

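Pull as much data as possible (recent ddrescue releases spell some of the long options differently, for example --no-scrape, --idirect and --retry-passes; the short forms -n, -d, -r and -M work either way):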
ddrescue --no-split /dev/sdb sdb.iso sdb.log

Return for the missing data. This time we use direct disc access and home in on the remaining sectors (everything already read is skipped, thanks to the log file):
ddrescue --direct --max-retries=3 /dev/sdb sdb.iso sdb.log

Still missing data? Retry with all the failed blocks marked as non-trimmed, which forces ddrescue to go over them in full again (sometimes this recovers data that the last three passes couldn’t):
ddrescue --direct --retrim --max-retries=3 /dev/sdb sdb.iso sdb.log
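
Before deciding whether another pass is worth the wear on the drive, it can help to total up what the log file says. The following is only a sketch and not part of ddrescue: summarise_log is a hypothetical helper name, and it assumes the usual log layout of comment lines starting with #, one current-position line, then one line per block giving position, size and a status character (+ for rescued, - for bad sectors, anything else still waiting to be tried, trimmed or split).

```shell
# Hypothetical helper (not part of ddrescue): total the bytes per status
# recorded in a ddrescue log file such as sdb.log.
summarise_log() {
    rescued=0
    bad=0
    pending=0
    seen_position_line=0
    while read -r pos size status rest; do
        case "$pos" in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        # The first non-comment line records the current position, not a block.
        if [ "$seen_position_line" -eq 0 ]; then
            seen_position_line=1
            continue
        fi
        # Sizes are written in hex (0x...), which shell arithmetic understands.
        case "$status" in
            '+') rescued=$((rescued + $size)) ;;
            '-') bad=$((bad + $size)) ;;
            *)   pending=$((pending + $size)) ;;
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
}
```

Run it as summarise_log sdb.log. The figures are in bytes; if pending is zero, only the bad sectors are left for further retry passes to work on.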

At this point, if you don’t have all the data, you need to start thinking about cool-down periods: the drive has been running for several hours straight and it probably needs some rest. If you’re doing this from a dedicated computer, shut everything down and unplug the power cables. Let all the power drain from the capacitors, call your client (if you have one) to tell them the status, and let everything sit unpowered for several hours; I would recommend a full 12-hour stretch as a minimum. Sleep on it, and try to remember any backups that might cover the missing data.

Now we’re probably at least on day two; your drive has sat cold for several hours and you’re ready to go again. Make sure the drive will stay cool and free of vibration. This is usually where you start getting desperate: the data doesn’t seem to be coming back, but it still might. Let’s try it again, retrimming with direct access:
ddrescue --direct --retrim --max-retries=25 /dev/sdb sdb.iso sdb.log

Hopefully you have all your data now; if you don’t, most people would deem the rest irrecoverable (at the software level).

I’m still missing data!
Really, this data should be written off as gone. If you’re in a business situation then hopefully you’ve got some risk transference in place and can start the procedure for insurance, or for firing random members of someone else’s team for not taking backups. In a home situation you need to be looking at backup solutions; there are plenty, and some are free.

The data you have may be enough to rebuild everything; your file system and OS may cope with very small losses without much more intervention (see the next section).

So now we’re talking last-resort data recovery. Plenty of people have advice, ranging from freezing the drive to tapping it. It’s even possible to seal your drive in an anti-static plastic bag and put it in the icebox of an empty fridge; just be careful with any frost or ice that forms, because hard drives and water don’t play especially well together. Another thing reported to help is trying different positions, such as turning the drive upside down or on its side.

Whatever last-ditch attempts you choose, you should be running ddrescue with infinite retries while you make them:
ddrescue --direct --max-retries=-1 /dev/sdb sdb.iso sdb.log

I have all the data I can get, where next?
I’ll discuss recovering data from the image at a later date, but here’s a quick and dirty ‘will it boot’ method: write the image to a new, working drive so that people can pull their data back off it.

Copy (dd) the image onto a new drive (double-check the target device name first, because dd overwrites it without asking):
dd if=sdb.iso of=/dev/sdc
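
It can be reassuring to confirm that the new drive really holds the same bytes as the image. The sketch below is one way to do that, assuming GNU cmp (for its -n byte limit); verify_copy is a hypothetical helper name, not a standard tool. It compares only as many bytes as the image contains, since the new drive is usually larger than the old one.

```shell
# Hypothetical helper: check that the start of the target matches the image.
# Returns 0 when the first <image size> bytes are identical, non-zero otherwise.
verify_copy() {
    image=$1
    target=$2
    # Size of the image in bytes; the arithmetic expansion strips any padding
    # that some wc implementations print around the number.
    size=$(($(wc -c < "$image")))
    # Assumes GNU cmp: -n limits the comparison to the first $size bytes.
    cmp -n "$size" "$image" "$target"
}
```

Called as verify_copy sdb.iso /dev/sdc, it prints nothing and returns success when the copy matches. It is probably best run after the drive has been unplugged and reconnected, so that the comparison reads from the disk rather than from the system’s cache.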

When the copy is done, give the drive time to spin down, make sure nothing is mounted, then unplug it and replug it to ensure everything is read back correctly. Unmount anything that mounts automatically (or better yet, have auto-mount turned off). Now list your file systems and go through them with fsck one by one to confirm that they are usable. For example:
fsck.msdos /dev/sdc1
fsck.ext3 -f /dev/sdc2
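
The ‘make sure nothing is mounted’ step is easy to get wrong when a desktop mounts things behind your back, so here is a small sketch of a check to run before fsck. mounted_parts is a hypothetical helper name; it assumes the Linux /proc/mounts table and sdX-style device names where partitions are the disk name plus a number.

```shell
# Hypothetical helper: list mounted file systems that live on the given disk.
# $1 is the disk (e.g. /dev/sdc); $2 is the mount table, /proc/mounts by default.
mounted_parts() {
    disk=$1
    table=${2:-/proc/mounts}
    # Field 1 is the device and field 2 the mount point. Match the disk itself
    # or the disk name followed only by partition digits.
    awk -v d="$disk" '
        $1 == d || (index($1, d) == 1 && substr($1, length(d) + 1) ~ /^[0-9]+$/) {
            print $1 " on " $2
        }
    ' "$table"
}
```

If mounted_parts /dev/sdc prints anything, unmount those entries before running fsck; empty output means nothing from that drive is mounted.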

Put the drive in the target system and boot. Actually, if you want to be forensically correct you should pull all the important data off the drive before booting, but that’s a subject for another day.

I’ll be back with more information on what to do with your recovered image and what to do if the system won’t accept it as a valid partition.

Regards, Robert.
