Listen to the podcast: Your hard disk is more likely to fail than you think.
Transcript
This is Leo Notenboom for askleo.info.
Several weeks ago Google released a paper detailing an analysis of the consumer-grade hard drives they use in their data centers. As you can imagine, Google has a lot of hard drives.
They were looking at what could be learned from hard disk failure rates.
One surprising result was that hard disks' Self-Monitoring, Analysis and Reporting Technology, or "SMART" as it's known, could not be used to accurately predict when or if a drive was about to fail. Drives reporting SMART errors often lasted for years, while other drives failed without any SMART diagnostic information prior to the failure.
Much more important, though, are what I think are some very scary numbers about hard drive failure rates. For example, for drives older than 2 years, Google reports seeing about a 7% failure rate per year. Put another way, about one out of every 14 drives will fail within a year.
That's way higher than I would have predicted.
It also makes me very nervous.
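To put rough numbers on that 7% figure, here is a minimal Python sketch of the arithmetic. It is illustrative only: the 7% rate is the one quoted above for drives older than 2 years, while the function names and the idea of treating every drive and every year as independent with a constant rate are assumptions made for this example, not something taken from Google's paper.

```python
# Illustrative arithmetic only. The 7% annual failure rate is the figure quoted
# in the transcript; a constant rate with independent drives and years is an
# assumption made for this sketch.

ANNUAL_FAILURE_RATE = 0.07


def drives_per_failure(rate: float) -> float:
    """Return N such that roughly one drive in N fails per year."""
    return 1.0 / rate


def chance_of_any_failure(rate: float, drives: int = 1, years: int = 1) -> float:
    """Probability that at least one of `drives` fails within `years`."""
    # Chance that every drive survives every year, under the independence assumption.
    survive_all = (1.0 - rate) ** (drives * years)
    return 1.0 - survive_all


if __name__ == "__main__":
    n = drives_per_failure(ANNUAL_FAILURE_RATE)
    print(f"About 1 in {n:.0f} drives fails per year")
    for years in (1, 3, 5):
        p = chance_of_any_failure(ANNUAL_FAILURE_RATE, drives=1, years=years)
        print(f"One drive over {years} year(s): {p:.0%} chance of failure")
```

Under these assumptions the risk compounds: the longer a drive is in service, or the more drives you own, the more likely it is that at least one of them fails.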
Disk drives are cheap, but the cost of replacing one can be enormous. For example, unless you're doing true full backups, a sudden failure means you're going to have to reinstall and reconfigure your operating system and all the applications you had on the failed drive. If you've been backing up your data you may not experience data loss, but you'll definitely lose a chunk of time for the rebuild.
And if you haven't even been backing up your data - well, you've got a serious problem.
There are several possible approaches to minimizing the risk of a hard drive failure, but for the average consumer nothing, and I mean nothing, can replace a good full backup strategy.
In fact, after hearing these new statistics, it's a change I made myself. In the past I'd been backing up my important data, of course, but not my operating system and applications. As of earlier this week I now do a nightly backup of the entire hard disk on my primary computer using Acronis True Image Home.
Regular listeners and readers of Ask Leo! will know that I've always stressed the importance of backing up. Google's latest report only makes me even more convinced that disaster prevention isn't just a good thing, it's a requirement.
Check out the show notes for links to Google's whitepaper and to a similar study performed at Carnegie Mellon University with similar results. I've also linked to an episode of the highly recommended Security Now podcast with Steve Gibson and Leo Laporte which covers this issue as well.
I'd love to hear what you think. Visit askleo.info and enter 11293 in the go to article number box to access the show notes and to leave me a comment. While you're there, browse over 1,000 technical questions and answers on the site.
Till next time, I'm Leo Notenboom, for askleo.info.
Related:
Google's Report (pdf)
Security Now Episode 81 - Hard Drive Unreliability
Article C2973 - March 24, 2007
7%? That is much higher than we have. We've been lucky, I guess.
Posted by: Louis Benn at March 26, 2007 3:58 AM
1 failure in 3 years. :)
7%: I have had # failures in the last year. One lasted only 4 months. I do full backups all the time.
Posted by: Gerald Malcolm at March 30, 2007 7:09 PM
Acronis True Image can make image backups either from within Windows or outside Windows running from a bootable CD. I would only trust an image backup made from outside Windows, but that's me. The entire True Image v8 manual, however, was geared to backups from within Windows.
Recently I tried to make True Image backups from a boot CD on a machine without a PS/2 port. Despite the boot CD being Linux-based, both v7 and v8 failed to detect the USB keyboard.
But my biggest gripe with True Image is that it cannot make a true image. That is, it is incapable of doing the basic job of an imaging product: making a sector-by-sector copy. For details see http://www.computergripes.com/trueimage.html
Posted by: Michael Horowitz at March 30, 2007 9:03 PM
I think Google probably runs their hard discs pretty hard; that way they get a better ROI. I doubt any of us are running HDs 24/7. From an HD's point of view you could easily think of 2 years' work at Google as the equivalent of 5 years in a home or office environment.
Posted by: Gideon at March 30, 2007 11:39 PM
I just listened to the podcast on hard drive failure. Expensive, yes, to the tune of about $465. I wish I had known this sooner. Thank you for your newsletter. I'll keep reading!
Posted by: Ann E. Hynes at March 31, 2007 8:09 PM
Bought a new HP Pavilion notebook and the hard disk died after only 5 months of use. HP shipped me a new one, but I wasted 3 days rebuilding everything. Like Leo, I now use Acronis True Image to back up my hard disk, so the next time my disk crashes or is trashed I'll save those wasted 3 days.
Posted by: Ken Crook at March 31, 2007 10:51 PM
I've had one failure in the 15-20 years that I have owned a computer. That drive did fail during its first year.
Posted by: Vicki Williams at December 29, 2007 4:42 AM
I teach computer maintenance, & I always stress that the reason for making backups is NOT "in case your hard drive fails" because I can guarantee that one day it WILL fail & that day could be as soon as the day after you bought the PC. If you change your mindset to "backup for WHEN it fails" you are much more likely to do regular backups. Incidentally, I can recommend DriveImage XML for "hot" system drive backups. I used to use PowerQuest DriveImage (no connection) but it was always breaking after MS system updates. The XML version has been rock solid.
Posted by: berny marsden at January 19, 2009 10:04 AM
All the comments refer to stopping an external drive. There was a reference to the internal drive always spinning when the PC is running. Not so. In the "Power" pull-down you can set the internal drive to shut down after any number of minutes, or never if you wish. Much of the data being entered appears to go to another place, either RAM or maybe a cache. When needed, the indicator LED shows the internal drive being reactivated, and the data is then written to it. It may not save much electricity, but would it not extend bearing life?
Posted by: Darius Dinshah at January 22, 2009 12:38 PM
S.M.A.R.T. technology uses threshold values to estimate the health status of a hard disk. The estimation of a failure date is like a trend estimation of how an attribute value will change based on its past values. It's just a statistical algorithm.
Posted by: Serge at February 8, 2009 1:50 PM