RAID is a valuable technology for improving apparent disk speed and fault tolerance, but it is in no way a replacement for backing up.
Do you think RAID 1 is a viable alternative for backing up?
No.
In fact...
No. No. No. No.
And NO!
RAID is not a backup and no RAID array should ever be considered a replacement for backup.
I'll review what RAID is, and most importantly, what it is not.
RAID stands for Redundant Array of Inexpensive Disks.
It can be used to improve two things: fault tolerance and apparent disk speed.
RAID 1 (which is what you're asking about) uses what's called "mirroring" to improve the reliability - or more correctly, the fault tolerance - of a disk drive. The two drives appear as a single device. Whenever data is written to the logical drive that your operating system sees (perhaps C:), that data is simultaneously written to both physical drives by the RAID controller.
Should either one of the drives fail, the other is still present and available. The RAID controller will run in single-drive mode until the failed drive is repaired or replaced. Some RAID controllers actually allow this to happen without powering down at all.
Throughout all of this, the logical drive that you see (i.e. C:) continues to work.
The system as a whole is now more tolerant of drive failure - a physical drive can actually fail completely and the system can keep on running.
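To make the mirroring idea concrete, here's a minimal sketch in Python. It's a toy model, not how any real RAID controller is written: the `MirroredVolume` class and its method names are made up for illustration, and each "physical drive" is just a dictionary.

```python
class MirroredVolume:
    """Toy RAID 1: one logical drive backed by two physical drives."""

    def __init__(self):
        # Each hypothetical "physical drive" maps block number -> data.
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every logical write goes to every healthy physical drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block):
        # Read from the first drive that is still healthy.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block]
        raise OSError("all drives in the mirror have failed")

    def fail_drive(self, index):
        # Simulate a physical drive dying and losing its contents.
        self.failed[index] = True
        self.drives[index] = {}


volume = MirroredVolume()
volume.write(0, b"hello")
volume.fail_drive(0)        # one physical drive dies
survivor = volume.read(0)   # the logical drive still answers
```

After the simulated failure, the read is served from the surviving drive, which is all that "fault tolerance" means here.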
RAID 0 uses what's called "striping" to improve the apparent speed of your hard disk. Striping uses techniques that vary from RAID controller to RAID controller to spread your data across the two (or more) physical hard drives. Once again, they are combined transparently by the RAID controller to look like a single drive, perhaps your C: drive.
The increase in speed comes from the fact that hard disk head movement and rotation speed both limit the rate at which data can be retrieved from a single drive. With your data spread across two drives, both can be reading at the same time. For example, by alternating every other sector of your data across two physical drives, the apparent data rate can theoretically be doubled.
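Here's a rough sketch of the "alternating sectors" idea in Python. The function names `stripe` and `unstripe` are my own, and real controllers vary in how they lay data out; this only shows the round-robin distribution and how the pieces go back together.

```python
def stripe(blocks, drive_count=2):
    """Deal blocks out to the drives in turn, like dealing cards."""
    drives = [[] for _ in range(drive_count)]
    for i, block in enumerate(blocks):
        drives[i % drive_count].append(block)
    return drives


def unstripe(drives):
    """Reassemble the original order by taking one block from each drive in turn."""
    total = sum(len(drive) for drive in drives)
    count = len(drives)
    return [drives[i % count][i // count] for i in range(total)]


sectors = ["s0", "s1", "s2", "s3", "s4"]
drives = stripe(sectors)
# drives[0] holds s0, s2, s4 and drives[1] holds s1, s3
restored = unstripe(drives)
```

Because each drive holds only about half of the sectors, the two drives can work on their halves at the same time.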
Important: RAID 0 should never actually be used as it reduces fault tolerance, almost doubling your risk of hard drive failure. If either of the two drives fails, then the entire logical drive will have failed. I use it here as an example of a basic RAID technique, which can be built upon to mitigate that increased risk as we'll see shortly.
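The "almost doubling" figure can be checked with a little arithmetic. This sketch assumes that each drive fails independently with the same probability over some period; the 5% figure is purely hypothetical, picked only to make the numbers easy to read.

```python
def striped_failure_probability(p_single, drive_count):
    """Chance that at least one drive fails, which takes down a striped array."""
    return 1 - (1 - p_single) ** drive_count


p = 0.05  # hypothetical chance that a single drive fails in a given period
raid0 = striped_failure_probability(p, 2)
ratio = raid0 / p
```

With these assumptions, the two-drive striped array fails with probability 0.0975, or 1.95 times the single-drive risk: not quite double, because the case where both drives fail is only counted once.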
The two techniques that I've discussed can be combined in various ways, if you add additional drives.
A common technique uses both redundancy of data across multiple drives and distribution of data across multiple drives to achieve both improved speed and improved fault tolerance.
Consider this equation:
A + B = Z
Let's think of A and B as our data (we can also think of them as bytes or sectors - it doesn't matter), and we'll call Z a checksum.
A, B, and Z are each placed on separate hard drives. These three drives together are managed by the RAID controller to look like a single drive.
When you write data to the drive, A and B each get written to their separate drives; the RAID controller calculates A+B and writes that to the third drive as Z.
Why'd we do all that?
If a drive fails (and it could be any of the three drives), whatever was on it can be re-calculated from the remaining two. The RAID controller can do this so that your system can continue running until the failed drive has been replaced. This gets you the fault tolerance that I discussed as characteristic of RAID 1.
Your data is spread across two drives - A and B. This allows the RAID controller to stream your data off of those two drives; this simultaneously gets you the speed improvement of a RAID 0 configuration.
Best of both worlds.
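The A + B = Z idea can be sketched in a few lines of Python. This follows the addition in the equation above, done byte by byte and wrapped at 256 so that each result still fits in a byte; that wrap-around is my own assumption to keep the example tidy. Real parity RAID implementations typically use XOR rather than addition, but the recovery logic is the same in spirit. The function names are made up for illustration.

```python
MOD = 256  # keep every checksum value within one byte (an assumption of this sketch)


def checksum(a, b):
    """Z = A + B, computed byte by byte."""
    return bytes((x + y) % MOD for x, y in zip(a, b))


def recover(known, z):
    """Rebuild the missing data drive: missing = Z - known."""
    return bytes((s - k) % MOD for s, k in zip(z, known))


a = b"DATA"           # contents of the first data drive
b = b"MORE"           # contents of the second data drive
z = checksum(a, b)    # contents of the third drive

recovered_a = recover(b, z)   # the drive holding A failed
recovered_b = recover(a, z)   # the drive holding B failed
rebuilt_z = checksum(a, b)    # the drive holding Z failed: just recalculate it
```

Whichever one of the three drives is lost, the other two are enough to rebuild it.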
Naturally, I've oversimplified, and indeed, there are many ways to configure RAID arrays, but these are the fundamental concepts that pretty much apply across the board.
You might be tempted to look at RAID 1 and say, "Hey, my data is on two drives. That's backed up, right?"
Nope.
Your data is on one drive: C:. Yes, you might be more tolerant of a hard disk failure, and that's a nice thing, but it's not a backup.
If your system is infected with a virus, RAID won't give you a clean copy to restore from, as a backup can.
If you accidentally delete a file, you won't be able to restore it from a RAID array, as you can from the most recent backup.
If your system goes up in flames, a RAID array is not going to be a copy of your data safely stored elsewhere - like a backup could be.
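The accidental-delete case is easy to illustrate. In this toy Python sketch, the mirror is two dictionaries that always receive the same operations, and the backup is a separate copy taken earlier; the file name and the `delete_file` helper are hypothetical.

```python
import copy

# Two mirrored "physical drives" behind one logical drive.
mirror = [{"report.txt": "v1"}, {"report.txt": "v1"}]

# A backup is a separate copy taken at some earlier point and kept elsewhere.
backup = copy.deepcopy(mirror[0])


def delete_file(mirror, name):
    # The delete is just another write, so it is faithfully applied to every drive.
    for drive in mirror:
        drive.pop(name, None)


delete_file(mirror, "report.txt")

gone_from_mirror = all("report.txt" not in drive for drive in mirror)
still_in_backup = "report.txt" in backup
```

The mirror dutifully deletes the file from both drives, while the separate backup still has it.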
In general, there are two great rules of thumb for backups that you can apply to any backup approach:
A backup should never be kept on the same machine. Technically, external drives actually violate this rule, but they're at least a separate physical box which removes some of the major concerns relating to this rule.
A backup should never be on the same drive as the thing being backed up. By drive here, I mean logical drive (C: for example) regardless of how many physical drives that might actually be "under the hood." The reason is simple: software (and users) operate at the logical drive level. If you accidentally instruct your computer to delete all of the files on your drive (don't laugh, it happens more often than you think - and it has happened to me), that would then delete both the original and backup. A virus, software bug, or any number of other scenarios could produce the same results. And, of course, if the drive fails - be it a single drive, as is most common, or the RAID controller controlling several physical drives - then the backup is once again lost with the original.
Relying on RAID 1 as some kind of backup violates both of these rules.
RAID is an important technology that can deliver both speed and fault tolerance. Most higher-end servers, including the server hosting the Ask Leo! site, use some form of RAID for one or both of those purposes.
But don't confuse it with a backup. Having RAID does not impact your need for proper backups.
Article C4372 - July 15, 2010
I work in a DNA sequencing environment. The demand for storage capacity for the latest high throughput data is a bottleneck for our field. I am a newbie trying to learn the issues. I hear people comment on RAID 4, 5 and 6. I hear statements like disaster resistant (not good) and disaster tolerant (better). Do all your comments on RAID 1 apply to RAID 4 through 6?
Posted by: Patrick Leahy at July 20, 2010 9:12 AM
I think that the question was too broad to answer. For example I believe that a RAID 5 NAS system is an excellent choice for a backup - assuming that it is being used as a backup. Your no, no, no, is an incorrect response to a very broad question. Granted I agree with you completely that RAID 1 is not a backup. However a backup using RAID 5 in a free standing NAS unit (that can even be in a different physical location) is a hard to beat choice.
Posted by: Larry, Canton, Ohio at July 20, 2010 10:06 AM
Larry, Canton, Ohio, the set-up you describe doesn’t use a RAID backup. The backup is the NAS (if it is set-up that way, not just used as extra storage) and the NAS uses RAID to make the backup more reliable through redundancy and to minimise downtime if a disc fails. The RAID itself isn’t backup.
Posted by: Paul Higgins at July 20, 2010 12:17 PM
I have a RAID, how can I tell whether it is RAID 0 or RAID 1? How can you get rid of it?
Posted by: Gary Daniel at July 21, 2010 12:57 PM
If I am setting up a raid to my existing computer will I have to erase my hard drives to do that ?
Posted by: craig at August 14, 2010 8:44 AM