RAID Failure – Does It Happen Often?

RAID Failure Hurts!

RAID was developed to protect data against drive failure just as much as it was developed to improve performance, so not many people want to discuss the possibility of RAID failure!

With multiple drives sharing the workload and mirroring each other, your data remains accessible on another drive when one drive fails, and the performance of the computer improves as well.

Of course, as with anything made by humans, errors can happen and the entire RAID array can fail. It is not common, but when it happens, it can grind everything to a halt.

Does RAID Fail?

Of course it fails. Many people believe that RAID failures do not happen because of the built-in fault tolerance and the option to rebuild. Anything can fail, and RAID is no different. It may not happen often, but given the laws of statistics, it is impossible for it never to fail.
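To put a rough number on "the laws of statistics", here is a minimal sketch of the chance that at least one drive in an array fails within a year. It assumes every drive fails independently with the same annual failure rate. The function name and the 2% rate are illustrative assumptions, not figures from any vendor or from this article.

```python
def prob_any_drive_fails(num_drives: int, annual_failure_rate: float) -> float:
    """Chance that at least one of num_drives fails within a year.

    Assumes drives fail independently and share the same annual
    failure rate (a simplification; real drives are often correlated).
    """
    if num_drives < 0:
        raise ValueError("num_drives must be non-negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # P(at least one fails) = 1 - P(every drive survives the year)
    return 1.0 - (1.0 - annual_failure_rate) ** num_drives


if __name__ == "__main__":
    ASSUMED_AFR = 0.02  # hypothetical 2% annual failure rate per drive
    for drives in (1, 4, 8, 24):
        chance = prob_any_drive_fails(drives, ASSUMED_AFR)
        print(f"{drives:2d} drives: {chance:.1%} chance of at least one failure per year")
```

The point of the sketch is only that the chance of seeing some drive fail grows with every drive you add, which is why a larger array needs more attention, not less.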

Scenarios When RAID Failure Happens

There are several scenarios that may result in RAID failure. Understanding these scenarios can help prevent you from suffering catastrophic consequences of potential RAID failures.

  • A RAID array typically depends on a single controller, which makes that controller a single point of failure for the whole array.
  • If there is a power surge, a controller or several disks in the array may fail, and that could cause the loss of some or even all of your data. A power surge can also corrupt the RAID configuration stored in the NVRAM of the controller card.
  • If a single hard drive fails and there is no hot standby, the RAID array will begin to run in degraded mode. It can sometimes take a day or two to order and install a new drive, and that increases the possibility of another drive failure, which would in turn disable the whole RAID array.
  • RAID configurations with fault tolerance protect against physical failure, but not against system-wide corruption, virus infection, inadvertent deletion by end users, or system administrator error.
  • When a faulty drive is replaced and the RAID volume is rebuilt to return it to a healthy state, procedures can be performed out of sequence. The result can be a partial rebuild, or a system breakdown once the rebuild is finished and the RAID is re-activated.
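The degraded-mode scenario above can be estimated with a short calculation. The sketch below assumes the surviving drives fail independently at a constant rate derived from an annual failure rate. The function name, the 2% rate, and the drive counts are hypothetical values chosen for illustration.

```python
import math


def prob_second_failure(remaining_drives: int,
                        annual_failure_rate: float,
                        window_days: float) -> float:
    """Chance that at least one surviving drive fails during the
    replacement window while the array runs in degraded mode.

    Assumes independent drives with a constant failure rate.
    """
    if remaining_drives < 0 or window_days < 0:
        raise ValueError("remaining_drives and window_days must be non-negative")
    if not 0.0 <= annual_failure_rate < 1.0:
        raise ValueError("annual_failure_rate must be in [0, 1)")
    # Turn the annual failure rate into a constant yearly hazard rate.
    hazard_per_year = -math.log(1.0 - annual_failure_rate)
    window_years = window_days / 365.0
    # Chance that one specific drive fails inside the window.
    p_one = 1.0 - math.exp(-hazard_per_year * window_years)
    # Chance that at least one of the remaining drives fails.
    return 1.0 - (1.0 - p_one) ** remaining_drives


if __name__ == "__main__":
    ASSUMED_AFR = 0.02  # hypothetical 2% annual failure rate per drive
    for days in (1, 2, 7):
        chance = prob_second_failure(7, ASSUMED_AFR, days)
        print(f"{days} day(s) degraded, 7 drives left: {chance:.3%}")
```

In practice, drives in the same array often share age, batch, and workload, and a rebuild puts extra load on the survivors, so the independence assumption probably understates the real risk. The takeaway is simply that a shorter replacement window means less exposure, which is the argument for keeping a hot standby.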

What to Do If There Is a RAID Failure

If you do suffer a RAID failure, you need to follow the right steps immediately to protect your data. The first thing to do is shut off the array. The longer the array runs in degraded mode, the greater the danger of further damage.

Unless you know exactly what you are doing, never attempt to perform any physical repairs on the system. Opening a hard drive safely requires a clean room. Dust that reaches the inside of the drive can cause irreparable damage to the disk you are working on, and a single static shock can damage the drive's electronics and put your data out of reach.

If ever in doubt, the best course of action is to contact a RAID data recovery specialist to handle the repair of the system and the retrieval of your files.

Whether your data can be recovered depends on the cause of the problem in the RAID implementation and on how severe the problem is. In most cases data can be recovered, but it will take a professional with specialized software tools to do it.

Always make notes about what happened before the failure, including anything unusual with your computer (noises, noticeably slower performance, etc.). Any information you can provide to the RAID data recovery company will help in the repair of the array.

If you have a RAID array, it probably won’t fail, but if it does, you can minimize the potential damage by remembering these steps and responding to the failure properly.