Need advice replacing broken RAID drive

I have a computer which is set up in a RAID 1 configuration (mirrored).

The master drive is fine, but the Intel RAID software says that the recovery drive is dead.

I plan on buying a new drive that is the same size as the master drive.

I assume that I should

1. Pull the dead drive
2. Boot the computer on the master drive to make sure that I pulled the correct drive (probably not labelled - not set up by me)
3. Back up the precious data on the master drive (backup runs nightly - but will get a fresh one just in case).
4. Clone the disk (it has proprietary software on it that I cannot reinstall myself if I make a mistake)
5. Install the new recovery drive

---

This is where I need some help.

The master drive is partitioned into 4 parts.

Will I have to partition the new drive to match the existing drive?

Or will the Intel RAID software be smart enough to mirror the whole master drive and make the recovery drive match?

Any insight or suggestions are welcomed.

I'd suggest making a backup before you do ANYTHING else. Like, you should be doing it right now, to another computer on your network, not waiting for the new drive to show up. RAID is not a backup system... it protects against downtime from drive failure, but there are a zillion kinds of data loss it does nothing to prevent. So the very first thing you should be doing, before changing or fiddling with anything, is getting a good backup somewhere safe. That means NOW, even before you finish reading this post. It'll still be here when you get back, but your data may not be. Go. Shoo. Come back and finish this once you've got a backup started.

Once you've done that, how to proceed really depends on the software, and I'm not familiar with that kind of softraid -- I either use Linux mdraid or hardware cards, personally. First you have to determine which drive is dead. Your idea of boot-testing to find out which drive is bad is NOT a good one. Even if the RAID controller marks a drive as 'failed', it will probably still boot the system successfully. If you test that way, you could very easily end up choosing the bad drive as your master, and wiping out your good disk. You need to identify exactly which drive is bad, with zero guesswork.

There's probably no way to flash a light on the dead drive (you get that with the hardware cards and enclosures), but what you can probably do is look in the Intel software to see which number port the dead drive is attached to. Once you've got that number, crack out the motherboard manual and see if you can figure out which port that is on the motherboard. Be aware that the numbering in the Intel software will almost certainly count only Intel controllers, and many current boards will come with additional SATA ports from other companies, like Marvell or JMicron. So Intel Port 2 may be, say, Motherboard Port 4. Hopefully the motherboard manual will be explicit enough so that you can narrow it down exactly.

Once you know what port it's plugged into, trace that cable, and use a Sharpie to put an X on the dead drive, so you're never confused about it later. Then (and you have your backup, right?) test to be sure that the computer will boot with only the non-Xed drive connected. Once you're sure it boots that way, swap in the new drive, and try to tell the Intel software to use it as a new spare.

The way most RAID systems work is that the RAID volume is 'under' the partitions. That is, you mirror a drive, and then you partition the RAID volume that's exposed to you by the software. So because the partitions are layered on top of the mirror, theoretically you should just be able to add the new drive, and have the Intel software resync it, and then everything will just work.
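To make the layering concrete, here is a toy sketch in Python. It assumes the mirror simply copies raw blocks from the surviving drive, which is how block-level RAID 1 generally behaves; the function name and the tiny "disk" are made up for illustration and say nothing about how the Intel software is actually implemented.

```python
# Toy model only: a block-level mirror copies raw blocks and never looks at partitions.
# resync() and the 13-byte "disk" below are hypothetical, for illustration.

def resync(source_blocks, target_size):
    """Rebuild a mirror member by copying every block from the surviving drive."""
    if target_size < len(source_blocks):
        raise ValueError("replacement drive is smaller than the surviving member")
    target = bytearray(target_size)                  # blank replacement drive
    target[:len(source_blocks)] = source_blocks      # raw copy, block for block
    return target

# Byte 0 stands in for the partition table, the rest for four partitions' data.
surviving = bytearray(b"T" + b"A" * 3 + b"B" * 3 + b"C" * 3 + b"D" * 3)
replacement = resync(surviving, target_size=len(surviving))
```

In this model the partition table is just more data to the mirror, so it arrives on the replacement along with everything else and nobody has to partition the new drive by hand.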

With flexible RAID software, it's possible to partition drives directly, and then RAID the individual partitions, but that's the wrong way to do it, and the Intel software probably doesn't even expose that functionality. There are a few weird cases where that might be useful, but they're so infrequent that consumer-oriented RAID will probably hide that possibility from you completely.

Ok, go check to be sure your backup is running.
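If you want more than "the job says it ran", one option is to compare checksums between the live data and the backup copy. This is a minimal sketch, assuming the backup is a plain file-for-file copy you can browse (not a proprietary archive format); the function names are my own and the directories you pass in are whatever applies on your system.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems
```

An empty list from `verify_backup` means every file in the source exists in the backup with identical contents. Files that changed after the backup ran will show up as mismatches, so it is best run right after a fresh backup.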

Malor wrote:

I'd suggest making a backup before you do ANYTHING else. Like, you should be doing it right now, to another computer on your network, not waiting for the new drive to show up.

No, he should already have a system of regular automated backups in place, provided this is data that he cares about keeping.

He did say: "backup runs nightly - but will get a fresh one just in case". I'll assume that he means backups to another system and not snapshots saved on the same drive array.

But IF I'm wrong about that, then this rant:

BACK YOUR SH*T UP.

This is no longer IT nerd stuff. Anyone can throw CrashPlan on any* system they have and back everything up remotely for $5 a month (or less if you buy a year or more in advance).

If you've got the resources to have another local system to back up to, then that's great, but cloud services like this have long since erased any excuse for not having a backup at all.

If there's no backup of this data at all, FIX THAT ASAP.

(*: any meaning any Windows, Mac, Linux, or Solaris system that's reasonably recent and has or can install a JRE)

The only thing about a remote backup system like CrashPlan is that it might take a month or so to push the entire backup to them, unless you go the route of having them send you an external drive to perform the initial backup on. This isn't a criticism though so much as something to be aware of.

complexmath wrote:

The only thing about a remote backup system like CrashPlan is that it might take a month or so to push the entire backup to them, unless you go the route of having them send you an external drive to perform the initial backup on. This isn't a criticism though so much as something to be aware of.

That's why you get this crap started long before you have to worry about your data.

CrashPlan has become my recommendation for backups because it is pretty damn fast. My wife's laptop got ~200GB uploaded in a few days, and the machine was not left open at all times.

Carbonite, meanwhile, throttles speed and puts daily transfer limits on your account. We had Carbonite on that same laptop before and it took forever.

I'm going to be putting CrashPlan Pro on our 1.5TB file server. I'll report on how long it takes to push that much up.
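For a rough sense of how long an initial upload might take, here is a back-of-the-envelope sketch in Python. The 5 Mbit/s upstream and the 80% efficiency factor are assumptions picked for illustration, not measurements from this thread or anything CrashPlan publishes.

```python
def upload_days(size_gb, upstream_mbps, efficiency=0.8):
    """Estimate days needed to upload size_gb gigabytes.

    upstream_mbps is the line rate in megabits per second; efficiency is an
    assumed fraction of that rate actually available to the backup client.
    """
    bits = size_gb * 1e9 * 8                          # decimal gigabytes to bits
    seconds = bits / (upstream_mbps * 1e6 * efficiency)
    return seconds / 86400                            # seconds per day

laptop_days = upload_days(200, 5)      # a ~200GB laptop
server_days = upload_days(1500, 5)     # a 1.5TB file server
```

Under those assumed numbers, 200GB works out to a bit under five days and 1.5TB to roughly five weeks, which is in the same ballpark as the "few days" and "a month or so" figures mentioned above. A faster upstream scales the estimate down proportionally.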

Sounds like you have a good upstream connection.

Does CrashPlan have the ability to put a cap on how much upload bandwidth it uses?

Thanks for the advice, guys.

The data backs up nightly to an external hard drive. I will start to institute off-site backups in the future, but I have some work policy issues to address before that happens.