HW RAID or SW RAID?


#1

That’s the simple question: which one is more reliable? Forget the performance. What’s the better choice to avoid a catastrophe?


#2

Obviously hardware should be more reliable.

But, you know, when things really go south, nothing can really stop it.

It’s probably better to have a solid recovery plan.


#3

Except when your hardware controller dies.


#4

Hardware for performance, software for data integrity - has been my take/experience. But my experience w/ HW RAID is pretty limited.

Over to you @Francisco or @Ryan - better to ask someone who has gone through hundreds (if not thousands) of raid cards :stuck_out_tongue:


#5

not with SSDs.


#6

There will always be pros and cons; all I care about is that people don’t use RAID 0. We added software RAID to our Auto Deploy, and it’s always a downer when they don’t choose the default RAID 10, then cry when a drive fails. In my experience, I prefer LSI for SSD and Adaptec for HDD. When I think software RAID, my preference would be SSD, but then again any RAID/redundancy is better than YOLO RAID or no RAID.
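To put rough numbers on the RAID 0 vs RAID 10 point, here is a minimal Python sketch. It assumes a 4-drive array and a RAID 10 laid out as two mirrored pairs; the drive numbering, `PAIRS`, and the function names are made up for illustration, and it only counts which combinations of failed drives lose the array, not how likely each failure is.

```python
from itertools import combinations

# Hypothetical layout: drives 0+1 mirror each other, drives 2+3 mirror each other.
PAIRS = [(0, 1), (2, 3)]

def raid0_survives(failed):
    # RAID 0 has no redundancy: any failed drive loses the array.
    return len(failed) == 0

def raid10_survives(failed, mirror_pairs=PAIRS):
    # RAID 10 is lost only when both drives of the same mirror pair fail.
    return not any(a in failed and b in failed for a, b in mirror_pairs)

def survival_fraction(n_drives, k_failures, survives):
    # Fraction of all k-drive failure combinations the array survives.
    combos = list(combinations(range(n_drives), k_failures))
    ok = sum(1 for c in combos if survives(set(c)))
    return ok / len(combos)

if __name__ == "__main__":
    print("RAID 0, 1 failure: ", survival_fraction(4, 1, raid0_survives))
    print("RAID 10, 1 failure:", survival_fraction(4, 1, raid10_survives))
    print("RAID 10, 2 failures:", survival_fraction(4, 2, raid10_survives))
```

With this layout, RAID 10 survives every single-drive failure and 4 of the 6 possible two-drive failures, while RAID 0 survives none of them.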


#7

@Ryan do you remember any failure rate for HW and SW?

I remember @Hivelocity saying they don’t offer HW RAID on (not really) “low cost” offers because the controller is another point of failure. Isn’t SW RAID equally another (virtual) component that can fail? Probably they have really bad stories to tell.


#8

In the 5 years I have been at Incero, I haven’t seen a significant number of failures that would cause me to avoid H/W Raid.


#9

Running more boxes in YOLO RAID than I should… biggest being my Plex server. But it’s not the end of the world when that dies.


#10

One HW RAID advantage for our colo customers is that I listen for loud alarms, track down servers with failed drives, and alert the customer. S/W RAID doesn’t provide this DC tech alerting function. The only sounds I like are the wonderful airplane fans, so I like to get the annoying alarm sounds resolved :slight_smile: Monitoring is key no matter which one you choose.


#11

Agreed here. Also, when your provider notifies you of an alarm, do something about it right away. We notice alarms going off on hardware RAID, notify customers, then frequently don’t hear from them until the other drive in the span fails and they’ve lost all data.


#12

I prefer LSI-based hardware RAID.


#13

Yet anyone hosting an application that requires uptime and integrity should be monitoring their own hardware with third-party or in-house tools, not relying on a DC tech to ‘hear an alarm’. So I assume you guys do not need more than 8 drives? Once you go higher, the only option is an HBA, and not many of them have onboard RAID capability. A lot of the newer filesystems, such as ZFS and ReFS, like to have direct access to the drives with no RAID controller.


#14

I regret listening to people who said that software raid was fine and had no significant issues with performance on modern hardware.

Easy to talk when you’re not packing servers for high volume production workloads.


#15

Software raid for personal/home setups & hardware raid for anything professional.


#16

Who told you that?

SW is fine for my personal stuff, and even a chunk of work stuff where we’re running like SW RAID 1 SSDs, SW RAID 5/6/10 or ZFS backup and file servers, etc. For the serious stuff we’re running HW RAID, or using a provider who does in the case of our ‘cloud’ stuff.


#17

Far too many people on LET who obviously had never tossed their systems a load like I put on mine. Makes life fun when you end up hovering at 30% iowait average, and then you’re trying to do backups and/or migrate on top of that, without causing downtime :smiley:
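For anyone wanting to check their own boxes, a rough Python sketch of how an iowait percentage can be derived on Linux. It assumes the aggregate `cpu` line of `/proc/stat` with the usual field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up, and the two sample lines in the usage note are invented, not from my servers.

```python
def parse_cpu_line(line):
    # Expects the aggregate "cpu ..." line from /proc/stat.
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("not an aggregate cpu line")
    # First 8 counters only; guest time is already included in user time.
    return [int(x) for x in parts[1:9]]

def iowait_percent(before, after):
    # Share of elapsed CPU time spent in iowait between two samples.
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter
```

Take two readings of the first line of `/proc/stat` a few seconds apart and pass them in, e.g. `iowait_percent("cpu 100 0 50 800 50 0 0 0 0 0", "cpu 150 0 70 1000 180 0 0 0 0 0")`.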


#18

There are plenty of direct-to-drive-bay (i.e. no expander needed) options for more than 8 drives, like the Adaptec 72405, etc. We have many such configs deployed without issue.

Personally I use both H/W and S/W RAID.


#19

One serious advantage of software RAID is that the OS sees the drives directly. This is really nice combined with smartmontools (smartd), which will alert you if something has gone amiss. Hardware RAID needs extra tooling to get this to work, and the functionality is generally lacking out of the box when it comes to setting this up.
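On the array side (as opposed to per-drive SMART), Linux software RAID exposes its state in `/proc/mdstat`, so a check can be a few lines. A minimal Python sketch, assuming the common status format where `[2/1] [_U]` means 2 devices expected and 1 active; `SAMPLE_MDSTAT` is invented for illustration and `degraded_arrays` is a made-up helper name, not part of any tool.

```python
import re

# Invented sample in the usual /proc/mdstat layout; md1 has lost a member.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0](F)
      976630464 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""

def degraded_arrays(mdstat_text):
    # Return names of arrays whose status line shows a missing member.
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real box you would feed it the contents of `/proc/mdstat` and hook the result into whatever alerting you already run.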


#20

The modern Supermicro boards with an onboard LSI 3108 RAID controller show drive health in the IPMI info, including the number of SMART errors, array health, etc. This is handy when you’re already running health checks via IPMI for masses of servers. Pretty cool.

http://162.212.59.221/7t9qWMAw1w.png
http://162.212.59.221/oIr85NgHpj.png