Do you think RAID-5 is ubiquitous? Straightforward? Been there, done that?
When you determine that your application needs an underlying RAID solution, do you think that RAID is commodity? Do you think that it's old? Do you pick from a menu of RAID solutions based on cost?
I'm here to tell you that not all RAID solutions are the same. And there are two very, very important questions to ask of any RAID solution:
Got DIBs? And if so, where are they?
DIBs are Data Integrity Bits
DIBs are table stakes in implementations that involve parity updates. Write operations to a RAID solution require the creation of redundant parity information; this parity information must be written to disk along with customer data. If this parity information gets out of sync with the data, the integrity of the customer's data is jeopardized. With this in mind, I would like to quote from my last post regarding the handling of failures:
"no matter what failure might occur, preservation and correctness of customer data would be paramount".
DIBs track the completion of parity operations. They are "extra" bits that watch your data. If you don't have them, I guarantee you that at some point in time your RAID implementation will happily return you incorrect information and say that all is well. This is where I pull out quote #2 from my previous post:
"Yes RAID is meant to be fast. But fast and incorrect is unacceptable."
So please, when you're choosing a RAID solution, make sure you got DIBs.
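To make the parity problem concrete, here is a minimal sketch in Python of a stripe whose parity can go stale, plus a single bit that tracks whether the parity update completed. This is an illustration under stated assumptions, not any vendor's implementation: the `Stripe` class, the `parity_pending` flag, and the `crash_before_parity` switch are hypothetical names, and the flag lives in memory here, whereas a real DIB has to survive a crash.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Stripe:
    """Hypothetical stripe: data blocks, one parity block, one integrity bit."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(self.data)
        # Stand-in for a DIB: True while parity may not match the data.
        self.parity_pending = False

    def write(self, index, new_block, crash_before_parity=False):
        self.parity_pending = True            # 1. note that an update is in flight
        self.data[index] = new_block          # 2. write the customer data
        if crash_before_parity:
            return                            # simulated failure: parity is now stale
        self.parity = xor_blocks(self.data)   # 3. write the matching parity
        self.parity_pending = False           # 4. note that the update completed

    def rebuild(self, lost_index):
        """Reconstruct one lost data block from the survivors plus parity."""
        if self.parity_pending:
            raise RuntimeError("parity is stale; a rebuild would return bad data")
        survivors = [b for i, b in enumerate(self.data) if i != lost_index]
        return xor_blocks(survivors + [self.parity])
```

Without the flag, a rebuild after an interrupted write would XOR good data against stale parity and hand back a wrong block with no error. With it, `rebuild` refuses until parity has been repaired.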
So where are your DIBs?
This is the second question you should be asking. I'm sorry to have to take it to another level of detail. But you need to know. After all, this is your data. And you've chosen RAID because you want protection from failures. Which brings me to my next point.
DIB location influences data integrity. The further away your DIBs are from your data, the greater the likelihood that something will go wrong.
We need some examples that highlight degrees of separation of DIBs from data. I'll choose four: host-based RAID, NOVRAM-based RAID, filesystem-based RAID, and journal-based RAID.
Host-based RAID
Host-based RAID frightens me. Put RAID algorithms in a server, use dumb disks underneath. If the RAID implementation uses DIBs at all, it stores them on the server itself, perhaps on a local drive. Lose the server, lose your data integrity.
NOVRAM-based RAID
RAID solutions can employ a NOVRAM at the server, the switch, or the storage level. DIBs get put in the NOVRAM. RAID solution gets the DIBs from the NOVRAM after a crash, parity gets fixed, everyone's happy. Until the CPU fails and the customer tries to pop said NOVRAM off the board using a screwdriver, screaming "My DIBs are in there!".
Filesystem-based RAID
Some RAID solutions employ file system techniques to keep track of parity placement and ensure parity completion. So DIBs essentially become file system "i-nodes". Everything becomes a two-stage lookup. First I have to find my DIBs. Then I have to look at my DIBs to find my data. This runs contrary to the original requirement of RAID to be "fast". It also introduces additional failure permutations. What happens if an i-node becomes corrupt? (This can happen in even the most mature file systems.)
Journal-based RAID
Another DIB technique is to create a separate journal to track parity completions. If I write my customer data to a set of five disks, I may write my DIBs to a separate mirrored pair of disks. Problem: if I lose this mirrored pair, I can no longer write data with integrity, even though all of the disks holding customer data are healthy.
As Close as You Can Get.....
This is a long post, longer than I'd like. But I'm almost done.
I mentioned in my last post that it's my belief that CLARiiON's continued growth in the RAID industry was a result of fundamental decisions made in the 80s. You know, back when RAID was relevant.
So here's the first decision we made: we grew the on-disk sector size from 512 bytes to 520. And do you know what we put in those extra eight bytes? That's right. DIBs.
Every single block of data ever written to a CLARiiON has got DIBs. Protection for customer data is interspersed with the data itself. This means two things:
1. If I can get to my data, I can get to my DIBs.
2. If I can get to my DIBs, I can trust my data.
I don't have to rely on a NOVRAM, turn my DIBs into an i-node, place them on a separate spindle, etc., etc., etc. Remember, the further away your DIBs are from your data, the greater the likelihood that something will go wrong.
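As a rough illustration of what a 520-byte sector looks like, here is a minimal Python sketch that appends eight bytes of integrity metadata to every 512-byte block. The trailer layout is an assumption made up for this example (a 4-byte CRC-32 and a 4-byte write stamp); the post does not describe the actual CLARiiON format. `pack_sector` and `unpack_sector` are hypothetical helper names.

```python
import struct
import zlib

DATA_SIZE = 512
# Hypothetical 8-byte trailer: 4-byte CRC-32 of the data + 4-byte write stamp.
TRAILER = struct.Struct(">II")
SECTOR_SIZE = DATA_SIZE + TRAILER.size  # 512 + 8 = 520


def pack_sector(data, stamp):
    """Return a 520-byte sector: the data followed by its integrity trailer."""
    if len(data) != DATA_SIZE:
        raise ValueError("data must be exactly 512 bytes")
    return data + TRAILER.pack(zlib.crc32(data), stamp)


def unpack_sector(sector):
    """Verify the trailer against the data, then return (data, stamp)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("sector must be exactly 520 bytes")
    data, trailer = sector[:DATA_SIZE], sector[DATA_SIZE:]
    crc, stamp = TRAILER.unpack(trailer)
    if zlib.crc32(data) != crc:
        raise ValueError("integrity check failed")
    return data, stamp
```

Because the check travels in the same sector as the data, one read fetches both, and no second device or lookup is needed to decide whether the block can be trusted.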
With CLARiiON, the DIBs are right there with your data.
Where they belong.
Steve
I like your blog. It's very nice to see an actual engineer have one, especially one as accomplished as you.
I have a few questions:
How are DIBs employed on the Symmetrix?
Has anyone ever lost any data on a CLARiiON?
NetApp uses filesystem-based RAID, which is not a great way of doing RAID, but would you know if they have lost any customer data?
Thanks for the great blog!!
Posted by: Terry | February 09, 2008 at 07:50 PM
Hi Terry,
Thanks for the comment. Certainly there have been CLARiiON customers that have experienced the failure of two or more disk drives in their RAID configuration. In these cases it is not possible to read data from failed drives because of the multiple failures, and a restore from backup becomes necessary.
As for the Symm and NetApp questions, I don't know the answers, and wouldn't want to speculate!
Steve
Posted by: Steve Todd | February 10, 2008 at 07:19 AM