Several weeks ago my friends at Texas Memory Systems told me they’d soon have a Solid State Disk based upon NAND Flash technology. Right before I hopped on my flight back from Oracle HQ last Friday I got email from them with some technical info about the device. It looks like it is all public knowledge at this point since this PCWorld article hit the wire about 1 hour ago.
The product is called the RamSan-500. Hang on a second, I need to parse that. Let’s see, two syllables: “Ram” and “San.” SAN? A device that supports roughly 0.2ms reads (that’s the service time with a cache miss!) on a SAN with FCP? Sort of seems like funneling bullets through a garden hose.
What’s in a name, right? Well, this turns out to be a bit of a misnomer, because just like its cousin, the RamSan-400, the device will soon support 4x Infiniband (even though the current material suggests it is “in there” today). Time to market aside, that will make it a SCSI RDMA Protocol (SRP) target. Do you recall my recent post about Oracle’s August 2007 300GB TPC-H result that used Infiniband storage via SRP?
Now don’t get me wrong. Accessing this thing via FCP will still allow you to benefit from the tremendous throughput the device offers. And, indeed, regardless of the FCP overhead, returning all I/Os with sub-millisecond response time is going to pay huge dividends for sure. It’s just that more and more I’m leaning towards Infiniband, especially for transfers from a device with read service times that range from 15us (cache hit) to .2ms (cache miss). What about writes? It can handle 10,000 IOPS with 2ms service times. How would you like to stuff your sort TEMP segments in that? Oh what the heck, how about I just quote all their speeds and feeds:
- Cache reads/writes:
  - 15 microsecond access time
- Cache miss reads (reads from Flash):
  - 100,000 random IOPS
  - 2GB/second sustained bandwidth
  - 200 microsecond latency
- Cache miss writes (writes to Flash):
  - 10,000 random IOPS
  - 2GB/second sustained bandwidth
  - 2 millisecond latency
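To put those numbers in perspective, here is a rough back-of-the-envelope sketch using Little's Law (I/Os in flight = throughput x latency). This is my own arithmetic on the quoted figures, not anything from the TMS material, and the function names are made up for illustration. It also assumes the quoted IOPS and latency figures hold at the same time, which the data sheet does not actually say.

```python
def outstanding_ios(iops: int, latency_us: int) -> float:
    """Little's Law: mean I/Os in flight = throughput * mean latency.

    Latency is taken in microseconds so the quoted figures can be
    plugged in directly (assumption: IOPS and latency hold together).
    """
    return iops * latency_us / 1_000_000


def single_stream_iops(latency_us: int) -> float:
    """IOPS one strictly synchronous requester can drive (one I/O at a time)."""
    return 1_000_000 / latency_us


# Figures quoted in the list above.
read_queue = outstanding_ios(100_000, 200)    # cache-miss reads
write_queue = outstanding_ios(10_000, 2_000)  # cache-miss writes

print(f"reads in flight to reach 100,000 IOPS: {read_queue}")
print(f"writes in flight to reach 10,000 IOPS: {write_queue}")
print(f"one synchronous reader tops out near {single_stream_iops(200):.0f} IOPS")
```

If that arithmetic holds, both the read and write figures work out to about 20 I/Os in flight, and a single synchronous reader would top out near 5,000 IOPS. In other words, you would need a fair amount of concurrency on the host side to see the headline numbers.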
Under the Covers
The RamSan-500 is more or less the same style of controller head the RamSan-400 has, so what’s so special? Well, putting NAND Flash as the backing store increases the capacity up to 2TB. T w o T e r a b y t e s!
So the question of the day is, “Where are you going to put your indexes, TEMP segments, hottest tables and Redo?” Anyone else think dangling round, brown spinning thingies off of orange glass cables doesn’t quite stack up?
Here’s a link to the RamSan-500 Web Page
QoS
I’ll blog at some point about QoS. After all, with a bunch of NAND Flash that looks and smells like a disk, storage management software that is smart enough to automagically migrate “hot” disk objects from slower to faster devices really starts to make sense.
Men in Black (DOJ)
I don’t pretend to know where TMS gets their Flash components, but it looks like “The Law” might end up being on their side if they get components from any of the alleged Flash providers suspected of a price-fixing scheme.
now why didn’t this thing exist when I needed it, July last year?
Let us know when you test the RAM-SAN performance in RAID-5 degraded mode.
Migrating hot objects – wasn’t that HP-Autoraid? Aren’t all Oracle files hot objects?
“Let us know when you test the RAM-SAN performance in RAID-5 degraded mode.”
…I would if I could
“Migrating hot objects – wasn’t that HP-Autoraid? Aren’t all Oracle files hot objects?”
No and no.
As I understood Autoraid, if a block was flushed to disk, it would be promoted from RAID-5 to mirrored, if it wasn’t already – it was considered hot. If it didn’t stay hot, it would eventually migrate back. ( http://www.cs.berkeley.edu/~brewer/cs262/AutoRAID.pdf )
As I understand Oracle, the datafile headers are updated a whole lot, unless you tell it not to, which has performance implications, to say the least.
Joel,
When I talk about QoS, I’m not talking about moving entire files. Autoraid doesn’t fit the model. I’ll blog about what I *am* alluding to soon enough. I used the abstract term “objects” for a reason but you took that to mean “datafile.”
No, actually I used it differently than I meant to [blush]. I meant blocks. The risks of drive-by blogging in between high-pressure tasks :-O
Looking forward to what you blog.