BLOG UPDATE 2012.01.28: A lot has changed since this blog post so I need to point out that my mention herein about iDB ports to SPARC is clearly outdated. The production manifestation of the SPARC SuperCluster offers 6 Exadata Storage Servers in the full-rack configuration connected to the T4-4 hosts via the Exadata iDB protocol.
This blog entry is too long.
From Oracle Storage Strategy Update To TPC-C And Back (I Hope)
My recent blog entry entitled I Can See Clearly Now. Exadata Is Better Than EMC Storage! I Have Seen The Slides! Part I was pretty heavily read (over 7,000 views). I was concerned that blogging about something that happened two weeks ago might not be all that interesting. But since my analysis (opinions) of the June 30, 2011 Oracle Storage Strategy webcast seems to resonate, I thought I’d put out this installment.
What Do Transaction Processing Performance Council Benchmarks Have To Do With The Oracle Storage Strategy Update?
I’ve been eagerly waiting to see which of IBM or HP would be first to publish an audited TPC-C result with the Xeon E7 (formerly Westmere-EX) processor. Both vendors offer value-add system componentry that extends the vanilla Xeon E7 + QPI capabilities to include scalable 8-socket and very large memory support.
IBM’s x3850 with MAX5 supports 96 32GB low-voltage DIMMs for a total of 3TB of RAM with just 4 sockets. IBM proved the strength of the x3850 several months ago with a 4-socket Nehalem-EX (Xeon 7500) result of a little over 2.3 million TpmC. So part of me was not all that surprised to find that they were able to stay with the recipe and publish a result of just over 3 million TpmC with the Xeon E7 processor and MAX5 chipset (July 11, 2011). But that has nothing to do with the Oracle Storage Strategy webcast and, in fact, since it was a DB2 number on Linux it has very little to do with Oracle. So why am I blogging this?
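If you care to sanity-check those figures, the arithmetic is trivial. A quick sketch using the round numbers quoted in this post (not the audited figures):

```python
# Back-of-envelope check of the x3850 + MAX5 figures quoted above.

dimms = 96
dimm_gb = 32
total_gb = dimms * dimm_gb
print(f"Memory: {dimms} x {dimm_gb}GB = {total_gb}GB = {total_gb / 1024:.0f}TB")  # 3TB

nehalem_ex_tpmc = 2_300_000  # "a little over 2.3 million TpmC"
xeon_e7_tpmc = 3_000_000     # "just over 3 million TpmC"
gain = (xeon_e7_tpmc - nehalem_ex_tpmc) / nehalem_ex_tpmc
print(f"Generational gain: roughly {gain:.0%}")  # roughly 30%
```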
While the 3 million TpmC result represents a roughly 30% improvement over IBM’s Nehalem-EX result, I’m saddened the entry was not an 8-socket result. Why? I’ll put it this way: if IBM and HP can’t make 8-socket Xeon boxes scale on contentious workloads (like TPC-C), it’s quite likely nobody can. It looks like 8-socket Xeon scalability is still out of reach. That is just too bad. But that has nothing to do with the Oracle Storage Strategy webcast either. So why am I blogging this? I’m getting to it, trust me.
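To illustrate why I’m pessimistic, consider a simple Amdahl-style model of socket scaling. To be clear, this is purely illustrative: the serialized fractions below are hypothetical, not measurements of any TPC-C configuration.

```python
# Illustrative only: Amdahl's law with hypothetical serialized fractions,
# showing how little headroom contention leaves when going from 4 to 8 sockets.

def amdahl_speedup(n_sockets: int, serial_fraction: float) -> float:
    """Speedup over 1 socket given the fraction of work that serializes."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_sockets)

for serial in (0.02, 0.05, 0.10):  # hypothetical contention levels
    gain = amdahl_speedup(8, serial) / amdahl_speedup(4, serial)
    print(f"serial={serial:.0%}: doubling 4 sockets to 8 yields only {gain:.2f}x")
```

Even a few percent of serialized (contended) work keeps the 4-to-8-socket gain well under 2x, and TPC-C is nothing if not contentious.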
While perusing the main TPC-C all-results page I noticed three interesting things and one of them actually has to do with the Oracle Storage Strategy webcast!
The three things that caught my eye were:
- There are non-clustered Xeon results in the top ten! Sure, the prior IBM x3850 result was in the top ten, but when it was published I didn’t catch on to that fact. It wasn’t long ago that non-clustered x86 boxes were so far down the list as to not matter.
- In the ranks of the top-ten results there are two submissions that come in at less than $1.00/TpmC. I think that is quite significant when compared to historical costs (see the quick arithmetic after this list). Top-ten TPC-C results with Xeon at < $1.00/TpmC. Wow.
- None of the products mentioned in the Oracle Storage Strategy webcast appear in the top ten of TPC-C, nor of TPC-H for that matter. The last Oracle TPC-H result was a 3TB-scale M9000 result with Sun Storage 6000 (Sun Storage 6000 is LSI Engenio hardware, and the Engenio brand is now owned by NetApp, for what it’s worth).
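On that price/performance point, remember the metric is simply the total audited system cost divided by throughput. A quick sketch with hypothetical round numbers (not lifted from any particular disclosure):

```python
# TPC-C price/performance: total audited system cost divided by TpmC.
# The figures below are hypothetical round numbers for illustration.

def dollars_per_tpmc(total_system_cost: float, tpmc: float) -> float:
    return total_system_cost / tpmc

# A 3 million TpmC result priced at $1.8M total lands well under $1.00/TpmC:
print(f"${dollars_per_tpmc(1_800_000, 3_000_000):.2f}/TpmC")  # $0.60/TpmC
```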
So, obviously, point 3 in that list is what brings me back to the Oracle Storage Strategy Update of June 29, 2011 (slides). If one publishes an industry benchmark result that outperforms the closest competitor by 3x, as Oracle did with the SuperCluster 30 million TpmC result, wouldn’t the system (including its storage) be considered a premier system offering? One would think so, especially when the workload is as I/O-intensive as TPC-C! But no: generally speaking, the configurations used in TPC benchmarks are not to be confused with systems intended for production.
Concept Car or Production Car
The difference between TPC configurations and production configurations is a lot like the difference between a concept car and the car from the same manufacturer that is actually sitting on a lot with a price sticker on it. The concept car and the production car have a lot in common, but the differences are usually pretty obvious as well. We shouldn’t have a problem with this. I still think TPC benchmarks are good for certain purposes. One such purpose is to see just how thin the line between the “concept car” and the “production car” is getting.
SuperCluster Storage or Oracle Storage Strategy Line-up?
No, the “SuperCluster Storage” that was used for the 30 million TpmC result is not in the Storage Strategy line-up. So then what was the 30 million TpmC “concept car” storage? Take a peek at this link or let me summarize. The SuperCluster storage consisted of the following main ingredients:
- 97 Sun X4270M2 servers, each with one Intel Xeon processor removed. The X4270M2 servers ran Solaris and COMSTAR. As such, they played the role of “array heads,” performing protocol exchange between SAS and Fibre Channel. Why? Because the storage networking was Fibre Channel: 108 8GFC Fibre Channel HBAs connected the 27 Real Application Clusters nodes (4 HBAs each) to the COMSTAR heads, and SAS connected the COMSTAR heads to the storage. The fan-out these counts imply is sketched just after this list.
- 138 Sun Storage F5100 Flash Array devices. That bit was $22,000,000. Remember the analogy about the concept car.
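Before moving on, here is the fan-out those component counts imply. A minimal sketch; the per-port bandwidth is my assumption (roughly 800MB/s usable per 8GFC port), not a figure from the disclosure:

```python
# Fan-out implied by the component counts listed above.
# The 8GFC usable bandwidth per HBA is an assumption, not a disclosed figure.

rac_nodes = 27
hbas = 108
comstar_heads = 97
f5100_arrays = 138
gfc8_mb_per_sec = 800  # assumed usable throughput of one 8GFC port

print(f"HBAs per RAC node:       {hbas / rac_nodes:.0f}")              # 4
print(f"F5100s per COMSTAR head: {f5100_arrays / comstar_heads:.2f}")  # ~1.4
print(f"Aggregate FC bandwidth:  ~{hbas * gfc8_mb_per_sec / 1024:.0f} GB/s (theoretical)")
```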
So a high-level schematic of the data flow was: F5100 (SAS) -> COMSTAR head (SAS to FC) -> FC switches -> Sun T3-4 servers. Don’t be alarmed by that many hops, because they don’t really matter. Indeed, the 30 million TpmC SuperCluster delivered an average New Order response time of 0.35s, which is 69% faster than the IBM p780 result of 1.14 seconds. That’s a point Oracle marketing pushes vigorously. Oracle marketing doesn’t, however, seem to push the fact that while HP was still Oracle’s premier hardware partner, the two teamed up to deliver what was, at the time, a world-record TPC-C using the recently-shunned Itanium processor. Moreover, they most certainly don’t push the fact that the circa-2007 Itanium TPC-C with Oracle10g delivered average New Order service times of 0.24s, which is 32% faster than the SuperCluster! Fine details matter.
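Since fine details matter, those percentages are easy to check against the published averages (I’m using the 0.353s SuperCluster figure cited later in this post):

```python
# Verifying the response-time percentages quoted in the paragraph above.

supercluster = 0.353  # seconds, average New Order response time
ibm_p780 = 1.14
itanium_2007 = 0.24

print(f"SuperCluster vs p780:    {(ibm_p780 - supercluster) / ibm_p780:.0%} faster")          # ~69%
print(f"Itanium vs SuperCluster: {(supercluster - itanium_2007) / supercluster:.0%} faster")  # ~32%
```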
Concept Car to Oracle Storage Strategy Update
No, there is no evolution from concept to reality where the COMSTAR+F5100 approach is concerned. In fact, Oracle spelled out quite clearly that the storage recipe for these SuperClusters will be the “Sun ZFS Storage 7420,” which means FC, iSCSI or NFS, but no Exadata, since there is no port of Exadata iDB to SPARC (as of the publish date of this article). I think the ZFS Storage Appliance is a reasonable product, but I wouldn’t want to stick my arm in the unified-storage meat grinder with the likes of EMC VNX and NetApp.
So, no, the storage used for the SuperCluster TPC-C shows no sign at this time of evolving from concept to production. Oracle customers should be glad, though; in my opinion, yet another addition to the storage strategy would be all too confusing.
Final Words About That IBM x3850 Xeon E7 TPC-C Result
The Oracle SuperCluster result of 30 million TpmC (0.353s average New Order service time) didn’t beat the service times of the ancient Itanium 2-based Superdome New Order transactions, but at least it also failed to beat the IBM x3850 average service times!
The IBM x3850 pumped out over 3 million TpmC with average New Order service times of 0.272s, and all that for $0.59/TpmC. How? Well, the storage wasn’t a concept. The lion’s share of the I/O was serviced by 136 SFF SAS SSDs! That’s roughly 1/50th the storage cost for 1/10th the transaction throughput when compared to the SuperCluster. And faster transaction service times too.
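Put as arithmetic (the x3850 storage cost below is back-derived from my rough 1/50th estimate, not the audited line item):

```python
# Rough storage-cost versus throughput comparison using figures from this post.

supercluster_flash_cost = 22_000_000  # the F5100 line item
supercluster_tpmc = 30_000_000
x3850_storage_cost = 440_000          # assumed: ~1/50th of the F5100 spend
x3850_tpmc = 3_000_000

print(f"Storage cost ratio: 1/{supercluster_flash_cost / x3850_storage_cost:.0f}")  # ~1/50
print(f"Throughput ratio:   1/{supercluster_tpmc / x3850_tpmc:.0f}")                # 1/10
```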
Intel Xeon is my concept car of choice, and you can run just about any software you choose on it, which makes it even better. And whatever software I chose to run, I’d rather it not be stored on “concept storage.”
This blog entry was too long.