Sun Oracle Database Machine: The Million Oracle Database IOPS Machine or Marketing Hype? Part I.

This one goes out to the Not Million IOPS DBA.

Several blog readers have emailed me to ask why I have not been blogging about the 1 million read IOPS capability of Sun Oracle Database Machine. My first inclination was to blog sarcastically that the reason is that Sun Oracle Database Machine is not, in fact, a 1 million IOPS platform! It turns out that the 1 million read IOPS figure that everyone is touting is actually a bit conservative! And yes, we are talking about real Oracle Database read IOPS: physical Oracle datafile I/O operations buffered in the System Global Area (SGA). But I won’t be approaching the topic by simply echoing the million IOPS marketing mantra.

This blog entry is aimed at the many folks at Open World 2009 who asked me why the Sun Oracle Database Machine (with its million read IOPS capability) should be interesting to them given their (much less than 1 million) IOPS rates, and at all other DBAs asking that very same question.

Sun Oracle Database Machine – The Million IOPS Capable Platform
The point, in my mind anyway, is that while I’m 99.9999942% (have to include the requisite five “9s” in there) certain that your application does not demand anything close to a million IOPS, the Sun Oracle Database Machine is a million read IOPS capable platform and that should still be important to folks considering Sun Oracle Database Machine for read-intensive ERP/OLTP. The platform is a million read IOPS capable platform not based on bandwidth specifications or a datasheet, but based on true end-to-end proof. Why is this important to you?

The Soothing Salve of Head Room
I can’t count how many times I’ve been involved with customer performance problems over the years where spikes in the IOPS rate caused poor performance. “No kidding,” you say. As best I recall, each and every one of those tuning exercises included attempts to determine what the end-to-end IOPS capacity of the configuration actually was, including what the host processors could handle.

Keep in mind that with OLTP workloads the cost of a read I/O (e.g., db file sequential read) starts with an SGA cache miss—and cache misses have cost (in terms of CPU).  In addition to that is the overhead to set up the read and the LRU and chaining actions associated with the newly buffered block (there is of course more to it, but the lion’s share is worth discussing). These costs are paid in CPU cycles, but I was never quite able to work out any magic decoder ring for the overhead associated with these specific I/O related tasks since it varied by platform, OS and I/O protocol.  I knew one thing for certain though—I generally ran out of CPU before I ran out of IOPS capacity. Of course I’ve had the luxury of defining and working on balanced configurations throughout my career. Now, before I forget, I need to remind you that I am blogging about end-to-end Oracle Database IOPS (ERP/OLTP) as opposed to a synthetic low-level microbenchmark such as Orion. Orion will help show you what the hardware (I/O path, not CPU) can do by way of IOPS, but it cannot predict the end-to-end IOPS capability of your platform.
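
To make the "ran out of CPU before IOPS" point concrete, here is a minimal back-of-the-envelope sketch. It is not a measurement of any real platform: the function names, the per-read CPU cost, and the core count are all hypothetical placeholders, and the model assumes (simplistically) that every physical read burns a fixed amount of CPU.

```python
def cpu_bound_read_iops(cores, cpu_us_per_read, io_cpu_fraction=1.0):
    """Ceiling on read IOPS imposed by host CPU alone (toy model).

    cores            -- number of host CPU cores (hypothetical input)
    cpu_us_per_read  -- CPU microseconds spent per physical read: the cache
                        miss, I/O setup, LRU and chaining work (a guess that
                        varies by platform, OS and I/O protocol)
    io_cpu_fraction  -- share of total CPU you can afford to spend on I/O
    """
    if cpu_us_per_read <= 0:
        raise ValueError("cpu_us_per_read must be positive")
    if not 0 < io_cpu_fraction <= 1:
        raise ValueError("io_cpu_fraction must be in (0, 1]")
    cpu_us_available_per_second = cores * io_cpu_fraction * 1_000_000
    return cpu_us_available_per_second / cpu_us_per_read


def end_to_end_read_iops(storage_iops, cores, cpu_us_per_read, io_cpu_fraction=1.0):
    """The platform delivers whichever limit is hit first: storage or CPU."""
    cpu_limit = cpu_bound_read_iops(cores, cpu_us_per_read, io_cpu_fraction)
    return min(storage_iops, cpu_limit)


if __name__ == "__main__":
    # Made-up inputs: storage that a microbenchmark rates at 1,000,000 IOPS,
    # 16 host cores, and 50 microseconds of CPU per read.
    print(end_to_end_read_iops(1_000_000, 16, 50))
```

With those made-up inputs the CPU limit works out to 320,000 reads per second, so the storage-only figure overstates what the whole stack can do. That gap is exactly what a hardware-only tool cannot show you.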

Arguing the Oracle Marketing Message to the “Not Million IOPS DBA”
I am not arguing the Oracle marketing message on this matter. I’m just coming at it from a different angle. After all, I said I was reasonably certain (99.9999942%) that your application does not demand a million IOPS. My thin value add to the Oracle messaging on the matter therefore relates to you, the Not Million IOPS DBA. Think of it this way. Oracle accurately claims the Sun Oracle Database Machine is a million read IOPS platform. Why not 10 million? Well, probably because it hit a bottleneck somewhere between 1 million and 10 million. That somewhere is host CPU. Oh no, he’s admitting that there are conceivable bottlenecks in the Exadata architecture! Yes, of course, all systems have bottlenecks somewhere.

Getting back to the point. We can safely presume that there is some “nebulous” limit in the Sun Oracle Database Machine that throttles IOPS to around 1 million. How helpful is that information? Well, if you are a Sun Oracle Database Machine customer running applications that, in aggregate, demand less than 1 million read IOPS you can simply rule out IOPS as the cause of performance problems. That is, so long as your application is not doing unnecessary I/O I suppose. Further, if your read IOPS rate is, say, 250,000 you know you are well below the proven capacity (be keenly aware, however, that a full rack Exadata configuration is limited to 50,000 write IOPS with normal ASM redundancy). Why do I say “applications” and “in aggregate?” Think consolidation.
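
The "in aggregate" arithmetic is simple enough to sketch. The two ceilings below are the figures quoted in this post (1 million read IOPS, and 50,000 write IOPS for a full rack with normal ASM redundancy); the function name and the per-application demand numbers are invented purely for illustration.

```python
READ_IOPS_CEILING = 1_000_000   # read figure quoted in this post
WRITE_IOPS_CEILING = 50_000     # write figure quoted in this post (full rack, normal ASM redundancy)


def aggregate_headroom(app_demands):
    """Sum per-application demand and compare it with the platform ceilings.

    app_demands -- iterable of (read_iops, write_iops) pairs, one pair per
                   consolidated application (hypothetical workload figures)
    """
    demands = list(app_demands)
    total_reads = sum(reads for reads, _ in demands)
    total_writes = sum(writes for _, writes in demands)
    return {
        "total_reads": total_reads,
        "total_writes": total_writes,
        "read_utilization": total_reads / READ_IOPS_CEILING,
        "write_utilization": total_writes / WRITE_IOPS_CEILING,
        "reads_within_ceiling": total_reads <= READ_IOPS_CEILING,
        "writes_within_ceiling": total_writes <= WRITE_IOPS_CEILING,
    }


if __name__ == "__main__":
    # Two invented applications consolidated onto one machine.
    report = aggregate_headroom([(150_000, 20_000), (100_000, 40_000)])
    for key, value in report.items():
        print(f"{key}: {value}")
```

In this invented example the combined reads come to 250,000, a quarter of the read ceiling, while the combined writes come to 60,000 and exceed the write ceiling. A read-side comfort zone says nothing about the write side, so check both.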

The point of this blog entry was not to bore you to death. The fact is most storage vendors market their storage I/O bandwidth stated in IOPS. What I have never seen them do is build an end-to-end Oracle Database configuration and actually do the IOPS driven by an Oracle database. This may change now that Oracle is showing database end-to-end IOPS numbers as opposed to synthetic block ops via something like Orion.

Summary
Oracle and Sun partnered to bring to market a million read IOPS capable platform (in a single 42u rack) and, no, it is certainly not just marketing hype! Does that mean you don’t really need a million read IOPS capable platform if your application doesn’t demand a million IOPS? I don’t think so.

8 Responses to “Sun Oracle Database Machine: The Million Oracle Database IOPS Machine or Marketing Hype? Part I.”


  1. 1 George October 22, 2009 at 2:01 pm

    I’m sure there is a developer out there who will be able to out-request a million IOPS. Just a matter of time.

    For now it is to get customers to see the big picture of the consolidation, see past the word Sun and Oracle and rather see a very fast database machine with everything done for them, in other words not try and shoot holes in the architecture since it does not fit their OS flavor or hardware sticker.

    G

  2. 2 Glenn Fawcett October 22, 2009 at 2:22 pm

    In my experience working on the largest of Sun’s servers, I have seen customers with dozens of instances on a single machine. After talking with customers at OOW, I found that indeed some people are seeing the value of Exadata V2 for consolidation. Exadata V2 is really grid computing properly pre-configured with all the best practices ready for you to deploy.

    http://bit.ly/m3w3I

  3. 3 Andrew Gregovich October 28, 2009 at 3:56 am

    Kevin

    IMO, the last missing piece of the puzzle for total control of consolidated applications on a single Oracle instance is implementation of robust QoS for I/O. With traditional hard disks this sounds like an almost impossible task due to the physical layout of data, but with flash disks it looks a lot more viable.

    Any plans to enhance Oracle Resource Manager with I/O throttling capabilities?

    Cheers

    Andrew

  4. 5 Ravi January 27, 2010 at 6:52 pm

    If I am consolidating, I would like to know what kind of partitioning capabilities exist on the infrastructure. Can you please comment to that?

  5. 6 Dmitri February 14, 2012 at 7:19 am

    It’s interesting to reread some of these articles now that you have left Oracle. It seems clear now that you were in some sense straining against the shackles of being an employee – for example your warning about the limit of 50000 write IOPS (which I assume is therefore 25000 when using normal redundancy right? So even less if you want to be able to do online patching, meaning you need triple mirroring in ASM…).
    I remember when the v1 came out the Exadata storage cells were described as “the First-Ever Smart Storage Designed for Oracle Data Warehouses”. Then when the v2 came out with the 5.3TB of flash on the storage cells it suddenly became “The First Database Machine for OLTP”. I know there were other changes between v1 and v2 (upgrading from 11.1 to 11.2 software for a start) but it sort of feels like Oracle just threw a load of flash cards in and then said hey, now we can do OLTP as well.
    In my mind – and I guess I have to thank you for making this clear – it can’t realistically be a contender for large OLTP workloads with such a skewed ratio of reads to writes.


  1. 1 making IT happen : an infrastructure blog » Blog Archive » With the hardware these days… Trackback on October 23, 2009 at 1:30 am

DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.