Oracle Exadata Storage Server. Part I.

Brute Force with Brains.
Here is a brief overview of the Oracle Exadata Storage Server key performance attributes:

  • Intelligent Storage. Ship less data due to query intelligence in the storage.
  • Bigger Pipes. InfiniBand with Remote Direct Memory Access. 5x faster than Fibre Channel.
  • More Pipes. Scalable, redundant I/O Fabric.

Yes, it’s called Oracle Exadata Storage Server and it really was worth the wait. I know it is going to take a while for the message to settle in, but I would like to take my first blog post on the topic of Oracle Exadata Storage Server to reiterate the primary value propositions of the solution.

  • Exadata is fully optimized disk I/O. Full stop! For far too long, it has been too difficult to configure ample I/O bandwidth for Oracle, and far too difficult to configure storage so that the physical disk accesses are sequential.
  • Exadata is intelligent storage. For far too long, Oracle Database has had to ingest full blocks of data from disk for query processing, wasting precious host processor cycles to discard the uninteresting data (predicate filtering and column projection).

Oracle Exadata Storage Server is Brute Force. A Brawny solution.
A single rack of the HP Oracle Database Machine (based on Oracle Exadata Storage Server Software) is configured with 14 Oracle Exadata Storage Server “Cells”, each with 12 3.5″ hard drives, for a total of 168 disks. There are 300GB SAS and 1TB SATA options. The database tier of the single-rack HP Oracle Database Machine consists of 8 ProLiant DL360 servers, each with 2 Xeon 54XX quad-core processors and 32 GB RAM, running Oracle Real Application Clusters (RAC). The RAC nodes are interconnected with InfiniBand using the very lightweight Reliable Datagram Sockets (RDS) protocol. RDS over InfiniBand is also the I/O fabric between the RAC nodes and the Storage Cells. With the SAS storage option, the HP Oracle Database Machine offers roughly 1 terabyte of optimal user addressable space per Storage Cell, or 14 TB total.

Sequential I/O
Exadata I/O is a blend of random seeks followed by a series of large transfer requests, so scanning disk at rates of nearly 85 MB/s per disk drive (1000 MB/s per Storage Cell) is easily achieved. With 14 Exadata Storage Cells, the data-scanning rate is 14 GB/s. Yes, roughly 80 seconds to scan a terabyte, and that is with the base HP Oracle Database Machine configuration. Oracle Exadata Storage Software offers these scan rates on both tables and indexes, and partitioning is, of course, fully supported, as is compression.
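
The scan-rate figures above are simple multiplication, and a short sketch makes the steps explicit. This is only back-of-the-envelope arithmetic using the numbers quoted in this post; the function names are made up for illustration and decimal units (1 TB = 1000 GB) are an assumption. At exactly 14 GB/s a terabyte scans in a little over 70 seconds, so "roughly 80 seconds" leaves some headroom.

```python
# Back-of-the-envelope scan-rate arithmetic from the figures quoted in the post.
# Function names are hypothetical; decimal units (1 GB = 1000 MB) are assumed.

CELLS = 14                 # Storage Cells in a single rack
DISKS_PER_CELL = 12        # drives per cell
MB_S_PER_DISK = 85         # "nearly 85 MB/s per disk drive"


def cell_scan_rate_mb_s(disks=DISKS_PER_CELL, per_disk=MB_S_PER_DISK):
    """Aggregate streaming rate of one cell in MB/s."""
    return disks * per_disk


def rack_scan_rate_gb_s(cells=CELLS, cell_rate_mb_s=1000):
    """Rack-level scan rate in GB/s, using the post's rounded 1000 MB/s per cell."""
    return cells * cell_rate_mb_s / 1000


def seconds_to_scan(terabytes, rate_gb_s):
    """Time to stream the given number of terabytes at rate_gb_s."""
    return terabytes * 1000 / rate_gb_s


if __name__ == "__main__":
    print("per-cell MB/s:", cell_scan_rate_mb_s())
    print("rack GB/s:", rack_scan_rate_gb_s())
    print("seconds per TB:", round(seconds_to_scan(1, rack_scan_rate_gb_s()), 1))
```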

Comparison to “Old School”
Let me put Oracle Exadata Storage Server performance into perspective by drawing a comparison to Fibre Channel SAN technology. The building block of all native Fibre Channel SAN arrays is the Fibre Channel Arbitrated Loop (FCAL) to which the disk drives are connected. Some arrays support as few as 2 of these “back-end” loops; larger arrays support as many as 64. Most, if not all, current SAN arrays support 4 Gb FCAL back-end loops, which are limited to no more than 400MB/s of read bandwidth. The drives connected to the loops have front-end Fibre Channel electronics and, forgetting FC-SATA drives for a moment, the drives themselves are fundamentally the same as SAS drives, given the same capacity and rotational speed. It turns out that SAS and Fibre drives, of the 300GB 15K RPM variety, perform pretty much the same for large sequential I/O. Given the bandwidth of the drives (400 MB/s per loop divided by roughly 85 MB/s per drive), the task of building a SAN-based system that isn’t loop-bottlenecked requires limiting the number of drives per loop to 5 (or 10 for mirroring overhead). So, to match the 14 GB/s of a single rack configuration of the HP Oracle Database Machine with a SAN solution would require about 35 back-end drive loops at 400 MB/s each! All of this math boils down to one thing: a very, very large high-end SAN array.
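
The loop math in the paragraph above can be sketched as follows. It is a rough illustration only, assuming the 400 MB/s loop limit and the ~85 MB/s per-drive rate quoted in this post; the helper names are hypothetical.

```python
import math

# Rough FCAL loop arithmetic using the figures quoted in the post.
LOOP_MB_S = 400        # read bandwidth limit of one 4 Gb FCAL back-end loop
DRIVE_MB_S = 85        # large sequential read rate of one 15K RPM drive
TARGET_MB_S = 14_000   # single-rack Database Machine scan rate (14 GB/s)


def drives_to_saturate_loop(loop_mb_s=LOOP_MB_S, drive_mb_s=DRIVE_MB_S):
    """How many streaming drives it takes to fill one loop (fractional)."""
    return loop_mb_s / drive_mb_s


def loops_needed(target_mb_s=TARGET_MB_S, loop_mb_s=LOOP_MB_S):
    """Minimum number of back-end loops required to deliver target_mb_s."""
    return math.ceil(target_mb_s / loop_mb_s)


if __name__ == "__main__":
    print("drives per loop before bottleneck:", round(drives_to_saturate_loop(), 1))
    print("back-end loops to match one rack:", loops_needed())
```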

Choices, Choices: Either the Largest SAN Array or the Smallest HP Oracle Database Machine
Only the largest of the high-end SAN arrays can match the base HP Oracle Database Machine I/O bandwidth. And this is provided the SAN array processors can actually pass through all the I/O generated from a full complement of back-end FCAL loops. Generally speaking, they just don’t have enough array processor bandwidth to do so.

Comparison to the “New Guys on the Block”
Well, they aren’t really that new. I’m talking about Netezza. Their smallest full rack has 112 Snippet Processing Units (SPU), each with a single SATA disk drive plus onboard processor and FPGA components, for a total user addressable space of 12.5 TB. If the data streamed off the SATA drives at, say, 70 MB/s, the solution offers 7.8 GB/s, 42% slower than a single-rack HP Oracle Database Machine.
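
For anyone who wants to check the comparison, here is a minimal sketch of the per-drive arithmetic. The 70 MB/s SATA rate is the post's own "say" figure. Treating the Exadata baseline as 168 drives at 80 MB/s is an assumption on my part: it is the per-drive figure a commenter quotes from this post, and it reproduces the 42% number, whereas the rounded 14 GB/s figure gives about 44%. Function names are hypothetical.

```python
# Per-drive bandwidth comparison; decimal units (1 GB = 1000 MB) assumed.

def aggregate_gb_s(drives, mb_s_per_drive):
    """Aggregate streaming bandwidth in GB/s for a set of identical drives."""
    return drives * mb_s_per_drive / 1000


def percent_slower(slow_gb_s, fast_gb_s):
    """How much slower slow_gb_s is than fast_gb_s, as a percentage."""
    return (fast_gb_s - slow_gb_s) / fast_gb_s * 100


netezza = aggregate_gb_s(112, 70)        # 112 SPUs, one SATA drive each
exadata_at_80 = aggregate_gb_s(168, 80)  # ASSUMPTION: 80 MB/s per drive
exadata_rounded = 14.0                   # the post's rounded rack figure

if __name__ == "__main__":
    print("Netezza GB/s:", netezza)
    print("slower vs 168 x 80 MB/s:", round(percent_slower(netezza, exadata_at_80)), "%")
    print("slower vs 14 GB/s:", round(percent_slower(netezza, exadata_rounded)), "%")
```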

Big, Efficient Pipes
Oracle Exadata Storage Server delivers I/O results directly into the address space of the Oracle Database Parallel Query Option processes using the Reliable Datagram Sockets (RDS) protocol over InfiniBand. As such, each of the Oracle Real Application Clusters nodes is able to ingest a little over a gigabyte of streaming data per second at a CPU cost of less than 5%, which is less than the typical cost of interfacing with Fibre Channel host-bus adaptors via traditional Unix/Linux I/O calls. With Oracle Exadata Storage Server, the Oracle Database host processing power is wasted neither on filtering out uninteresting data nor on plucking out columns from the rows. There would, of course, be no need to project in a column-oriented database, but Oracle Database is still row-oriented.

Oracle Exadata Storage Server is Intelligent Storage. Brainy Software.
Oracle Exadata Storage Server truly is an optimized way to stream data to Oracle Database. However, none of the traditional Oracle Database features (e.g., partitioning, indexing, compression, Backup/Restore, Disaster Protection, etc.) are lost when deploying Exadata. Combining data elimination (via partitioning) with compression further exploits the core architectural strengths of Exadata. But what about this intelligence? Well, as we all know, queries rarely reference all the columns, and few queries ever run without a WHERE predicate for filtration. With Exadata, that intelligence is offloaded to storage. Exadata Storage Cells execute intelligent software that understands how to perform filtration as well as column projection. For instance, consider a query that cites 2 columns nestled in the middle of a 100-column row, with a WHERE predicate that filters out 50% of the rows. With Exadata, that is exactly what is returned to the Oracle Parallel Query processes: just those 2 columns, from just the 50% of rows that qualify.
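
To make the 2-of-100-columns example concrete, here is a toy sketch of filtration plus projection done on the "storage side" of a scan. It is purely illustrative: the function names, the synthetic rows, and the predicate are all made up, and this is in no way how the Exadata cell software is implemented. It only shows how little data needs to travel once both operations happen before the rows are shipped.

```python
# Toy model of storage-side predicate filtering and column projection.
# Everything here (names, data, predicate) is hypothetical.

def make_rows(n_rows=1000, n_cols=100):
    """Build synthetic 100-column rows; column 0 acts as the filter column."""
    return [tuple(r * n_cols + c for c in range(n_cols)) for r in range(n_rows)]


def storage_side_scan(rows, wanted_columns, predicate):
    """Yield only the requested columns of the rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):                               # filtration
            yield tuple(row[c] for c in wanted_columns)  # projection


rows = make_rows()
# Two columns "nestled in the middle" of the row; the predicate keeps every
# other row, i.e. it filters out 50% of the rows.
result = list(
    storage_side_scan(
        rows,
        wanted_columns=(49, 50),
        predicate=lambda row: row[0] % 200 == 0,
    )
)

values_scanned = len(rows) * len(rows[0])
values_returned = sum(len(r) for r in result)
fraction_returned = values_returned / values_scanned

if __name__ == "__main__":
    print("rows returned:", len(result), "of", len(rows))
    print("fraction of column values shipped:", fraction_returned)
```

In this toy case only 1% of the column values scanned (2 of 100 columns, from half the rows) would leave storage; the host never sees the rest.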

By this time it should start to make sense why I have blogged in the past the way I have about SAN technology, such as this post about SAN disk/array bottlenecking. Configuring a high-bandwidth SAN requires a lot of care.

Yes, this is a very short, technically light blog entry about Oracle Exadata Storage Server, but this is day one. I didn’t touch on any of the other really exciting things Exadata does in the areas of I/O Resource Management, offloaded online backup and offloaded join filters, but I will.

37 Responses to “Oracle Exadata Storage Server. Part I.”


  1. 1 Doug Burns September 24, 2008 at 10:17 pm

    Nice work. Good that you can talk about it now and great to see a decent Keynote announcement

  2. 2 Alex Gorbachev September 24, 2008 at 10:37 pm

    Hurray! Kevin is back to blogging. And very good timing down to the minutes? Tell me how long were you waiting?

  3. 3 jarneil September 24, 2008 at 11:31 pm

    Maybe this is a future posting:

    “With Exadata that intelligence is offloaded to storage”

    Is that in the controller, or somewhere else? I mean is this modified hardware, or custom software?

    Can this secret sauce help with write speeds?

    Also, I realise the target market is HUGE data volumes, but Solid State devices are gaining traction. I’m supposing that there would be nothing, theoretically, to stop us seeing them in this one day.

    I mean you guys are selling one of the cells by themselves, not just the full database machine rack aimed at gazillions of terabytes?

    be good to be reading more of your postings in the coming weeks by the way! Welcome back from radio silence.

    jason.

    • 4 Richard Britton April 29, 2011 at 1:50 am

      Back in December we struggled a great deal just before go-live of our first Oracle database migrations to Exadata. The problem was in the seemingly unorganized field support for our 3 Exadata platforms. I posted a question on this site to see if any of other customers had similar experiences with their Exadata systems?
      The responses came back to me through various means and shortly after the post the darkness lifted, the updates flowed smoothly, and within a few days we were live…a little late and a bit more road-weary than expected, but nonetheless we were live.
      The crux of the problem was that we had not purchased ACS services – presuming we could successfully install Exadata updates on our own. We were wrong, we needed some experts and without ACS we weren’t prioritized high enough in the support queue. The minute we corrected the ACS problem, we turned the corner on the project.
      We are well supported by Oracle on our Exadata systems now, and I’m happy to report we are getting the 22x improvement we expected from our pre-purchase testing. We’re glad to be on the platform!

  4. 5 accidentalSQL September 25, 2008 at 12:02 am

    Very interesting design. Do you anticipate solid state storage being used in Exadata in the near future?

  5. 6 Arnoud Roth September 25, 2008 at 6:38 am

    This is so incredibly cool! I can’t wait to get my hands on one of these… When will they be available for purchase?
    One question: Am I right in assuming (given the fact that you’re talking about ProLiant DL Servers with Xeon processors) that the Exadata Storage Server is running on Oracle Enterprise Linux?
    Arnoud

  6. 7 dombrooks September 25, 2008 at 7:59 am

    So that’s what’s been keeping you busy.
    I don’t know what half of this means, but it sounds cool.

  7. 8 Maarten Vinkhuyzen September 25, 2008 at 9:03 am

    Kevin, you made me so curious with the silence on your blog that I stayed up last night to see the webcast of Larry’s keynote.

    I am not disappointed. Tuesday I had a meeting with the datacenter guys about how to get the Oracle I/O to the level I needed for my DWH queries. And the solution was promptly delivered. 🙂

    Hope you will be back to blogging frequently.
    I missed your insights these last months.

    regards

    Maarten

  8. 9 Bala September 25, 2008 at 12:17 pm

    I have been eagerly waiting to hear from you about this whole xtreme thing, here you are finally. Hopefully we will have more technical details soon.

    Thanks

  9. 10 Bruce September 25, 2008 at 1:50 pm

    I like your initial overview of the product, but I believe that you need to compare both Netezza and Exadata side by side in real-world scenarios to gauge their performance. You’re basing your performance assumptions purely on hardware optimizations. I’d like to see a comparison between Netezza with their compression engine against Oracle.

    Bruce

  10. 11 David Aldridge September 25, 2008 at 2:13 pm

    I know it’s artificial and all, but some TPC results would be interesting to see.

    Anyway, very exciting news.

  11. 12 Val September 25, 2008 at 2:41 pm

    1. Re. sequential IO:

    You write: “Exadata I/O is sequential, so scan rates of 80 MB/s per disk drive”

    At the disk level, IO may be truly sequential under rare circumstances only, e.g. a full table scan AND absence of any concurrent activity. If you have a concurrent non-sequential access, your full table scan won’t result in sequential disk IO either. Even with a concurrent full table scan, your disk head will be jumping to and fro albeit in a predictable manner thus making sequential access impossible.

    2. Re. comparison to Netezza.

    It’s a bit of apples to oranges, really. You assume 80MB/s per disk for Exadata and for some reason only 70MB/s per disk for Netezza. Also, you have 168 disks spinning in parallel on Exadata and 112 on Netezza. Had your assumptions been the same, sequential IO throughput per disk would be similar, at least theoretically.

    Other than that, it is a very interesting piece of information, thanks for sharing your thoughts.

    Would be interesting to see how the Exadata configuration might fare on TPC-H benchmarks.

  12. 13 Ofir September 25, 2008 at 4:04 pm

    Hi,
    just wanted to say that most of the questions are answered in the great 22-page whitepaper linked in the next post…
    and… Oh, this is so cool!
    🙂

  13. 14 Henry Poras September 25, 2008 at 4:16 pm

    Just wondering how the intelligent filtering will affect the buffer cache. If “Exadata Storage Cells … perform filtration as well as column projection[. … and] that is exactly what is returned to the Oracle Parallel Query processes”, the data block and dba wouldn’t be stored in the buffer cache. How far down the caching mechanism does this trickle?

    Henry

  14. 15 John Franklin September 26, 2008 at 12:06 am

    So, where is Polyserve CFS in all of this?

  15. 17 lscheng September 26, 2008 at 7:14 pm

    I wonder if there will be other appliances with other hardware vendors such as Dell/EMC or Sun?

  16. 18 David Aldridge September 26, 2008 at 7:47 pm

    “Just wondering how the intelligent filtering will affect the buffer cache. If “Exadata Storage Cells … perform filtration as well as column projection[. … and] that is exactly what is returned to the Oracle Parallel Query processes”, the data block and dba wouldn’t be stored in the buffer cache. How far down the caching mechanism does this trickle?”

    Parallel query bypasses the buffer cache (generally), so although that’s a good thought I think it’s not a real issue. The data that is sent to the PQ slaves is meant for their consumption only even in regular Oracle database architectures.

  17. 19 Krishna Manoharan September 27, 2008 at 9:34 pm

    Hi Kevin,

    Can you please let us know about the tactical aspects of exadata and the oracle database machine?

    1. Supportability – Oracle software support has always been spotty. Now with a combination of Oracle Linux, Oracle database and HP hardware, it is going to be interesting to see how it all comes together – especially upgrades, patches etc. How easy or difficult is it to maintain? Do we need to build specialized skills inhouse or is it hands-off like Teradata?

    2. Ease of use – Can I simply move an existing oracle warehouse instance to the new database machine and can use it day 1? How easy or difficult is it? Do I need to spend significant time like with a RAC instance – partitioning etc?

    3. The Exadata storage concept is excellent – more storage comes with additional CPU and Cache – Can we use it for non-oracle applications – such as Log processing etc?

    4. Why would I want to use Oracle rather than Teradata or Netezza which is proven?

    5. Backup using RMAN – RMAN backups are not really geared for big databases, so is there any other off host alternatives available?

    Thanks
    Krishna Manoharan

  18. 20 Jamie Church September 28, 2008 at 10:42 pm

    lscheng, YES there is a product from Sun that is “Intelligent”: it is the 4500 server. Two AMD chips and 48, that is right, 48 1TB drives inside. They have partnered with Greenplum for OVER a year now on this exact thing. Now that Oracle has said it is viable, it seems that everyone thinks it is a great idea all of a sudden?? I am confused. The truth is this: Oracle is awesome at OLTP, and I applaud them for that. They have been getting beat fair and square, however, in the large data warehouse BI game by Greenplum, Netezza and Teradata, plain and simple.

    http://www.greenplum.com/news/blogs/

    Decide for yourselves, I just don’t like Red Koolaid!

  19. 21 mark madsen September 30, 2008 at 9:32 am

    Kevin,

    How does this architecture deal with data distribution and redistribution? It seems like that’s still going to be a problem with joining data that isn’t distributed the same way. Does all the data then go back to the RAC?

    If I do a big query and sort, will that bottleneck one of the RAC nodes? Is temp space managed at the storage layer or on the RAC nodes?

    Mark

  20. 22 Sydney October 1, 2008 at 1:40 am

    Kevin;

    I agree with Val, I don’t believe you present an apples to apples comparison to Netezza. If you do the math, the outputs of both are essentially the same in an apples to apples comparison. I realize you are blogging from a marketing perspective, but after reading this, I almost feel this is another product that is a little-too-late-to-market, much like Oracle’s since defunct product: ‘Oracle PowerObjects’. Here’s hoping the launch succeeds but there is a lot of catching up to Teradata and Netezza before I swallow the kool-aid. Do enjoy the blog though. Keep up the good work.

  21. 23 kevinclosson October 1, 2008 at 2:17 am

    Sydney,

    I’m not blogging from a marketing perspective. I have an engineering perspective. So, please, tell us how my comparison to Netezza is not apples-apples on a per-drive basis.

  22. 24 lscheng October 2, 2008 at 9:35 am

    Jamie

    I meant if Oracle would partner with other hardware vendors and not only HP to release more appliances.

  23. 25 JEF October 2, 2008 at 1:29 pm

    I think Kevin did a great job explaining the Exadata architecture and did not exaggerate the comparison.
    I worked with both Teradata and Netezza, and I can tell that having Exadata will help me do near-real-time activities along with the mixed workload that I am struggling with on Teradata! The big issues are: 1- locking mechanism: 4 locks are all I can have at the row level, then it upgrades to table level; 2- table redistribution when joining tables that are distributed differently (pain in the neck); 3- not supporting new data types like XML, etc.; 4- not supporting new disk drives with more capacity that are not certified yet… I can go on with my complaints for a long time 😉

  24. 26 Brian Ganly October 28, 2008 at 2:01 pm

    Oracle comparisons with Netezza io streaming speeds are only concerned about how fast the data can be streamed from the storage back to the RAC system. This is in contrast with Netezza where the data is processed as it is streamed off the disks.

    Also the overhead of indexes etc in Oracle is fairly wasteful. I have experience of moving data occupying 2TB of space out of oracle into 800GB used in Netezza.

    I too am eager to see the results of a side by side comparison

  25. 27 maclean November 3, 2010 at 7:13 am

    This storage server has a strong bandwidth , but still too expensive!

  26. 28 Richard December 20, 2010 at 8:11 pm

    I can appreciate the incredible design of Exadata, as well as all the hype that began on this post last September. We bought 3 of Exadata systems at the end of August and this is a mixed blessing.

    The system is performing approximately 22x faster than our previous Solaris environment, and that exactly matches our test results. We are also seeing some improved performance by removing indices, which is remarkable.

    However, the patching process to go to Oracle 11gR2 v.11.2.2.2.2 is equally disappointing. We are still in the muck with problems from a RAID controller, to InfiniBand disconnects, to patches that either blow up on their own, or firmware that burns out the controller cards they are updating. That’s bad enough, but the real problem is the miserable support we are getting to fix these problems. We get the impression all the experts who invented Exadata have vanished. Unacceptable.

    In our more experienced opinion, Exadata is a great concept and design but until Oracle puts some brain-muscle behind their support it shouldn’t be considered for production purposes.



DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.