I know Nothing About Data Warehouse Appliances and Now, So Won’t You – Part V. Why GreenPlum is Better Than Oracle Exadata Storage Server.

October 9, 2008. BLOG CORRECTION: I had a bizarre miscalculation in the table at the bottom. I compared a net scan rate for GreenPlum to a gross scan rate of Exadata. The GreenPlum deficit should have read 4.6X versus 2.3X.

NOTE: My other installments in this thread are indexed at the following link: Index of DW/BI Musings

In my recent blog post Oracle Exadata Storage Server. Part I, a reader asked:

I wonder if there will be another Appliances with other hardware vendors such as Dell/EMC or SUN?

I’m fairly certain the reader wondered whether the Oracle Exadata Storage Server would be made available on platforms other than HP. However, another reader presumed the question was whether Oracle Exadata Storage Server is the only appliance on the market and replied as follows:

[...] there is a product from Sun that is “Intelligent” it is the 4500 server. Two AMD chips and 48, that is right 48 1TB drives inside. They have partnered with GreenPlum for OVER a year now with this exact thing. Now that Oracle has said it is viable seems that everyone thinks it is a great idea all the sudden?? I am confused. What is the truth is this, Oracle is awesome at OLTP they are I applaud them for that. They have been getting beat fair and square however in the large data warehouse BI game by GreenPlum, Netezza and Terradata plain and simple.

GreenPlum

For those who don’t know, GreenPlum is a shared-nothing approach to data warehousing. The solution is based on the Sun Fire X4500 platform, affectionately referred to as “Thumper.” I know Thumper, I’ve used Thumper, and I can state safely that for a lot of purposes Thumper is a good system, most notably due to its density. You see, Sun Fire X4500 has two (underpowered by current standards) AMD Opteron 290 dual-core CPUs on HyperTransport 2.0 and, according to Sun’s own measurements, Sun Fire X4500:

Achieved up to 1 GBps from disks to network and up to 2 GBps from disk to memory

I’m sure there are a few GreenPlum customers deployed on Sun Fire X4500, but current deployments would be based on the Sun Fire X4540 platform which sports two quad-core Socket-F “Barcelona” processors. The Sun Fire X4540 site says the following about the throughput capabilities of the Sun Fire X4540:

Highest throughput rates (2.0GB/s from disks to network, 3.0GB/s to memory)

Cramming 10lbs of Rocks into a 5lb bag

Well, first off, I seriously doubt the 1 GB/s throughput number from disk to network cited for the X4500, because to get 1 GB/s from disk to network you have to read 1 GB/s and copy 1 GB/s to the network send buffers, which is a hefty task for HyperTransport 2.0. Nonetheless, the Sun Fire X4540 webpage states that the Socket-F 2356 CPU stuffed into the same HyperTransport 2.0 somehow doubles bandwidth from disk to network, as quoted above. I don’t believe that number either, but I’m not blogging about the fact that you’d have to rub my nose in solid, scratch-and-sniff evidence of HT 2.0 sustaining 2 GB/s from disk to network before I’d believe the numbers. No, I’ll let them have 2 GB/s, no problem. It all starts with scan throughput and I’ll even be nice enough to let them claim 3 GB/s from disk to memory on HT 2.0.

Who Cares About These Speed and Feed Numbers Anyway?

Oracle Exadata Storage Server is the product of an exercise that goes beyond software. Oracle Exadata Storage Server is a balanced system. Running an instance of a shared-nothing RDBMS server (à la GreenPlum with PostgreSQL), or even Oracle Exadata Storage Server for that matter, in a system like the Sun Fire X4540 results in a horribly unbalanced solution. Allow me to explain.

The following table shows a comparison of Exadata versus GreenPlum focusing on how long it takes to scan the net storage space (gross / 2 for mirroring), using the advertised scan throughput. Since both technologies perform filtration and other database intelligence at the “storage level”, I’ll only focus on disk to memory and then assume (safely in the Exadata case but dubiously in the GreenPlum case) that there is still bandwidth to sustain the memory accesses required to filter the data.

Technology               | User Addressable Storage (TB) | Advertised Throughput Disk->Memory (GB/s) | Net Storage Scan Time (Seconds)
GreenPlum Sun Fire X4540 | 24                            | 3                                         | 8192
Exadata DL180 SAS Option | 1.8                           | 1                                         | 1800

As I said, I really am being entirely too generous allowing Sun to claim they can read 3 GB/s from disk into memory with HT 2.0. Nonetheless, even with that inflated number, GreenPlum starts out at a 4.6X deficit at the fundamental level for this sort of a solution when compared to Oracle Exadata Storage Server.
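The arithmetic behind the table and the 4.6X figure can be sketched in a few lines of Python. This is only an illustration: the capacities and throughputs come from the table, while the function name and the GB-per-TB factors are assumptions chosen to reproduce the published scan times (the GreenPlum row appears to use 1024 GB per TB and the Exadata row 1000). The 3.6 TB gross Exadata figure is simply the 1.8 TB net figure doubled to undo the mirroring.

```python
def scan_seconds(capacity_tb, gb_per_sec, gb_per_tb):
    """Seconds needed to read capacity_tb terabytes at gb_per_sec."""
    return capacity_tb * gb_per_tb / gb_per_sec

# Net (mirrored) capacity and advertised disk-to-memory rate, from the table.
# The GB-per-TB factors are assumptions that reproduce the table's scan times.
greenplum_net = scan_seconds(24, 3, 1024)
exadata_net = scan_seconds(1.8, 1, 1000)

# Gross Exadata capacity (2 x net), the figure the October 9 correction
# says was mistakenly compared against GreenPlum's net scan time.
exadata_gross = scan_seconds(3.6, 1, 1000)

deficit_net = greenplum_net / exadata_net          # net versus net
deficit_mixed = greenplum_net / exadata_gross      # net versus gross

# Same comparison with one unit convention (1000 GB per TB) for both rows.
deficit_consistent = scan_seconds(24, 3, 1000) / scan_seconds(1.8, 1, 1000)

print(f"GreenPlum net scan: {greenplum_net:.0f} s")
print(f"Exadata net scan:   {exadata_net:.0f} s")
print(f"Deficit, net vs net:   {deficit_net:.1f}X")
print(f"Deficit, net vs gross: {deficit_mixed:.1f}X")
print(f"Deficit, consistent units: {deficit_consistent:.1f}X")
```

Under these assumptions the net-versus-net ratio rounds to 4.6X and the net-versus-gross ratio to 2.3X, matching the correction note at the top of the post. With a single unit convention for both rows the ratio comes out closer to 4.4X, which does not change the conclusion.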

I’ve never liked unbalanced systems.

8 Responses to “I know Nothing About Data Warehouse Appliances and Now, So Won’t You – Part V. Why GreenPlum is Better Than Oracle Exadata Storage Server.”


  1. 1 Anonymous October 6, 2008 at 3:11 am

    Is Oracle now a hardware company?

    Are you saying that Oracle/Exadata is a better database because it has better hardware?

    Let’s get real: Oracle is limping toward shared nothing because they have to. Oracle/Exadata is weak, not as weak as say “GridSQL”, but it’s weak.

    Now that Oracle is finally admitting they need different technology to compete in DW, they’ve got years of catching up to do.

  2. 2 kevinclosson October 6, 2008 at 3:39 am

    Anonymous,

    No, I am saying, specifically, that Oracle Exadata Storage Server is a balanced system and PostgreSQL on Sun Fire X4540 is not. My blog post was not targeting the emotions of my readers…more of a left-brain appeal, if you will.

  3. 3 Alex October 6, 2008 at 2:18 pm

    Hi Anonymous
    “Let’s get real: Oracle is limping toward shared nothing because they have to. Oracle/Exadata is weak, not as weak as say “GridSQL”, but it’s weak.”

    Can you pls provide any technical reasons why exactly Exadata is “weak” ?

  4. 4 Mark Callaghan October 6, 2008 at 2:47 pm

    Why has the comparison been limited to sequential IO performance? A Thumper with 48 disks should do much better for random IO throughput than 1 Exadata DL-180 with many fewer SAS disks. But then you might be able to buy several Exadata DL-180 servers for the cost of 1 loaded Thumper, so a comparison might need to consider that as well. Unless Oracle with Exadata is limited to hash join, random IO performance is still a big deal.

  5. 5 Matt October 9, 2008 at 2:12 am

    1-Offs there a plenty available.

    Balance is the key to scale.

  6. 6 kevinclosson October 9, 2008 at 3:24 am

    Matt,

    Thanks for stopping by, but your comment reads a little like Yoda (Star Wars).

  7. 7 Dmitry Potapov October 10, 2008 at 5:14 pm

    Re Mark Callaghan’s point about needing
    more disks for better random IOPS,
    that was true in the past, but thats not so
    anymore. Check out Intel X25-M and X25-E,
    and do the math.

    thanks
    Dmitry


  1. 1 Oracle Exadata Storage Server and Oracle Database Machine Related Posts « Everything Exadata Trackback on February 23, 2009 at 9:02 pm

Disclaimer

The views expressed on this blog are my own and do not reflect the views of Oracle Corporation. The views and opinions expressed by visitors on this blog are theirs, not mine.
All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.