Oracle Switches To Columnar Store Technology With Oracle Database 11g Release 2?

I see fellow OakTable Network member Tanel Poder has blogged that Oracle Database 11g Release 2 has switched to offer columnar store technology. Or at least one could infer that from the post.

I left a comment on Tanel’s blog but would like to make a quick entry here on the topic as well. Oracle Database 11g Release 2 does not offer column-store technology as thought of in the Vertica (for example) sense. The technology, available only with Exadata storage, is called Hybrid Columnar Compression. The word hybrid is important.

Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.

So, “hybrid” is the word. But, none of that matters as much as the effectiveness. This form of compression is extremely effective.

32 Responses to “Oracle Switches To Columnar Store Technology With Oracle Database 11g Release 2?”


  1. 1 Andrew Gregovich September 2, 2009 at 4:24 am

    Kevin

    Is the restriction of HCC to Exadata due to technical reasons or is it a marketing ploy? if it’s the latter, seems like a bit of a gamble to me…

    Regards

    Andrew

  2. 3 Dani Rey September 2, 2009 at 7:02 am

    Kevin,

    the marketing guys at Oracle did exactly what you never liked in marketing guys (of other companies). They claim: DWH query performance is 10-fold with Exadata Storage Server and Database Machine. Full stop. 😉

    You can see it on http://www.oracledatabase11g.com/Main/Questionnaire/EducationQuiz.html in the first question

    Nevertheless, we all know that Oracle Exadata Storage Server is 485x Faster Than…Oracle Exadata Storage Server.

    Regards
    Dani

  3. 4 Tanel Poder September 2, 2009 at 1:33 pm

    Hi Kevin,

    Thanks for the correction! However the new features overview doc is somewhat misleading in this case..

    “Hybrid columnar compression is a new method for organizing how data is stored. Instead of storing the data in traditional rows, the data is grouped, ordered and stored one column at a time.”

    • 5 kevinclosson September 2, 2009 at 3:09 pm

      The Exadata documentation is much more clear. I don’t think anything is misleading. It’s just not perfectly clear. I see no problem with that though because HCC is not generically available as it is specific to Exadata. To that end, the Exadata documentation is the best place to go into the topic deeply. That’s just my opinion.

  4. 6 Darryl September 2, 2009 at 3:32 pm

    The “switch” to column storage is eluded to at the Structured Data bloag as well…

    http://structureddata.org/2009/09/01/oracle-database-11g-release-2-is-now-generally-available/

  5. 9 Darryl September 3, 2009 at 3:00 pm

    Gents,
    I wasn’t inferring any disagreement. I was just pointing out this topic was mentioned on another blog.
    Thanks

  6. 10 Daniel Abadi September 3, 2009 at 3:10 pm

    Kevin,

    I’m quite confused by your post. You say:

    “Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.”

    What is “they” referring to in “they are stored in an object…”? Is it rows? If so, then what does it mean that “like values” are stored in the compression unit? “Like values” would seem to indicate values from the same column.

    In short, it is unclear from your post if a compression unit contains multiple rows, but each column is stored separately inside the compression unit, or if it is just a bunch of data from one particular column.

    Cheers,
    Daniel Abadi
    Assistant Professor, Yale University

    • 11 kevinclosson September 3, 2009 at 4:13 pm

      Hi Daniel,

      The hybrid columnar compression feature is documented in the Exadata Storage Server product documentation. However, I’m not permitted to expose the contents of that documentation here. My main objective in following on to my friend Tanel Poder’s post was to ensure folks don’t run off with the idea that Oracle Database 11g Release 2 for Exadata is a columnar store database.

      I can answer some of your questions safely though. A compression unit is a collection of data blocks. Multiple rows are stored in a compression unit. Columns are stored separately within the compression unit. Likeness amongst the column values within a compression unit yields the space savings. There are still rowids (that change when a row is updated by the way) and row locks. It is a hybrid.

      So, in short, no a compression unit does not have values from only one column.

      I apologize for the confusing information. There are times I find myself trying to make bricks without straw as it were.

  7. 12 Daniel Abadi September 3, 2009 at 4:24 pm

    Thanks Kevin, that’s what I wanted to know.

    I’m also curious if Exadata performs selection predicates directly on the HCC data, or if it is decompressed into regular tuples before the predicate is evaluated. However, if this is restricted information, I understand.

    Cheers,
    Daniel Abadi

    • 13 kevinclosson September 3, 2009 at 4:39 pm

      Exadata performs filtration and projection (Smart Scan) on compressed data in Oracle Database 11g Release 2 just as it did in 11gR1. With the columnar compression, Smart Scans send uncompressed data to the database grid though.

  8. 14 Ganesh September 3, 2009 at 6:31 pm

    Hello:

    Just one clarification..

    So with HCC, not only will we have compression (and grouping) within the blocks, also there will be compression units that will have another grouping of
    data based on the *likeness* of the column values among blocks?

    Can you please explain?

    Thanks
    Ganesh

    • 15 kevinclosson September 3, 2009 at 6:39 pm

      Well, “we” of course means Exadata customers. I do understand the curiosity. Think of the blocks in a compression unit as buckets for column values. Highly compressible columns wont take up much space in the “buckets” but non-compressible columns will.

  9. 16 Daniel Abadi September 3, 2009 at 11:02 pm

    Kevin,

    Thanks for your responses. The information you shared was enough for me to take a stab at placing the Oracle scheme in my taxonomy of hybrid row/column storage schemes. Please see: http://dbmsmusings.blogspot.com/2009/09/tour-through-hybrid-columnrow-oriented.html

    If you disagree with anything I said, please let me know.

    Cheers,
    Daniel Abadi

  10. 17 Sebastián September 4, 2009 at 4:23 pm

    Why have you compared with Vertica instead of Sybase IQ? As far as I know Sybase IQ is much mature and also there are many great references!

    • 18 kevinclosson September 4, 2009 at 4:37 pm

      Sebastian,

      It was a contrast, not a comparison and I’m not here to prop one or the other of those two companies. I’m not on either of their teams so I don’t care which of those win their piece of the pie.

  11. 19 Rajan January 21, 2010 at 12:08 am

    Kevin,

    What is the size of the compression unit ? Is it the size of a physical I/O typically 1 MB ?

    Thanks
    Rajan

  12. 21 Gaurav Mittal September 3, 2012 at 10:14 am

    Column Store is getting very critical in DWH environments and its all about comressing data. Please have a look on the below article

    [ link removed ]

    • 22 kevinclosson September 3, 2012 at 11:27 am

      @Guarav : Columnar orientation is not “getting critical” it is critical. I’m in R&D with EMC’s Greenplum I do indeed know the value of columnar vis a vis compression efficacy. Note, I removed the link to your post. If you’d like feel free to send a trackback.


  1. 1 Oracle 11gR2 has been released – and with column oriented storage option | Tanel Poder's blog: Core IT for Geeks and Pros Trackback on September 2, 2009 at 11:48 am
  2. 2 The future of the database is… plaid? — Too much information Trackback on September 2, 2009 at 5:16 pm
  3. 3 Oracle Exadata Hybrid Columnar Compression | DBMS2 -- DataBase Management System Services Trackback on September 3, 2009 at 9:33 am
  4. 4 Thoughts on the New DW Features in Oracle Database 11gR2 | Oracle Trackback on September 9, 2009 at 7:12 am
  5. 5 Kerry Osborne’s Oracle Blog » Blog Archive 11gR2 - My Top Ten List of Interesting New Features - Kerry Osborne’s Oracle Blog Trackback on October 22, 2009 at 12:05 am
  6. 6 Columnar compression vs. column storage | DBMS 2 : DataBase Management System Services Trackback on February 6, 2011 at 9:24 am
  7. 7 列压缩与列存储 Trackback on February 8, 2011 at 8:08 am
  8. 8 Teradata Database 14 « CarlosAL Trackback on October 10, 2011 at 4:52 pm
  9. 9 Exadata Hybrid Columnar Compression (HCC) for (storage) dummies « Dirty Cache Trackback on August 3, 2012 at 3:54 am
  10. 10 Exadata Hybrid Columnar Compression | Saurabh K. Gupta's Oracle Blog Trackback on May 4, 2013 at 12:33 pm

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.




DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 743 other subscribers
Oracle ACE Program Status

Click It

website metrics

Fond Memories

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.