Seven Fundamentals Everyone Should Know About Exadata

I speak to a lot of customers, prospects and co-workers about Exadata. Even though Exadata has been in production for two years, I still do not presume that everyone has a grasp of some of its more important fundamentals. I am routinely asked how very large SGA buffering can enhance Exadata Smart Scan, or how Storage Indexes might improve OLTP workloads, and other such non sequiturs.

There are a lot of sessions about Exadata being offered at Oracle OpenWorld 2010 and for good reason. Exadata is exciting technology! It dawns on me, however, that a few words explaining some of the more fundamental aspects of Exadata might help folks absorb more of what they are hearing in the sessions they attend next week.

I consider the following seven terms and definitions utterly important for folks to know before sitting through an Exadata presentation. In fact, there may even be some sessions offered by presenters who could also benefit from the following 242 words?

Cell Offload Processing.
- Work performed by the Storage Servers that would otherwise have to be executed in the database grid. Includes functionality like Smart Scan, datafile initialization, RMAN offload, Hybrid Columnar Compression (HCC) decompression.
Smart Scan.
- Most relevant Cell Offload Processing for improving Data Warehouse / Business Intelligence query performance. Smart Scan is the agent for offloading filtration, projection, Storage Index exploitation and HCC decompression.
Full Scan or Index Fast Full Scan.
- The required access method chosen by the query optimizer in order to trigger a Smart Scan.
Direct Path Reads.
- Required buffering model for a Smart Scan. The flow of data from a Smart Scan cannot be buffered in the SGA buffer pool. Direct path reads can be performed for both serial and parallel queries. Direct path reads are buffered in process PGA (heap).
Result Set.
- Data returned by the SQL processing layer. The SQL processing layer is in the Oracle Database. The data flowing from a Smart Scan is not a result set.
Exadata Smart Flash Cache.
- Flash Cache in each of the Storage Servers. Not to be confused with Database Flash Cache which is Flash in the database grid and not compatible with Exadata. Smart Scan aggressively scans both HDD and Flash media concurrently. When data is present in the flash cache, scan rates of 50 GB/s on Exadata Version 2 hardware are the norm for full rack configurations. Maximum theoretical scan rates (a.k.a., datasheet scan rates) for Exadata are *only* possible for fully offloaded scans. A fully offloaded scan is generated by a SQL query that finds no rows. Blog Update: Please consider viewing the following 2-minute YouTube video with a demonstration of how complex SQL processing throttles Exadata Smart Scan to roughly 10% of maximum theoretical scan rates: http://www.youtube.com/watch?v=JuWVjSp42yM
Storage Index.
- Dynamic, in-memory indexes. The role of Storage Index technology is not to aid in locating data faster but instead to eliminate I/O. With Storage Indexes the Exadata Storage Server software can determine whether or not a given storage region contains rows relevant to the query and decide to not read the storage region. Storage Indexes are only examined during a Smart Scan.
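
To make the I/O-elimination idea concrete, here is a toy Python sketch. It is not Exadata code and does not describe how the Storage Server software is actually built. It assumes each storage region carries an in-memory min/max summary per column, and that a scan can skip any region whose summary cannot overlap the query predicate. The names `Region` and `smart_scan`, the sample rows and the integer-only columns are all hypothetical, chosen only for illustration. The same loop also shows filtration and projection happening on the storage side, so only matching rows and requested columns flow back.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Row = Dict[str, int]


@dataclass
class Region:
    """Hypothetical storage region: a non-empty batch of rows plus a summary."""
    rows: List[Row]
    # Assumed structure: per-column (min, max), built once and kept in memory.
    summary: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for col in self.rows[0]:
            values = [r[col] for r in self.rows]
            self.summary[col] = (min(values), max(values))


def smart_scan(regions: List[Region], column: str, low: int, high: int,
               projection: List[str]) -> Tuple[List[Row], int]:
    """Toy storage-side scan for: low <= column <= high.

    Returns the filtered, projected rows and the number of regions read.
    """
    out: List[Row] = []
    regions_read = 0
    for region in regions:
        region_min, region_max = region.summary[column]
        # Storage Index step: the summary proves no row can match, so the
        # region is never read. This eliminates I/O rather than speeding it up.
        if region_max < low or region_min > high:
            continue
        regions_read += 1
        for row in region.rows:
            if low <= row[column] <= high:                    # filtration
                out.append({c: row[c] for c in projection})   # projection
    return out, regions_read


regions = [
    Region([{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]),
    Region([{"id": 3, "amount": 500}, {"id": 4, "amount": 650}]),
    Region([{"id": 5, "amount": 30}, {"id": 6, "amount": 40}]),
]

rows, regions_read = smart_scan(regions, "amount", 400, 600, ["id"])
print(rows, regions_read)  # prints [{'id': 3}] 1
```

Only one of the three regions is read in this example. The other two are skipped on the strength of their summaries alone, which is the sense in which a Storage Index eliminates I/O instead of locating rows.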

I hope you’ll find this helpful.

13 Responses to “Seven Fundamentals Everyone Should Know About Exadata”

1 Syed Jaffar Hussain September 18, 2010 at 7:26 pm

Indeed these are very valuable points, at least to me as I am new to exadata.

2 jametong September 19, 2010 at 2:13 am

Can you show your blog posts in the RSS feed as full-text articles? Or can you just post a vote, as Jonathan Lewis does?

http://jonathanlewis.wordpress.com/2010/08/30/subscribers/

- 3 kevinclosson September 24, 2010 at 11:48 pm
  
  yep
  
4 jametong September 28, 2010 at 10:54 am

I got it. Thank you.

5 Daniel Buzatu November 10, 2010 at 4:43 pm

Kevin, thanks for the above. Could you clarify one point, please? All 7 concepts seem to revolve around smart scan, and smart scan is defined as “the most relevant offloading process for improving *DW/BI* query performance”.

Is there a good reason you specify “DW/BI query” or can “IO-intensive query” be substituted. Based on the descriptions of each technology, I would assume yes, but I’m just wondering if I’m missing something, especially in light of the history of Exadata as particularly relevant for DW-type processing.

Thanks!

- 6 kevinclosson November 10, 2010 at 5:33 pm
  
  Hi Daniel,
  
  Quite simple. At this time Offload Processing is not optimized for transactional workloads. Transactional workloads generally get rows by ROWID or do very short small table scans neither of which get a boost from offload processing.
  
  I do suppose I/O-intensive would be an acceptable substitution, so long as the access method is FULL and the buffering is direct (so, not scattered reads). Am I still clear as mud? 😦
  
  - 7 Daniel Buzatu November 11, 2010 at 4:49 pm
    
    Hi, Kevin, thanks – that makes sense. I’m not sure what magic I was hoping for, but I guess something like the optimizer, when realizing that lots of reads are gonna happen, starts a smart scan. I guess there is some hope with the FFS…
    
    - 8 kevinclosson November 11, 2010 at 5:30 pm
      
      Hi Daniel,
      
      Remember that the product of a smart scan cannot go into the SGA. People (everywhere!) routinely forget that fundamental concept. If you consider the SGA critical to your OLTP/ERP then put all that in perspective 🙂
      
9 Deepak Gupta October 28, 2011 at 7:28 pm

I would like to subscribe to your blog.

10 oraclebhola July 1, 2013 at 4:52 am

Hi Kevin,

You mentioned above:
“Smart Scan aggressively scans both HDD and Flash media concurrently”… but I think Smart Scan is anti-flash. :)

Smart Scan ignores flash and scans only disk. Only if the object is created or altered with CELL_FLASH_CACHE=KEEP will Smart Scan use both flash and disk.

Please correct me if I am wrong here. I know I am raising a question with someone who is a master of this. We are all still learning by reading your blog and your comments on others’ blogs. :)

I think there must be some type of RSS feed I can use for your blog and for the comments you leave on others’ blogs.

Regards,
Sunil Bhola

- 11 kevinclosson July 1, 2013 at 9:53 am
  
  Yes, one must KEEP the object to get max theoretical scan rates. If the scanned object doesn’t fit in the aggregate of flash cache then don’t KEEP it. When scanning only HDD (High Performance drives), full-rack scan throughput drops from 100 GB/s to 25 GB/s as per the datasheet.

1 Exadata « Oracle Scratchpad Trackback on September 18, 2010 at 7:36 am
2 OBIEE performance – get your database sweating « RNM Trackback on May 19, 2011 at 3:02 pm


Kevin Closson's Blog: Platforms, Databases and Storage