Archive Page 2

When Storage is REALLY Fast Even Zero-Second Wait Events are Top 5. Disk File Operations I/O: The Mystery Wait Event.

The SLOB code that is generally available here differs significantly from what I often test with in labs. Recently I was contorting SLOB to hammer an EMC XtremIO All-Flash Array in a rather interesting way. Those of you in the ranks of the hundreds of SLOB experts out there will notice two things quickly in the following AWR snippet:

1)   Physical single block reads are being timed by the Oracle wait interface at 601 microseconds (3604/5995141 == .000601) and this is, naturally for SLOB, the top wait event.

2)   Disk file operations I/O is ranking as a top 5 timed event. This is not typical for SLOB.



The 601us latencies for XtremIO are certainly no surprise. After all, this particular EMC storage array is an All-Flash Array so there’s no opportunity for latency to suffer as is the case with alternatives such as flash-cache approaches. So what is this blog post about? It’s about Disk file operations I/O.

I needed to refresh my memory on what the Disk file operations I/O event was all about. So, I naturally went to consult the Statistics Description documentation. Unfortunately there was no mention of the wait even there so I dug further to find it documented in the Description of Wait Events section of the Oracle Database 11g documentation which states:

This event is used to wait for disk file operations (for example, open, close, seek, and resize). It is also used for miscellaneous I/O operations such as block dumps and password file accesses.

Egad. A wait is a blocking system call. Since open(2)/close(2) and seek(2) are non-blocking on normal files I suppose I could have suffered a resize operation–but wait, this tablespace doesn’t allow autoextend.  I suppose I really shouldn’t care that much given the fact that the sum total of wait time was zero seconds. But I wanted to understand more so I sought information from the user community–a search that landed me happily at Kyle Hailey’s post on here. Kyle’s post had some scripts that looked promising for providing more information about these waits but unfortunately in my case the scripts returned no rows found.

So, at this point, I’ll have to say that the sole value of this blog post is to point out the fact that a) the Oracle documentation specifically covering statistics descriptions is not as complete as the Description of Wait Events section and b) the elusive Disk file operations I/O wait event remains, well, elusive and that this is now part I in a multi-part blog series until I learn more. I’ll set up some traces and see what’s going on. Perhaps Kyle will chime in.




Interesting SLOB Use Cases – Part I. Studying ZFS Fragmentation. Introducing Bart Sjerps.

This is the first installment in a series of posts I’m launching to share interesting use cases for SLOB. I have several installments teed up but to put a spin on things I’m going to hit two birds with one stone in this installment. The first bird I’ll hit is to introduce a friend and colleague, Bart Sjerps, who I just added to my blogroll. The other bird in my cross-hairs is this interesting post Bart wrote some time back that covers a study of ZFS fragmentation using SLOB.

Bart Sjerps on ZFS Fragmentation. A SLOB study.

As always, please visit the SLOB Resources Page for SLOB kit and documentation.


EMC XtremIO – The Full-Featured All-Flash Array. Interested In Oracle Performance? See The Whitepaper.

NOTE: There’s a link to the full article at the end of this post.

I recently submitted a manuscript to the EMC XtremIO Business Unit covering some compelling lab results from testing I concluded earlier this year. I hope you’ll find the paper interesting.

There is a link to the full paper at the bottom of this block post. I’ve pasted the executive summary here:

Executive Summary

Physical I/O patterns generated by Oracle Database workloads are well understood. The predictable nature of these I/O characteristics have historically enabled platform vendors to implement widely varying I/O acceleration technologies including prefetching, coalescing transfers, tiering, caching and even I/O elimination. However, the key presumption central to all of these acceleration technologies is that there is an identifiable active data set. While it is true that Oracle Database workloads generally settle on an active data set, the active data set for a workload is seldom static—it tends to move based on easily understood factors such as data aging or business workflow (e.g., “month-end processing”) and even the data source itself. Identifying the current active data set and keeping up with movement of the active data set is complex and time consuming due to variability in workloads, workload types, and number of workloads. Storage administrators constantly chase the performance hotspots caused by the active dataset.

All-Flash Arrays (AFAs) can completely eliminate the need to identify the active dataset because of the ability of flash to service any part of a larger data set equally. But not all AFAs are created equal.

Even though numerous AFAs have come to market, obtaining the best performance required by databases is challenging. The challenge isn’t just limited to performance. Modern storage arrays offer a wide variety of features such as deduplication, snapshots, clones, thin provisioning, and replication. These features are built on top of the underlying disk management engine, and are based on the same rules and limitations favoring sequential I/O. Simply substituting flash for hard drives won’t break these features, but neither will it enhance them.

EMC has developed a new class of enterprise data storage system, XtremIO flash array, which is based entirely on flash media. XtremIO’s approach was not simply to substitute flash in an existing storage controller design or software stack, but rather to engineer an entirely new array from the ground-up to unlock flash’s full performance potential and deliver array-based capabilities that are unprecedented in the context of current storage systems.

This paper will help the reader understand Oracle Database performance bottlenecks and how XtremIO AFAs can help address such bottlenecks with its unique capability to deal with constant variance in the I/O profile and load levels. We demonstrate that it takes a highly flash-optimized architecture to ensure the best Oracle Database user experience. Please read more:  Link to full paper from

Oracle Exadata Database Machine: Proving 160 Xeon E7 Cores Are As “Slow” As 128 Xeon E5 Cores?

Reading Data Sheets
If you are in a position of influence affecting technology adoption in your enterprise you likely spend a lot of time reading data sheets from vendors.  This is just a quick blog entry about something I simply haven’t taken the time to cover even though the topic at hand has always be a “problem.” Well, at least since the release of the Oracle Exadata Database Machine X2-8.

In the following references and screenshots you’ll see that Oracle cites 1.5 million flash read IOPS as an expected limit for both the full-rack Oracle Exadata Database Machine X3-2 and the Oracle Exadata Database Machine X3-8. All machines have limits and Exadata is no exception. Notice how I draw attention to the footnote that accompanies the flash read IOPS claim. Footnote number 3 says that both of these Exadata models are limited in flash read IOPS by the database host CPU. Let me repeat that last bit for anyone scrutinizing my words for reasons other than education: The Oracle Exadata Database Machine data sheets explicitly state flash read IOPS are limited by host CPU.

Oracle’s numbers in this case are SQL-driven from Oracle instances. I have no doubt these systems are both capable of achieving 1.5 million read IOPS from flash because, truth be told, that isn’t really all that many IOPS–especially when the IOPS throughput numbers are not accompanied by service times. In the 1990s it was all about “how much” but in modern times it’s about “how fast.” Bandwidth is an old, tired topic. Modern platforms are all about latency. Intel QPI put the problem of bandwidth to rest.

So, again, I don’t doubt the 1.5 million flash read IOPS citation. Exadata has a lot of flash cards and a lot of host processors to drive concurrent I/O. Indeed, with the concurrent processing capabilities of both of these Exadata models, Oracle would be able to achieve 1.5 million IOPS even if the service times were more in line with what one would expect with mechanical storage. Again, we never see service time citations so in actuality the 1.5 million number is just a representation of how much in-flight I/O the platform can handle.

Here is the new truth: IOPS is a storage bandwidth metric.

Host CPU Limited! How Many CPUs?
Here’s the stinger: Oracle blames host CPU for the 1.5 million flash read IOPS number. The problem with that is the X3-2 has 128 Xeon E5-2690 processor cores and the X3-8 has 160 Xeon E7-8870 processor cores. So what is Oracle’s real message here? Is it that the cores in the X3-8 are 20% slower than those in the X3-2 model? I don’t know. I can’t put words in Oracle’s mouth. However, if the data sheet is telling the truth then one of two things is true, either a) the E5-2690 processors are indeed 20% faster on a per-core basis than the E7-8870 or b) there is a processing asymmetry problem.

Not All CPU Bottlenecks Are Created Equal
Oracle would likely not be willing to dive into technical detail to the same level I do. Life is a series of choices–including who you chose to buy storage and platforms from. However, Oracle’s literature is clear about the number of active 40Gb QDR Infiniband ports there are in each configuration and this is where the asymmetry comes in. There are 8 active ports in both of these models. That means there are 8 streams of interrupt handling in both cases–regardless of how many cores there are in total.

As is the case with any networked storage, I recommend you monitor mpstat -P ALL output on database hosts to see whether there are cores nailed to the wall with interrupt processing at levels below total CPU-saturation.  Never settle for high-level aggregate CPU utilization monitoring. Instead, drill down to the per-core level to watch out for asymmetry. Doing so is just good platform scientist work.

Between now and the time you should find yourself in a proof of concept test situation with Exadata, don’t hesitate to ask Oracle why–by their own words–both 128 cores and 160 cores are equally saturated when delivering maximum read IOPS in the database grid. After all, they charge the same per core (list price) to license Oracle Database on either of those processors.

Nice and Concise?
By the way, is there anyone who actually believes that both of these platforms top out at precisely 1.5 million flash read IOPS?

Oracle Exadata Database Machine X3-2 Datasheet


Oracle Exadata Database Machine X3-8 Datasheet


DISCLAIMER: This post tackles citations straight from Oracle published data sheets and published literature.

SLOB 2 — A Significant Update. Links Are Here.

BLOG UPDATE: 2014.05.15:  The following link supersedes all other references to SLOB kit and patches. This will always be the up to date locale:

BLOG UPDATE 2013.12.26:   Quick link to download the kit

BLOG UPDATE 2012.05.05: Updated the tar archive distribution file with some bug fixes.  Simply preserve your slob.conf file and extract this tar archive over your prior SLOB install directory.

BLOG UPDATE 2012.05.04: The PDF README will no longer be bundled in with the tar archive. The README can be found here:  SLOB2 README.

BLOG UPDATE 2012.05.03: First time visitors should see the introductory page for SLOB.

About SLOB 2
I’ve already socialized the SLOB 2 update via twitter and a lot of friends have had early access to the kit. So, this is just a very brief blog entry to point to SLOB 2.

I’ve written a form of a release note that will be sufficient for current SLOB users to move forward rapidly with new SLOB 2 features. The note can be found here: SLOB 2 README or here.

Download The SLOB2 Kit
To download the software you can access the tar archive on EMC Syncplicity. Click SLOB 2 Tar Archive.

After downloading you should verify the md5sum:

$ md5sum 2013.05.05.slob2.tar
e1e67a68bf253a02532ebd556a2ea782  2013.05.05.slob2.tar

Announcing EMC WORLD 2013 Flash Related Sessions

Interested In EMC Flash Products Division Technology?
This is just a quick blog entry to announce sessions at EMC WORLD offered by speakers from EMC’s Flash Products Division.  The sessions I’m speaking at is the one about accelerating SQL Server and Oracle with EMC XtremSW Cache.



My First Words on Oracle’s SPARC T5 Processor — The World’s Fastest Microprocessor?

On March 26, 2013, Oracle announced a server refresh based on the new SPARC T5 processor[1].  The press release proclaims SPARC T5 is the “World’s Fastest Microprocessor”—an assertion backed up with a list of several recent benchmark results included a published TPC-C result.

This article focuses on the recent SPARC T5 TPC-C result–a single-system world record that demonstrated extreme throughput. The SPARC T5 result bested the prior non-clustered Oracle Database result by 69%! To be fair, that was 69% better than a server based an Intel Xeon E7 processor slated to be obsolete this year (with the release of Ivy Bridge-EX). Nonetheless, throughput is throughput and throughput is all that matters, isn’t it?

What Costs Is What Matters
There are several ways to license Oracle Database. Putting aside low-end user-count license models and database editions other than Enterprise Edition leaves the most common license model which is based on per-processor licensing.

To layman, and seasoned veteran alike, mastering Oracle licensing is a difficult task. In fact, Oracle goes so far as to publish a Software Investment Guide[2] that spells out the necessity for licensees to identify personnel within their organization responsible for coping with license compliance. Nonetheless, there are some simple licensing principles that play a significant role in understanding the relevance of any microprocessor being anointed the “fastest in the world.”

One would naturally presume “fastest” connotes cost savings when dealing with microprocessors.  Deploying faster processors usually should mean fewer are needed thus yielding cost savings spanning datacenter physical and environmental savings as well as reduced per-processor licensing.  Should, that is.

What is a Processor?
Oracle’s Software Investment Guide covers the various licensing models available to customers. Under the heading “Processor Metric” Oracle offers several situations where licensing by the processor is beneficial. The guide goes on to state:

The number of required licenses shall be determined by multiplying the total number of cores of the processor by a core processor licensing factor specified on the Oracle Processor Core Factor Table

As this quoted information suggests, the matter isn’t as simple as counting the number of processor “sockets” in a server. Oracle understands that more powerful processors allow their customers to achieve more throughput per core.  So, Oracle could stand to lose a lot of revenue if per-core software licensing did not factor in the different performance characteristics of modern processors. In short, Oracle is compelled to charge more for faster processors.

As the Software Investment Guide states, one must consult the Oracle Processor Core Factor Table[3] in order to determine list price for a specific processor. The Oracle Processor Core Factor Table has a two-columns—one for the processor make and model and the other for the Licensing Factor. Multiplying the Licensing Factor times the number of processor cores produces list price for Oracle software.

The Oracle Processor Core Factor Table is occasionally updated to reflect new processors that come into the marketplace. For example, the table was updated on October 2, 2010, September 6, 2011 and again on March 26, 2013 to correspond with the availability of Oracle’s T3, T4 and T5 processor respectively.  As per the table, the T3 processor was assigned a Licensing Factor of .25 whereas the T4 and T5 are recognized as being more powerful and thus assigned a .5 factor.  This means, of course, that any customer who migrated from T3 to T4 had to ante-up for higher-cost software—unless, of course, the T4 allowed the customer to reduce the number of cores in the deployment by 50%.

The World’s Fastest Microprocessor
According to dictionary definition, something that is deemed fast is a) characterized by quick motion, b) moving rapidly and/or c) taking a comparatively short time. None of these definitions imply throughput as we know it in the computer science world. In information processing, fast is all about latency whether service times for transactions or underlying processing associated with transactions such as memory latency.

The TPC-C specification stipulates that transaction response times are to be audited along with throughput. The most important transaction is, of course, New Order. That said, the response time of transactions on a multi-processing computer have little bearing on transaction throughput. This fact is clearly evident in published TPC-C results as will be revealed later in this article.

Figure 1 shows the New Order 90th-percentile response times for the three most recently published Oracle Database 11g TPC-C results[4]. Included in the chart is a depiction of Oracle’s SPARC T5 demonstrating an admirable 13% improvement in New Order response times compared to current[5] Intel two-socket Xeon server technology. That is somewhat fast. On the contrary, however, one year—to the day—before Oracle published the SPARC T5 result, Intel’s Xeon E7 processors exhibited 46% faster New Order response times than the SPARC T5. Now that, is fast.

This is a Caption

Figure 1: Comparing Oracle Database TPC-C Transaction Response Times. Various Platforms. Smaller is better.

Cost Is Still All That Matters
According to the Oracle Technology Global Price List dated March 15, 2013[6], Oracle Database Enterprise Edition with Real Application Clusters and Partitioning has a list price of USD $82,000 “per processor.” As explained above in this article, one must apply the processor core factor to get to the real list price for a given platform. It so happens that all three of the processors spoken of in Figure 1 have been assessed a core factor of .5 by Oracle. While all three of these processors are on par in the core factor category, they have have vastly different numbers of cores per socket. Moreover, the servers used in these three benchmarks had socket-counts ranging from 2 to 8. To that end, the SPARC T5 server had 128 cores, the Intel Xeon E7-8870 server had 80 cores and the Intel Xeon E5-2690 server had 16 cores.

Performance Per Oracle License
Given the core counts, license factor and throughput achieved for the three TPC-C benchmarks discussed in the previous section of this article, one can easily calculate the all-important performance-per-license attributes of each of the servers. Figure 2 presents TPC-C throughput per core and per Oracle license in a side-by-side visualization for the three recent TPC-C results.

Figure 2: Comparing Oracle Database TPC-C Performance per-core and per-license. Bigger is better.

Figure 2: Comparing Oracle Database TPC-C Performance per-core and per-license. Bigger is better.

The Importance of Response Times
In order to appreciate the rightful importance of response time in characterizing platform performance, consider the information presented in Figure 3. Figure 3 divides response time into TPC-C performance per core. Since the core factor is the same for each of these processors this is essentially weighing response time against license cost.

To add some historical perspective, Figure 3 also includes an Oracle Database 11g published TPC-C result[7] from June 2008 using Intel’s Xeon 5400 family of processors which produced 20,271 TpmC/core and .2 seconds New Order response times. It is important to point out that the core factor has always been .5 for Xeon processors. As Figure 3 shows, SPARC T5 outperforms the 2008-era result by about 35%. On the other hand, the Intel two-socket Xeon E5 result delivers 31% better results in this type of performance assessment. Finally, the Intel 8-socket Xeon E7 result outperformed SPARC T5 by 76%. If customers care about both response time and cost these are important data points.

Figure 3: Performance Per Core weighted by Transaction Response Times. Bigger Is Better.

Figure 3: Performance Per Core weighted by Transaction Response Times. Bigger Is Better.

Parting Thoughts
I accept the fact that there are many reasons for Oracle customers to remain on SPARC/Solaris—the most significant being postponing the effort of migrating to Intel-based servers. I would argue, however, that such a decision amounts to postponing the inevitable. That is my opinion, true, but countless Oracle shops made that move during the decade-long decline of Sun Microsystems market share. In fact, Oracle strongly marketed Intel servers running Real Application Clusters attached to conventional storage (mostly sourced from EMC) as a viable alternative to Oracle on Solaris.

I don’t speak lightly of the difficulty in moving off of SPARC/Solaris. In fact, I am very sympathetic of the difficulty such a task entails. What I can’t detail, in this blog entry, is a comparison between re-platforming from dilapidated SPARC servers and storage to something 21st-century—such as a converged infrastructure platform like VCE.  It all seems like a pay-now or pay-later situation to me. Maybe readers with a 5-year vision for their datacenter can detail for us why one would want to gamble on the SPARC roadmap.

[4] Oracle SPARC T5 3/26/2013, Intel Xeon E5-2690, Intel Xeon E7-8870

[5] As of the production date of this article, 2013 is the release target for the Ivy Bridge-EP 22nm die shrink next-generation improvement in Intel’s Xeon E5 family

EMC Employee Disclaimer

The opinions and interests expressed on EMC employee blogs are the employees' own and do not necessarily represent EMC's positions, strategies or views. EMC makes no representation or warranties about employee blogs or the accuracy or reliability of such blogs. When you access employee blogs, even though they may contain the EMC logo and content regarding EMC products and services, employee blogs are independent of EMC and EMC does not control their content or operation. In addition, a link to a blog does not mean that EMC endorses that blog or has responsibility for its content or use.

This disclaimer was put into place on March 23, 2011.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 2,023 other followers

Oracle ACE Program Status

Click It

website metrics

Fond Memories


All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2013. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.


Get every new post delivered to your Inbox.

Join 2,023 other followers