I thought a comment on one of my recent blog entries deserved handling in a blog entry. A reader posted:
Have you done any comparisons of the HP DL585 with an HP DL580? Is the DL580 a NUMA machine? Which one would you buy today for a RAC cluster?
I’ll answer these out of order. The DL580 is not a NUMA system. It stands to reason, though, that if HP continues the DL580 product line to the point where it bakes in the CSI interconnect, the DL580 would then be a NUMA system. So the short answer to whether a DL580 is a NUMA system is no, it is not. I think long answers are more fun.
In my series of posts about Oracle on NUMA I must have said this umpteen times, but I’ll say it again, concisely, in this post. I’m talking about what “NUMA-aware” software means. I routinely hear that Oracle is NUMA-aware. It is, and it isn’t. I say this because the degree of NUMA-awareness varies widely between hardware platforms and Oracle ports. I made the point in my recent post about Oracle Database 10g 10.2.0.4 that 10.2.0.4 contains NUMA-related fixes, and it does. That is not to say it is fully NUMA-aware, because it isn’t. The only question that matters, however, is whether it is sufficiently NUMA-aware for today’s NUMA systems, and I’d have to say that it is.
I’ll give a hint: No Linux Oracle release can be fully NUMA-aware until processes (e.g., shadow processes, PQO slaves, etc.) can quickly and cheaply detect which CPU they are currently executing on and prefer memory resources based on that locality. Way back in 1996 I was in Advanced Oracle Engineering at Sequent and we were in the late stages of producing the first commercial NUMA system. It was my early Oracle work on Sequent’s NUMA that begat the GETENGNO(3SEQ) API, which was an extremely inexpensive call for processes to check which CPU they were executing on.
Let’s fast forward to today. The Linux development folks are considering a Linux counterpart to Sequent’s GETENGNO() in the form of the vgetcpu() call. The problem is that the call is very, very slow compared to the 4-6 cycles that Sequent required to inform a process which CPU it was executing on. Nonetheless, the point is that until vgetcpu() works, and Oracle exploits it, the pinnacle of NUMA-awareness has not been reached. And while that may not matter given today’s AMD situation, it will certainly matter when Intel systems are NUMA (e.g., CSI-based). I guess I shouldn’t equate Linux NUMA with AMD, since IBM’s x3950 is a building block for large NUMA systems, and there are others as well. But I was focusing on commodity-level NUMA systems, which the x3950 most certainly is not.
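To make the idea of a process asking "which CPU am I on?" concrete, here is a minimal sketch in C. It assumes a Linux system with glibc, where the interface that grew out of the vgetcpu() work is exposed as sched_getcpu() and as the getcpu system call. Those names, and the two helper functions current_cpu() and current_cpu_and_node(), are my assumptions for illustration and are not taken from anything Oracle ships.

```c
#define _GNU_SOURCE
#include <sched.h>        /* sched_getcpu() */
#include <stddef.h>       /* NULL */
#include <sys/syscall.h>  /* SYS_getcpu */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: return the CPU the caller is running on,
 * or -1 on error. The answer is only a hint, because the scheduler
 * may migrate the process immediately after the call returns. */
int current_cpu(void)
{
    return sched_getcpu();
}

/* Hypothetical helper: report both the CPU and the NUMA node through
 * the raw getcpu system call. Returns 0 on success, -1 on error.
 * The third argument is unused by modern kernels, so pass NULL. */
int current_cpu_and_node(unsigned int *cpu, unsigned int *node)
{
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

A NUMA-aware process would call something like current_cpu_and_node() and then prefer buffers or latches that live on the reported node. How cheap the call is depends on the kernel and the platform, so whether it approaches the handful of cycles GETENGNO() needed is something to measure, not assume.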
There are a lot of factors in selecting hardware, but since I’m asked about DL585 vs DL580, I’d say DL580, so long as it is a DL580 G5. I have tested DL585 and DL580 side by side. However, that was a pretty old DL580 G3 (1066 MHz FSB). I see that the DL580 is now fit with the “Penryn” Xeons (e.g., 5460), which have a front-side bus speed of 1333 MHz. There are G5s that are fit with “Tigerton” Xeons, which have a 1066 MHz FSB. I’ve seen benchmark results that suggest there is some 21% to gain from going with a 5460-based G5 over a 7350-based G5. So, look closely at the specification. Also, I think a shrewd shopper would try to read the crystal ball to see when the DL580 G5 will be fit with the Xeon 5462, which has a 1600 MHz FSB. As always, with Oracle you want big pipes.