I’ve received questions about HugePages_Rsvd a few times in the last few months. After googling for HugePages_Rsvd +Oracle and not seeing a whole lot, I thought I’d put out this quick blog entry.
Here I have a system with a pool of 600 hugepages configured, none of them in use yet:
# cat /proc/meminfo | grep HugePages
HugePages_Total: 600
HugePages_Free: 600
HugePages_Rsvd: 0
Next, I boot up this 1.007GB SGA:
SQL*Plus: Release 11.1.0.6.0 - Production on Tue Jul 8 11:25:14 2008
Copyright (c) 1982, 2008, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 1081520128 bytes
Fixed Size 2166960 bytes
Variable Size 339742544 bytes
Database Buffers 734003200 bytes
Redo Buffers 5607424 bytes
Database mounted.
Database opened.
SQL>
Booting this SGA only used up 324 pages:
# cat /proc/meminfo | grep HugePages
HugePages_Total: 600
HugePages_Free: 276
HugePages_Rsvd: 195
If my buffers are 700 MB and my variable SGA component is 324 MB, why weren’t 512 hugepages used? Let’s see what happens when I start using some buffers and library cache. I’ll run catalog.sql and catproc.sql and then check hugepages again:
# cat /proc/meminfo | grep HugePages
HugePages_Total: 600
HugePages_Free: 237
HugePages_Rsvd: 156
That used up another 39 hugepages, or 78 MB. At this point my SGA usage still leaves about 305 MB of unbacked virtual memory. Those untouched pages are what HugePages_Rsvd is counting: the kernel has promised them to the shared memory segment but hasn’t actually allocated them yet. If I were to run some OLTP, the rest would get allocated. The idea here is that it makes no sense to go to all the trouble of allocation in VM land until the pages are actually touched, since they might never be used. Think about an errant program that allocates a sizable amount of hugepages just to rapidly die. While that’s not Oracle, the Linux guys have to keep a pretty general-purpose mindset.
This really goes back to the olden days of Unix when folks argued the virtues of pre-allocating swap to ensure there would never be a condition where a swap-out couldn’t be satisfied. The problem with that approach was that, before calls like vfork() became popular, there was a ton of overhead on large systems just to retire VM resources of very short-lived processes, such as those which fork() only to immediately exec().
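For anyone who wants to redo the arithmetic on their own box, here is a rough Python sketch of the accounting behind the outputs above. It assumes 2 MB hugepages (check Hugepagesize in /proc/meminfo on your system) and relies on the fact that HugePages_Free still includes pages that are reserved but not yet touched. The function names and the summary keys are just made up for illustration.

```python
import re

# Assumption: 2 MB huge pages (typical on x86-64); verify against Hugepagesize.
HUGEPAGE_MB = 2

def parse_hugepages(meminfo_text):
    """Collect the HugePages_* counters from /proc/meminfo-style text."""
    pairs = re.findall(r"HugePages_(\w+):\s*(\d+)", meminfo_text)
    return {name: int(value) for name, value in pairs}

def summarize(counters):
    """Derive touched, reserved, committed and available page counts."""
    total = counters["Total"]
    free = counters["Free"]
    # Older kernels (e.g. 2.6.9) do not report HugePages_Rsvd at all.
    rsvd = counters.get("Rsvd", 0)
    touched = total - free  # pages actually allocated (faulted in)
    return {
        "touched": touched,
        "touched_mb": touched * HUGEPAGE_MB,
        "reserved": rsvd,                # promised, not yet allocated
        "committed": touched + rsvd,     # everything spoken for
        "available": free - rsvd,        # what another segment could still get
    }

# The two snapshots taken in this post.
after_startup = "HugePages_Total: 600\nHugePages_Free: 276\nHugePages_Rsvd: 195\n"
after_catalog = "HugePages_Total: 600\nHugePages_Free: 237\nHugePages_Rsvd: 156\n"

s1 = summarize(parse_hugepages(after_startup))
s2 = summarize(parse_hugepages(after_catalog))
print(s1)
print(s2)
print("newly touched pages:", s2["touched"] - s1["touched"])
```

With the numbers from this post, the touched count goes from 324 to 363 pages while touched plus reserved stays at 519 in both snapshots. In other words, pages simply move from the reserved column to the allocated column as the SGA gets used. To try it on a live system, pass the contents of /proc/meminfo to parse_hugepages() instead of the sample strings.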
OK, so that was a light-reading blog entry, but some googler, someday, might find it interesting.
Yes, that was a come-on title…so surprising, isn’t it?
Hi Kevin,
Is this dependent on the Linux kernel version?
[jason@lr1 ~]$ uname -r
2.6.9-22.ELsmp
[jason@lr1 ~]$ cat /proc/meminfo | grep HugePages
HugePages_Total: 5000
HugePages_Free: 504
While on a higher kernel version
[jason@centos5build ~]$ uname -r
2.6.18-53.el5
[jason@centos5build ~]$ cat /proc/meminfo | grep HugePages
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
The behaviour I see with the 2.6.9 kernel is that all memory for Oracle is allocated from the hugepages at startup.
I think it’s Jonathan Lewis who says, when searching for information on the internet, “if it does not contain a version don’t believe it, if it is a different version from what you are running, don’t believe it” (I probably paraphrase there somewhat).
cheers,
jason.
Hi Jason,
Yes, I haven’t touched a 2.6.9 kernel in nearly a year so I spaced out on that one. Jonathan’s words are surely gold, but in this case I should think it’s a bit obvious that this couldn’t be a 2.6.9-related topic, since that kernel doesn’t report HugePages_Rsvd at all.
So, yes, prior to the very existence of the HugePages_Rsvd approach (e.g., 2.6.9) all hugepages are fully allocated upon successful completion of the shmget() call.
Thanks for pointing it out.
Jason, btw, in your 2.6.18 output you don’t have any hugepages configured at all (HugePages_Total is 0).
Kevin, just a minor addition which the “random googler” could find useful: if you definitely want to allocate all pages during instance startup, you could set pre_page_sga=true.
I haven’t tested this with Linux hugepages, though, but it ought to work.
Tanel,
Thanks for stopping by. Remember that pre_page_sga is a misnomer. It should be pre_page_buffer_cache…you’ll still be leaving the shared pool out of the mix.