Oracle Database Doesn’t Use Hugepages Correctly. What’s Better, Reserved or Used?

I’ve received questions about HugePages_Rsvd a few times in the last few months. After googling for HugePages_Rsvd +Oracle and not seeing a whole lot, I thought I’d put out this quick blog entry.

Here I have a system with 600 hugepages configured, all of them free and none reserved:

# cat /proc/meminfo | grep HugePages
HugePages_Total: 600
HugePages_Free: 600
HugePages_Rsvd: 0

Next, I boot up this 1.007GB SGA:

SQL*Plus: Release 11.1.0.6.0 - Production on Tue Jul 8 11:25:14 2008

Copyright (c) 1982, 2008, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1081520128 bytes
Fixed Size                  2166960 bytes
Variable Size             339742544 bytes
Database Buffers          734003200 bytes
Redo Buffers                5607424 bytes
Database mounted.
Database opened.
SQL>

Booting this SGA only used up 324 pages:

#  cat /proc/meminfo | grep HugePages
HugePages_Total:   600
HugePages_Free:    276
HugePages_Rsvd:    195
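
The 324 figure is simply HugePages_Total minus HugePages_Free. As a minimal sketch of that arithmetic (the helper names parse_hugepages and pages_faulted_in are made up for illustration, and the sample text is the output pasted above), something like this would pull the counters out of /proc/meminfo-style text:

```python
import re

def parse_hugepages(meminfo_text):
    """Return the HugePages_* counters found in /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"\s*(HugePages_\w+):\s*(\d+)", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def pages_faulted_in(counters):
    """Pages the kernel has actually handed out: Total minus Free."""
    return counters["HugePages_Total"] - counters["HugePages_Free"]

# Sample copied from the post-startup output in this post.
AFTER_STARTUP = """\
HugePages_Total:   600
HugePages_Free:    276
HugePages_Rsvd:    195
"""

after_startup = parse_hugepages(AFTER_STARTUP)
print(pages_faulted_in(after_startup))  # 600 - 276
```

On a live box the same function could be fed open("/proc/meminfo").read() instead of the pasted sample.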

If my buffers are 700 MB and my variable SGA component is 324 MB, that adds up to 1,024 MB, or 512 hugepages at 2 MB each. So why weren’t 512 hugepages used? Let’s see what happens when I start using some buffers and library cache. I’ll run catalog.sql and catproc.sql and then check hugepages again:

#  cat /proc/meminfo | grep HugePages
HugePages_Total:   600
HugePages_Free:    237
HugePages_Rsvd:    156

That used up another 39 hugepages, or 78 MB: HugePages_Free fell from 276 to 237, and HugePages_Rsvd fell by the same 39, from 195 to 156, because those pages moved from merely reserved to actually allocated. At this point my SGA usage still leaves about 305 MB of unbacked virtual memory. If I were to run some OLTP, the rest would get allocated.

The idea here is that it really makes no sense to pay the allocation overhead, and go to all that trouble in VM land, for pages that might never be touched. Think about an errant program that allocates a sizable amount of hugepages just to rapidly die. While that’s not Oracle, the Linux guys have to keep a pretty general-purpose mindset.

This really goes back to the olden days of Unix when folks argued the virtues of pre-allocating swap to ensure there would never be a condition where a swap-out couldn’t be satisfied. The problem with that approach was that, before calls like vfork() became popular, there was a ton of overhead on large systems just to retire VM resources of very short-lived processes, such as those which fork() only to immediately exec().
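
For anyone who wants to check the bookkeeping, here is a rough sketch of the accounting using only the numbers in this post. The 2 MB hugepage size is an assumption inferred from "39 hugepages, or 78 MB", and the function and key names are hypothetical, chosen just for this example:

```python
import math

HUGEPAGE_BYTES = 2 * 1024 * 1024  # assumption: 2 MB hugepages (39 pages = 78 MB)
SGA_BYTES = 1081520128            # "Total System Global Area" from the startup output
MB = 1024 * 1024

def pages_needed(sga_bytes):
    """Whole hugepages required to back an SGA of the given size."""
    return math.ceil(sga_bytes / HUGEPAGE_BYTES)

def summarize(total, free, rsvd):
    """Break one /proc/meminfo snapshot into faulted, reserved and spare pages."""
    faulted = total - free  # pages actually allocated so far
    return {
        "faulted_pages": faulted,
        "faulted_mb": faulted * HUGEPAGE_BYTES // MB,
        "committed_pages": faulted + rsvd,  # allocated plus promised
        "uncommitted_pages": free - rsvd,   # free pages nobody has a claim on
        "unbacked_sga_mb": round((SGA_BYTES - faulted * HUGEPAGE_BYTES) / MB),
    }

print(pages_needed(SGA_BYTES))
print(summarize(600, 276, 195))  # right after startup
print(summarize(600, 237, 156))  # after catalog.sql and catproc.sql
```

Both snapshots show the same committed total (faulted plus reserved) and the same count of free pages with no claim on them. Only the split between faulted and reserved moves as the SGA gets touched, which is why HugePages_Rsvd has to be read as a subset of HugePages_Free rather than as a separate pool.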

OK, so that was a light-reading blog entry, but some googler, someday, might find it interesting.

Yes, that was a come-on title…so surprising, isn’t it? 🙂

9 Responses to “Oracle Database Doesn’t Use Hugepages Correctly. What’s Better, Reserved or Used?”


  1. 1 jarneil July 11, 2008 at 7:25 am

    Hi Kevin,

    Is this dependent on linux kernel version?

    [jason@lr1 ~]$ uname -r
    2.6.9-22.ELsmp

    [jason@lr1 ~]$ cat /proc/meminfo | grep HugePages
    HugePages_Total: 5000
    HugePages_Free: 504

    While on a higher kernel version

    [jason@centos5build ~]$ uname -r
    2.6.18-53.el5

    [jason@centos5build ~]$ cat /proc/meminfo | grep HugePages
    HugePages_Total: 0
    HugePages_Free: 0
    HugePages_Rsvd: 0

    The behaviour I see with the 2.6.9 kernel is that all memory for Oracle is allocated from the hugepages at startup.

    I think it’s Jonathan Lewis who says, when searching for information on the internet, “if it does not contain a version don’t believe it, if it is a different version from what you are running, don’t believe it” (I probably paraphrase there somewhat).

    cheers,

    jason.

  2. 2 kevinclosson July 11, 2008 at 2:42 pm

    Hi Jason,

    Yes, I haven’t touched a 2.6.9 kernel in nearly a year so I spaced out on that one. Jonathan’s words are surely gold, but in this case it seems a bit obvious that it couldn’t be a 2.6.9-related topic I should think.

    So, yes, prior to the very existence of the HugePages_Rsvd approach (e.g., 2.6.9) all hugepages are fully allocated upon successful completion of the shmget() call.

    Thanks for pointing it out.

  3. 3 tanelp July 15, 2008 at 11:16 am

    Jason, btw, in your 2.6.18 output you don’t have any hugepages configured at all (HugePages_Total is 0).

    Kevin, just a minor addition which the “random googler” could find useful is that if you definitely want to allocate all pages during instance startup you could set pre_page_sga=true

    I haven’t tested this with Linux hugepages though, but it ought to work.

  4. 4 kevinclosson July 15, 2008 at 4:58 pm

    Tanel,

    Thanks for stopping by. Remember that pre_page_sga is a misnomer. It should be pre_page_buffer_cache…you’ll still be leaving the shared pool out of the mix.

  5. 5 S. Mann August 18, 2009 at 4:49 pm

    Hello,

    we are trying to use the Hugepages in Oracle 10.2… env

    Environment :-

    ——————————————————————————–
    Linux Version :- 2.6.9-55.ELsmp
    Oracle Version :- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bit

    ——————————————————————————–

    What I understand

    ——————————————————————————–
    * should be Hugepages * Hugepagesize > SGA
    * Hugepages only used by SGA not by PGA ?
    *If all the SGA don’t fit in memory allocated through Hugepages then there is no allocations of SGA through HugePages but it uses the regular memory thus wasting HugePages_Total
    it means
    HugePages_Total:
    HugePages_Free:

    ——————————————————————————–

    Problem :- We have Total System Global Area 5251268608 bytes ; almost 5GB

    Our HUGEPAGE SETTING is following

    ——————————————————————————–
    HugePages_Total: 1550
    HugePages_Free: 854
    Hugepagesize: 2048 kB

    ——————————————————————————–

    I assume it should not use the PGA memory as the SGA is 5GB and the total hugepage memory is 3GB,
    so it should not allocate any hugepages-based memory. I have seen HugePages_Free changing, which I understand can happen, but I have the following questions.

    Questions:

    1) Why are hugepages being used in the first place when the SGA can’t fit?
    2) Is 1550 * 2MB being wasted here?
    3) Where is (1550 – 854) * 2MB used?

    Thanks,
    Sarab

    • 6 kevinclosson August 18, 2009 at 10:20 pm

      PGA and hugepages are totally unrelated in today’s world. So, at such a time as there are only 854 free (of the 1550), are there also no existing shared memory segments shown in ipcs -m output?

      I think the important question to ask is why allocate 1550 when your SGA is 5G… you are just losing that ~3G of memory…it cannot (other than this seemingly phantom 696 page usage) be used unless you boot an instance with an SGA smaller than 3G…

  6. 7 Johann October 18, 2011 at 8:47 pm

    I just loaded 11.2.0.2 on RHEL 6.1 and cannot figure out how to configure 1GB huge pages (instead of the default 2MB huge pages). I’m also interested to know if the PGA can take advantage of transparent huge pages.

    If anyone knows, please post!

    • 8 kevinclosson October 18, 2011 at 10:04 pm

      Do you have the kernel bootstrings set correctly for 1GB hugepages? Do you get an error? Are you just getting 2MB hugepages? Are you sure Oracle is calling shmget(SHM_HUGETLB) ? Perhaps if you attach strace to the shadow that is booting the instance and cut/paste the call to shmget we can get further along …

  7. 9 scottrochford February 15, 2015 at 7:44 pm

    I came here to try and remind myself what HugePages_Rsvd means, and while this article explains the delayed allocation of hugepages by the kernel and corresponding reduction of HugePages_Free, I don’t think it clarifies the role of HugePages_Rsvd:

    Meanwhile I found this page again which clarified it again:

    http://yong321.freeshell.org/oranotes/HugePages.txt

    So basically HugePages_Rsvd is the portion of HugePages_Free that will potentially be used by Oracle but hasn’t been allocated by the kernel yet. To my mind when I look at those values in /proc/meminfo I assume that they are separate portions of the hugepages configured on the system, when in fact they overlap as described in that link.

    Regards,
    Scott


DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.