Linux Thinks It’s a CPU, But What Is It Really – Part I. Mapping Xeon 5500 (Nehalem) Processor Threads to Linux OS CPUs.

Thanks to Steve Shaw, Database Technology Manager at Intel, for pointing me to the magic decoder ring for associating Xeon 5500 (Nehalem) processor threads with Linux OS CPUs. Steve is an old acquaintance whom I would gladly refer to as a friend, but I’m not sure how Steve views the relationship. See, I was the technical reviewer of his book (Pro Oracle Database 10g RAC on Linux), which is a role that can make friends or frenemies, I suppose. I don’t have any bad memories of the project, and Steve is still talking to me, so I think things are hunky-dory. OK, joking aside…but first, a bit more about Steve.

Steve writes the following on his website intro page (emphasis added by me):

I’m Steve Shaw and for over 10 years have specialised in working with the Oracle database. I have a background with Oracle on various flavours of UNIX including HP-UX, Sun Solaris and my own personal favourite Dynix/ptx on Sequent.

Sequent? I’ve emerged from my ex-Sequent 12-step program! Indeed, that is a really good personal favorite to have. But, I’m sentimental, and I digress as well.

The Magic Decoder Ring
The web resource Steve provided is this Intel webpage containing information about processor topology. There is an Intel processor topology tool that really helps make sense of the mappings between processor cores and threads on Nehalem processors and Linux OS CPUs.

What’s in the “Package?”
As we can see from that Intel webpage, and the processor topology tool itself, Intel often uses the term “package” when referring to what goes in a socket these days. Considering there are both cores and threads, I suppose there is justification for a more descriptive term. I still use socket/core/thread nomenclature though. It works for me. Nonetheless, let’s see what my Nehalem 2s8c16t system (2 sockets, 8 cores, 16 threads) shows when I run the topology tool. First, let’s see the output from “package” number 0 (socket 0). There is a lot of output from the command. I recommend focusing on the OScpu# and Core rows of the first table (lines 20 and 21 of the output) in the following text box:


Package 0 Cache and Thread details

Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
       CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
L1D is Level 1 Data cache, size(KBytes)= 32,  Cores/cache= 2, Caches/package= 4
L1I is Level 1 Instruction cache, size(KBytes)= 32,  Cores/cache= 2, Caches/package= 4
L2 is Level 2 Unified cache, size(KBytes)= 256,  Cores/cache= 2, Caches/package= 4
L3 is Level 3 Unified cache, size(KBytes)= 8192,  Cores/cache= 8, Caches/package= 1
      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    0     8|    1     9|    2    10|    3    11|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    1   100|    2   200|    4   400|    8   800|
CmbMsk|  101      |  202      |  404      |  808      |
      +-----------+-----------+-----------+-----------+

Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+

Cache |   L2      |   L2      |   L2      |   L2      |
Size  | 256K      | 256K      | 256K      | 256K      |
      +-----------+-----------+-----------+-----------+

Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk|  f0f                                          |
      +-----------------------------------------------+

From the output we can decipher that Linux OS CPU 0 resides in socket 0, core 0, thread 0. That much is straightforward. On the other hand, the tool adds value by showing us that Linux OS CPU 8 is actually the second processor thread in socket 0, core 0. And, of course, “package” 1 follows suit:


Package 1 Cache and Thread details

Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
       CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    4    12|    5    13|    6    14|    7    15|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|   10   1z3|   20   2z3|   40   4z3|   80   8z3|
CmbMsk| 1010      | 2020      | 4040      | 8080      |
      +-----------+-----------+-----------+-----------+

Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+

Cache |   L2      |   L2      |   L2      |   L2      |
Size  | 256K      | 256K      | 256K      | 256K      |
      +-----------+-----------+-----------+-----------+

Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk| f0f0                                          |
      +-----------------------------------------------+
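The AffMsk and CmbMsk rows in both boxes are easier to read once the tool's "extended hex" shorthand is expanded and the set bits are listed as OS CPU numbers. Here is a minimal sketch of that decoding in POSIX shell. The function names expand_exthex and mask_to_cpus are my own illustrative names and are not part of the Intel tool; the sketch assumes bit N of a mask corresponds to OS CPU N, which is what the OScpu# and AffMsk rows above suggest.

```sh
# Expand the tool's "extended hex" notation, where 'z#' stands for
# # trailing zeroes (so '8z5' becomes '800000').
# Hypothetical helper, not part of the Intel topology tool.
expand_exthex() {
    case "$1" in
        *z*)
            digits=${1%%z*}     # hex digits before the 'z'
            count=${1##*z}      # number of trailing zeroes to append
            zeros=""
            i=0
            while [ "$i" -lt "$count" ]; do
                zeros="${zeros}0"
                i=$((i + 1))
            done
            printf '%s%s\n' "$digits" "$zeros"
            ;;
        *)
            # No 'z' shorthand present; the value is already plain hex.
            printf '%s\n' "$1"
            ;;
    esac
}

# Print the OS CPU numbers whose bits are set in an affinity mask,
# assuming bit N of the mask stands for OS CPU N.
mask_to_cpus() {
    hex=$(expand_exthex "$1")
    mask=$((0x$hex))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    printf '%s\n' "$out"
}
```

With these loaded, mask_to_cpus 101 prints "0 8" (the two threads sharing the first L1D cache in package 0), mask_to_cpus 1z3 prints "12", and mask_to_cpus f0f0 prints "4 5 6 7 12 13 14 15" (every thread sharing the L3 cache in package 1).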

So, it goes like this:

Linux OS CPU   Package Locale
0              S0_c0_t0
1              S0_c1_t0
2              S0_c2_t0
3              S0_c3_t0
4              S1_c0_t0
5              S1_c1_t0
6              S1_c2_t0
7              S1_c3_t0
8              S0_c0_t1
9              S0_c1_t1
10             S0_c2_t1
11             S0_c3_t1
12             S1_c0_t1
13             S1_c1_t1
14             S1_c2_t1
15             S1_c3_t1
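The pattern in that table can be written down as simple arithmetic. The sketch below is only a description of how this particular 2s8c16t box enumerates its CPUs: the function name cpu_locale and the hard-coded socket and core counts are my own assumptions for illustration, and another BIOS, kernel, or processor may number things differently, so treat the topology tool as the authority.

```sh
# Map a Linux OS CPU number to its socket/core/thread locale, assuming the
# enumeration shown above: first thread 0 of every core on socket 0, then
# socket 1, then the same walk again for thread 1.
# Hypothetical helper; the counts below match this 2s8c16t system only.
cpu_locale() {
    cpu=$1
    sockets=2           # packages in the system
    cores=4             # cores per package
    thread=$(( $cpu / ($sockets * $cores) ))
    socket=$(( ($cpu / $cores) % $sockets ))
    core=$(( $cpu % $cores ))
    printf 'S%d_c%d_t%d\n' "$socket" "$core" "$thread"
}
```

For example, cpu_locale 8 prints S0_c0_t1 and cpu_locale 13 prints S1_c1_t1, matching the table.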

By the way, the CPU topology tool works on other processors in the Xeon family.

3 Responses to “Linux Thinks It’s a CPU, But What Is It Really – Part I. Mapping Xeon 5500 (Nehalem) Processor Threads to Linux OS CPUs.”


  1. 1 karlarao April 23, 2009 at 5:35 am

    Nice, 2CPU QuadCore with HT!
    the tool by Intel gives a very detailed info..

    You can also check on this document by redhat kbase (with sample outputs),

    http://kbase.redhat.com/faq/docs/DOC-7715

    If it’s okay with you, can you give the output of the following commands:

    cat /proc/cpuinfo | grep -i "model name" | uniq
    grep processor /proc/cpuinfo
    grep "physical id" /proc/cpuinfo
    grep siblings /proc/cpuinfo
    grep "core id" /proc/cpuinfo
    grep "cpu cores" /proc/cpuinfo

    Thanks!

    • 2 kevinclosson April 23, 2009 at 2:25 pm

      Hi Karlarao,

      I won’t run that, but I’ll run this: 🙂

      # cat /tmp/foo

      # Strip everything up to the colon, then join the values on one line
      filter() {
        sed 's/^.*://g' | xargs echo
      }
      grep processor /proc/cpuinfo | filter
      grep 'physical id' /proc/cpuinfo | filter
      grep siblings /proc/cpuinfo | filter
      grep 'core id' /proc/cpuinfo | filter
      grep 'cpu cores' /proc/cpuinfo | filter
      # sh /tmp/foo
      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
      0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
      8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
      0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3
      4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

      if you don’t get this output horizontally oriented it is a mess

      • 3 karlarao April 23, 2009 at 4:42 pm

        Cool! Nice work with the function 🙂
        I used to write/draw it on paper.. haha

        From the output of /proc/cpuinfo, the table below could also be done. It’s kind of cryptic at first but it’s another way to do it without running the script by Intel

              +-----------+-----------+-----------+-----------+
        OScpu#|    0     8|    1     9|    2    10|    3    11|
        Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
              +-----------+-----------+-----------+-----------+
        OScpu#|    4    12|    5    13|    6    14|    7    15|
        Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
              +-----------+-----------+-----------+-----------+


DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.