Before saying anything about lscpu, one fact must first be understood: each thread of execution of a physical core capable of simultaneous multithreading is presented to the operating system as an independent processing unit. Therefore, a physical core with two threads of execution is seen by the operating system as two (logical) cores.
In Linux terminology, a CPU is the smallest hardware unit capable of executing a thread, so the term CPU will be used below as a synonym for thread whenever the context permits.
Let's start with the basics: lscpu is a very useful command which shows lots of important information regarding the CPU architecture of a system. Below is the output of lscpu for my laptop's CPU package. Let's examine what it shows:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 37
Stepping:              5
CPU MHz:               1199.000
BogoMIPS:              4787.75
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              3072K
NUMA node0 CPU(s):     0-3
Most of the information above is relatively easy to interpret. My laptop has:
- two physical cores since it has one socket and two cores per socket
- four threads of execution since it has two threads per core
- four CPUs since Linux interprets each thread as a CPU
- one NUMA node to which all threads (CPUs) are associated
Having a single NUMA node means all CPUs are equally distant from the physical memory, so the memory access time is the same for every CPU.
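The arithmetic behind the list above can be sketched in a few lines of shell. This is only an illustration: the three input lines are copied from the output shown earlier rather than read from a live system, and the helper name field is made up for this example.

```shell
# Sample lines copied from the lscpu output above. On a real system,
# sample=$(lscpu) could be used instead (assuming the same field names).
sample=$(cat <<'EOF'
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
EOF
)

# field NAME: print the value that follows "NAME:" in the sample text.
# (Hypothetical helper, not part of lscpu.)
field() {
    printf '%s\n' "$sample" | awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

threads_per_core=$(field 'Thread(s) per core')
cores_per_socket=$(field 'Core(s) per socket')
sockets=$(field 'Socket(s)')

# physical cores = sockets x cores per socket
physical_cores=$((sockets * cores_per_socket))
# logical CPUs = physical cores x threads per core
logical_cpus=$((physical_cores * threads_per_core))

echo "physical cores: $physical_cores"
echo "logical CPUs:   $logical_cpus"
```

For the laptop described here, this yields two physical cores and four logical CPUs, matching the CPU(s) line reported by lscpu.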
The output above states that my CPU architecture is Intel x86-64 and also shows the size of all caches. One important thing that it does not show is the cache-to-CPU association, i.e., how the different caches are associated with each CPU. This information can still be retrieved by passing the -p flag (or, equivalently, --parse) to lscpu:
lscpu -p
The output is now a bit more interesting but requires a bit of effort to interpret:
# The following is the parsable format, which can be fed to other
# programs. Each different item in every column has an unique ID
# starting from zero.
# CPU,Core,Socket,Node,,L1d,L1i,L2,L3
0,0,0,0,,0,0,0,0
1,1,0,0,,1,1,1,0
2,0,0,0,,0,0,0,0
3,1,0,0,,1,1,1,0
After the comment printed at the top, each line in the output corresponds to a CPU, with the first line corresponding to CPU 0. On each line, from left to right, the values correspond to the following (see also figure 1):
CPU | the CPU index, i.e., the index of a thread of execution; since my laptop has four threads in total, these values range from 0 to 3 |
physical core | the index of the physical core to which the thread belongs; since my laptop has only two physical cores, we see that CPUs 0 and 2 are threads from core 0 while CPUs 1 and 3 are threads from core 1 |
socket number | the number of the physical socket to which the CPU belongs; since my laptop has a single physical socket, all CPUs belong to socket 0 (technically laptops do not have sockets since their CPU packages are surface mounted, but Linux treats this type of CPU package as a single socket) |
NUMA node | the number of the NUMA node to which the CPU belongs; since my laptop does not have separate NUMA nodes, all CPUs belong to NUMA node 0 |
book number | the logical book number of the CPU (processor books exist in very few architectures); my laptop has no books so this field is empty for all CPUs |
L1 data cache (L1d) | the index of the L1 data cache associated with the CPU; my laptop has two L1d caches: one for each physical core, so threads from the same core share a single L1d cache |
L1 instruction cache (L1i) | the index of the L1 instruction cache associated with the CPU; my laptop has two L1i caches: one for each physical core, so threads from the same core share a single L1i cache |
L2 cache | the index of the L2 cache associated with the CPU; my laptop has two L2 caches: one for each physical core, so threads from the same core share a single L2 cache |
L3 cache | the index of the L3 cache associated with the CPU; my laptop has a single L3 cache which is shared by all CPUs, therefore the L3 cache index is 0 for all of them |
Fig. 1: CPU architecture information obtained with the lscpu command.
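To make the core-to-thread mapping explicit, the rows can be grouped by their second column (the physical core). The sketch below is one possible way to do it with awk; the function name siblings is hypothetical, and the rows are the four sample lines from the output above rather than live data.

```shell
# Rows copied from the lscpu -p output above (comment lines omitted).
rows='0,0,0,0,,0,0,0,0
1,1,0,0,,1,1,1,0
2,0,0,0,,0,0,0,0
3,1,0,0,,1,1,1,0'

# siblings: read parsable rows on stdin and print, for each physical
# core (column 2), the list of CPUs (column 1) that belong to it.
siblings() {
    grep -v '^#' | awk -F',' '
        {
            if ($2 in cpus) cpus[$2] = cpus[$2] " " $1
            else            cpus[$2] = $1
        }
        END { for (core in cpus) print "core " core ": CPUs " cpus[core] }
    ' | sort
}

printf '%s\n' "$rows" | siblings
```

With the sample rows, this prints "core 0: CPUs 0 2" followed by "core 1: CPUs 1 3", which agrees with the description of the physical core column. On a real machine the same function could be fed directly with lscpu -p | siblings.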
The comment printed at the top of the output of lscpu -p can easily be filtered out with grep:
lscpu -p | grep -v '^#'
Finally, to get only the architectural information you really need, you can pass a list of parameters to the -p flag to specify exactly what you want. These parameters can be: cpu, core, socket, node, book, cache. As an example, to get the index of the NUMA node to which each CPU belongs, run:
lscpu -p=cpu,node | grep -v '^#'
The output should now contain only the columns you requested:
0,0
1,0
2,0
3,0
Above, the parameter cpu was passed to make things clear, but it can be omitted if you do not wish to have the CPU index printed.
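Leaving out the cpu column is handy when only the distinct values matter. As a rough sketch, the number of physical cores can be obtained by requesting just the core and socket columns and counting the unique rows. The function name count_unique is made up for this example, and the input below is a stand-in for what lscpu -p=core,socket would print on the laptop described here (derived from the four rows shown earlier), not captured output.

```shell
# count_unique: read parsable rows on stdin, drop comment lines and
# count how many distinct rows remain.
count_unique() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Stand-in for: lscpu -p=core,socket
# One "core,socket" pair per CPU, taken from the sample rows above.
printf '%s\n' '0,0' '1,0' '0,0' '1,0' | count_unique
```

For the sample input this prints 2, the number of physical cores. On a real system the pipeline would be lscpu -p=core,socket | count_unique.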
Comments
Did you make any kind of script to create the diagram of the different cores, caches etc. directly from the lscpu output? If so, would you perhaps share it? Thanks!
In any case, Ubuntu/Debian provides a tool called "lstopo" for visualizing the CPU and cache structure similarly to what is shown on the figure above. It is part of the "hwloc" package, which can be installed with the command below:
sudo apt-get install hwloc