Programs commonly rely on libraries which provide part of the functionality they need to run. Programs could, in principle, be written without calling functions from libraries, but that would dramatically increase the amount of source code for even the simplest programs, as they would need to contain their own copies of all the basic functions which are readily available in libraries provided by either the operating system or third parties. This redundancy would also force the developers of a given project to update their own code whenever bugs are found in these commonly used functions.
When a program is built, it can use functions from an available library by linking against that library either statically or dynamically. When a library is statically linked to a program, its binary contents are incorporated into that program at build time. In other words, the library becomes part of the binary version of the program. The linking is done by a program called the "linker" (on Linux, that program is usually ld).
This post focuses on the case where a library is dynamically linked to a program. In this case, the contents of the linked library do not become part of the program. Instead, when the program is built, a table containing the symbols (e.g. function names) which it needs to run is created and stored in the compiled version of the program (the "executable"). This table is called the "dynamic symbol table" of the program. When the program is executed, a dynamic linker is invoked to link the program to the dynamic (or "shared") libraries which contain the definitions of these symbols. On Linux, the dynamic linker which does this job is ld-linux.so. When a program is executed, ld-linux.so is first loaded into the address space of the process, and it then loads all the dynamic libraries required to run the program (I will not describe the process in detail, but the more curious reader can find lots of relevant information about how this happens on this page). It is only after the required dynamic libraries are loaded that the program actually starts running.
When a program is compiled, the path to the dynamic linker (the "interpreter") it requires to run is added to its .interp section (a description of each ELF section of a program can be found here). To make this clear, compile this very simple C program:
#include <stdio.h>

int main()
{
    printf("Hello, world!\n");
    return 0;
}
with the command:
gcc main.c -o main
Now get the contents of the .interp section of the executable main:
readelf -p .interp main
The output should be similar to this:
String dump of section '.interp':
  [     0]  /lib64/ld-linux-x86-64.so.2
On my system, /lib64/ld-linux-x86-64.so.2 is a symbolic link to the executable file /lib/x86_64-linux-gnu/ld-2.19.so. If you are curious, execute the equivalent file on your system and read what it displays.
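To show that there is nothing magical about this section, here is a minimal C sketch which extracts the same path directly from an ELF file. It reads the PT_INTERP program header, which points at the same bytes as the .interp section. The function name read_interp is hypothetical, and the sketch assumes a 64-bit ELF file with the same byte order as the machine running the code:

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the interpreter path stored in the PT_INTERP segment of a 64-bit ELF
 * file into buf (of size len). Returns 0 on success, or -1 if the file cannot
 * be read, is not a 64-bit ELF file, has no PT_INTERP segment (e.g. a
 * statically linked executable), or the path does not fit in buf.
 * read_interp is a hypothetical helper name, not part of any library.
 */
int read_interp(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    int result = -1;
    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1)
                break;
            if (ph.p_type != PT_INTERP)
                continue;
            /* p_filesz counts the terminating NUL byte of the path. */
            if (ph.p_filesz == 0 || ph.p_filesz > len)
                break;
            if (fseek(f, (long)ph.p_offset, SEEK_SET) == 0 &&
                fread(buf, 1, ph.p_filesz, f) == ph.p_filesz) {
                buf[ph.p_filesz - 1] = '\0';
                result = 0;
            }
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling read_interp("main", buf, sizeof buf) from a small main function should fill buf with the same path which readelf printed above.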
Having an idea of how the dynamic libraries are loaded, the question which comes to mind is: what are the symbols which a program requires from dynamically linked libraries to run? The answer can be obtained in many different ways. One common way to get that information is through objdump:
objdump -T <program-name>
For the executable main from above, the output should be similar to this:
main:     file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 puts
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __libc_start_main
0000000000000000  w   D  *UND*  0000000000000000              __gmon_start__
The output above shows a very curious fact: even to print a simple string "Hello, world!", a dynamic library is necessary, namely the GNU C Library (glibc), since the definitions of the functions puts and __libc_start_main are needed. Actually, even if you comment out the "Hello, world!" line, the program will still need a definition of __libc_start_main from glibc.
NOTE: the command nm -D main shows essentially the same information as objdump -T main; see the manual of nm for more details.
One way to get a list with the dynamic libraries which a program needs to run is to use ldd:
ldd -v <program-name>
For the program above, this is what the output should look like:
linux-vdso.so.1 =>  (0x00007fffcfdfe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f264e47d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f264e85f000)

Version information:
./main:
    libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6:
    ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
    ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
This output is very informative: it tells us that main needs libc.so.6 (glibc) to run, and libc.so.6 needs ld-linux-x86-64.so.2 (the dynamic linker) to be loaded.
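The same libraries can also be observed from inside a running process, since on Linux every file mapped into its address space is listed in /proc/self/maps. The following is only a rough sketch: the function name list_mapped_shared_objects is hypothetical, and matching ".so" anywhere in the path is a crude filter which I chose for brevity:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Print the path of each file-backed mapping in /proc/self/maps whose path
 * contains ".so", collapsing consecutive duplicates, and return how many
 * paths were printed. Returns -1 if /proc/self/maps cannot be opened.
 * list_mapped_shared_objects is a hypothetical helper name.
 */
int list_mapped_shared_objects(FILE *out)
{
    assert(out != NULL);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[4096];
    char last[4096] = "";
    int count = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* The pathname, if there is one, starts at the first '/'. */
        char *path = strchr(line, '/');
        if (path == NULL || strstr(path, ".so") == NULL)
            continue;
        path[strcspn(path, "\n")] = '\0';

        /* A library is mapped as several consecutive segments (code,
         * read-only data, writable data), so print each path only once. */
        if (strcmp(path, last) != 0) {
            fprintf(out, "%s\n", path);
            strcpy(last, path);
            count++;
        }
    }
    fclose(maps);
    return count;
}
```

If you call list_mapped_shared_objects(stdout) from the main function of a dynamically linked program, the printed paths should include both libc.so.6 and the dynamic linker, in agreement with what ldd reports.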
ldconfig
So far we know that ld-linux.so is responsible for loading the dynamic libraries which a program needs to run, but how does it know where to find them?
This is where ldconfig enters the scene. The ldconfig utility scans the directories where the dynamic libraries are commonly found (/lib and /usr/lib) as well as the directories specified in /etc/ld.so.conf, and creates both symbolic links to these libraries and a cache (stored in /etc/ld.so.cache) containing their locations so that ld-linux.so can quickly find them whenever necessary. This is done when you run ldconfig without any arguments (you can also add the -v option to see the scanned directories and the created symbolic links):
sudo ldconfig
You can list the contents of the created cache with the -p option:
ldconfig -p
The command above will show you a comprehensive list with all the dynamic libraries discovered on the scanned directories. You can also use this command to get the version of a dynamic library on your system. For example, to get the installed version of the X11 library, you can run:
ldconfig -p | grep libX11
This is the output I obtain on my laptop (running Xubuntu 14.04; notice that dynamic library names are usually in the format <library-name>.so.<version>):
libX11.so.6 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libX11.so.6
libX11.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libX11.so
libX11-xcb.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1
libX11-xcb.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libX11-xcb.so
In other words, the output above states, for example, that the symbols required from libX11.so can be found in the dynamic library /usr/lib/x86_64-linux-gnu/libX11.so. Since the latter might be a symbolic link to the actual shared object file (i.e., the dynamic library), we can get its actual location with readlink:
readlink -f /usr/lib/x86_64-linux-gnu/libX11.so
On my system, both libX11.so and libX11.so.6 are symbolic links to the same shared object file:
/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
These symbolic links are also created by ldconfig. If you wish to only create the symbolic links but not the cache, run ldconfig with the -N option; to only create the cache but not the symbolic links, use the -X option.
As a final note on ldconfig, notice that on Ubuntu/Debian, whenever you install a (dynamic) library using apt-get, ldconfig is automatically executed at the end to update the dynamic library cache. You can confirm this by grepping the output of ldconfig -p for some library which is not installed on your system, then installing that library and grepping again.
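A program can also ask the dynamic linker to perform this kind of lookup explicitly while it is running, using dlopen. When dlopen is given a bare library name without slashes, the dynamic linker searches for the library itself, consulting /etc/ld.so.cache along the way. Note that this is run-time loading requested by the program, not the automatic loading at startup described so far. Below is a minimal sketch; the function name call_from_library is hypothetical, and it assumes that the requested symbol is a function which takes a double and returns a double:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Load the shared library `soname`, look up `symbol` in it, call it as a
 * function of type double(double) with argument x and store the result in
 * *out. Returns 0 on success, or -1 if the library or the symbol cannot be
 * found. call_from_library is a hypothetical helper name; the caller is
 * responsible for the symbol really having the assumed type.
 */
int call_from_library(const char *soname, const char *symbol,
                      double x, double *out)
{
    assert(soname != NULL && symbol != NULL && out != NULL);

    /* No slash in the name: the dynamic linker searches for the library. */
    void *handle = dlopen(soname, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */

    /* Conversion idiom suggested by POSIX for turning the void * returned
     * by dlsym into a function pointer. */
    double (*fn)(double);
    *(void **)(&fn) = dlsym(handle, symbol);

    const char *err = dlerror();
    if (err != NULL || fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle);
    return 0;
}
```

For example, on a glibc-based system, call_from_library("libm.so.6", "sqrt", 4.0, &r) should store 2.0 in r. Depending on the glibc version, the program may need to be linked with -ldl.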
Seeing ld-linux.so in action
You can see the dynamic libraries being loaded when a program is executed using the strace command:
strace ./main
The output should be similar to the one shown below (the most interesting parts are the lines which open /etc/ld.so.cache and /lib/x86_64-linux-gnu/libc.so.6; I omitted some of the output for brevity):
execve("./main", ["./main"], [/* 68 vars */]) = 0
brk(0) = 0x1d9b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3bf95c7000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=103686, ...}) = 0
mmap(NULL, 103686, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3bf95ad000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\37\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1845024, ...}) = 0
mmap(NULL, 3953344, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bf8fe1000
mprotect(0x7f3bf919d000, 2093056, PROT_NONE) = 0
mmap(0x7f3bf939c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bb000) = 0x7f3bf939c000
mmap(0x7f3bf93a2000, 17088, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3bf93a2000
close(3) = 0
...
exit_group(0) = ?
+++ exited with 0 +++
Comments
Thanks for a very good explanation of dynamic libraries.
Do you know what libc_nonshared.a is in Linux? I'm getting an "__cxa_atexit: undefined reference" linker error in this library on an Ubuntu 14.04 system.
https://gcc.gnu.org/ml/gcc-help/2005-07/msg00168.html