Using strace and ltrace to help with troubleshooting on Linux


Both strace and ltrace are powerful command-line tools for debugging and troubleshooting programs on Linux: Strace captures and records all system calls made by a process as well as the signals received, while ltrace does the same for library calls.

If a program acts differently than you expect, you can use these tools to see “behind the curtain” and maybe get some clues as to what is going on.

Be forewarned, though. When you use either of these commands, you will end up with a lot of output to look through. Still, that can tell you quite a bit about how a process is working and sometimes give you important insights.

strace

To run strace against a program, use a command like what is shown below—the strace command followed by the program name. The output below has been cut short.

$ strace who
execve(“/usr/bin/who”, [“who”], 0x7ffe889f45c0 /* 41 vars */) = 0
brk(NULL)                               = 0x55e7d6720000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff38c72b20) = -1 EINVAL (Invalid argument)
access(“/etc/ld.so.preload”, R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, “/etc/ld.so.cache”, O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, “”, {st_mode=S_IFREG|0644, st_size=60195, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 60195, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd1ba4d7000
close(3)
…

The command above runs the who command and reports on the system calls that it makes. The start of each line (e.g., execve, brk and arch_prctl) shows the system call being made.

To run strace against a running process to see what it’s doing, use the command with the -p option followed by the process ID. Note that the strace output starts in the 3rd line below after the process is attached.

$ strace -p 34512
strace: Process 34512 attached
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 34516
…

NOTE: You cannot redirect the output of the strace command through a pipe, but you can have it sent to a file using the -o option. In the example below, the date command output is shown. After that, the top of the output file is displayed using the head command.

$ strace -o outputfile date
Tue May  3 03:33:52 PM EDT 2022
$ head -10 outputfile
execve(“/usr/bin/date”, [“date”], 0x7ffc30f3bd00 /* 41 vars */) = 0
brk(NULL)                               = 0x55e78c077000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe52bf2ec0) = -1 EINVAL (Invalid argument)
access(“/etc/ld.so.preload”, R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, “/etc/ld.so.cache”, O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, “”, {st_mode=S_IFREG|0644, st_size=60195, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 60195, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f589661b000
close(3)                                = 0
openat(AT_FDCWD, “/lib64/libc.so.6”, O_RDONLY|O_CLOEXEC) = 3
read(3, “177ELF21133>1PP4”..., 832) = 832

Remember that you’re looking at system calls, not programs. So, don’t be surprised if you don’t recognize “execve”, “brk”, “arch_prctl”, etc. You can, however, examine the man pages for these system calls if you want to learn more about what they do and how they work. Notice that the man pages for system calls come from Section 2 of the man pages.

$ man execve | head -4
EXECVE(2)                  Linux Programmer’s Manual                 EXECVE(2) <==

NAME
       execve - execute program

Another useful way to use the strace command is to add the -c option. When you do this, the output provides a summary of the system calls made ordered by percentage of time spent on each system call (largest first). As you see in the example below, more time is spent on mmap (mapping files) than on any of the other system calls.

$ strace -c date
Tue May  3 03:40:25 PM EDT 2022
% time     seconds  usecs/call     calls    errors syscall
——— —————- —————- ————- ————- ————————
 28.49    0.000106          11         9           mmap
 12.10    0.000045          11         4           openat
 10.22    0.000038           9         4           mprotect
  9.95    0.000037           6         6           newfstatat
  8.87    0.000033           8         4           pread64
  6.72    0.000025           4         6           close
  4.57    0.000017           5         3           read
  3.49    0.000013          13         1         1 access
  2.96    0.000011          11         1           munmap
  2.69    0.000010          10         1           write
  2.15    0.000008           2         3           brk
  1.88    0.000007           3         2         1 arch_prctl
  1.34    0.000005           5         1           set_tid_address
  1.34    0.000005           5         1           set_robust_list
  1.08    0.000004           4         1           lseek
  1.08    0.000004           4         1           prlimit64
  1.08    0.000004           4         1           getrandom
  0.00    0.000000           0         1           execve
——— —————- —————- ————- ————- ————————
100.00    0.000372           7        50         2 total

ltrace

The ltrace command works much the same as strace, but examines library calls. In the examples below, we see that ltrace uses basically the same syntax as strace.

This first example looks at the library calls used by the who command.

$ ltrace who
strrchr(“who”, ‘/’)                              = nil
setlocale(LC_ALL, “”)                            ="en_US.UTF-8"
bindtextdomain(“coreutils”, “/usr/share/locale”) ="/usr/share/locale"
textdomain(“coreutils”)                          ="coreutils"
__cxa_atexit(0x560c6532d220, 0, 0x560c65334940, 0) = 0
getopt_long(1, 0x7ffc1f74a468, “abdlmpqrstuwHT”, 0x560c65334960, nil) = -1
setlocale(LC_TIME, nil)                          ="en_US.UTF-8"
strlen(“en_US.UTF-8”)                            = 11
__memcpy_chk(0x7ffc1f74a1f0, 0x560c65bf6370, 12, 257) = 0x7ffc1f74a1f0
strcmp(“en_US.UTF-8”, “POSIX”)                   = 21
utmpxname(0x560c65331016, 1, 0x560c65331266, 0x560c65331266) = 0
setutxent(0x560c6533101e, 0x7f2330570cc0, 5, 0)  = 1
…

This next example shows the library calls for a running process:

$ ltrace -p 35243
sigemptyset(<>)                                  = 0
sigemptyset(<>)                                  = 0
sigaction(SIGINT, { nil, <>, 0x3f679854, 0x560643c72e53 }, { 0x560643c73f80, <>, 0x9018, 0x1000 }) = 0
ioctl(2, 21523, 0x7ffe3f679870)                  = 0
malloc(48)                                       = 0x56064548b020
strlen(“LINES”)                                  = 5
malloc(6)                                        = 0x56064548dfb0
strcpy(0x56064548dfb0, “LINES”)                  = 0x56064548dfb0
strlen(“LINES”)                                  = 5
malloc(6)

Sending ltrace output to a file requires use of the -o option.

$ ltrace -o outputfile date
Tue May  3 04:17:30 PM EDT 2022
$ head -10 outputfile
strrchr(“date”, ‘/’)                             = nil
setlocale(LC_ALL, “”)                            ="en_US.UTF-8"
bindtextdomain(“coreutils”, “/usr/share/locale”) ="/usr/share/locale"
textdomain(“coreutils”)                          ="coreutils"
__cxa_atexit(0x55e8afe3f280, 0, 0x55e8afe4e000, 0) = 0
getopt_long(1, 0x7ffdc8986948, “d:f:I::r:Rs:u”, 0x55e8afe4e920, nil) = -1
nl_langinfo(0x2006c, 0x7ffdc8986948, 0, 0)       = 0x7f3afe645435
getenv(“TZ”)                                     = nil
malloc(128)                                      = 0x55e8b05b0440
clock_gettime(0, 0x7ffdc8986740, 1, 145)         = 0

Notice that the man page for library calls comes from Section 3 of the man pages.

$ man strrchr | head -4
STRCHR(3)                  Linux Programmer’s Manual                 STRCHR(3) <==

NAME
       strchr, strrchr, strchrnul - locate character in string
$ man strlen | head -4
STRLEN(3)                  Linux Programmer’s Manual                 STRLEN(3)

NAME
       strlen - calculate the length of a string

Using the -c option with ltrace, like with strace, orders the listing with the longest run times first.

$ ltrace -c pwd
/home/shs
% time     seconds  usecs/call     calls      function
——— —————- —————- ————- ——————————
 16.02    0.000428         428         1 setlocale
 10.82    0.000289          72         4 __freading
  7.93    0.000212         212         1 puts
  6.74    0.000180          90         2 fclose
  5.39    0.000144          72         2 fileno
  5.39    0.000144         144         1 getcwd
  5.35    0.000143          71         2 __fpending
  5.28    0.000141          70         2 fflush
  4.79    0.000128         128         1 free
  4.64    0.000124         124         1 bindtextdomain
  4.49    0.000120         120         1 exit_group
  4.49    0.000120         120         1 getenv
  4.19    0.000112         112         1 getopt_long
  4.00    0.000107         107         1 textdomain
  3.97    0.000106         106         1 strrchr
  3.93    0.000105         105         1 __cxa_atexit
  2.58    0.000069          69         1 __cxa_finalize
——— —————- —————- ————- ——————————
100.00    0.002672                    24 total

Installing strace and ltrace

Commands like these are used to install strace:

$ sudo apt install strace	# Debian/Ubuntu
$ sudo yum install strace	# RHEL/Centos
$ sudo dnf install strace	# Fedora

Similarly, ltrace will install with commands like these:

$ sudo apt install ltrace	# Debian/Ubuntu
$ sudo yum install ltrace	# RHEL/Centos
$ sudo dnf install ltrace	# Fedora

Wrap-Up

Both the strace and ltrace commands provide copious output, but can prove to be very helpful, especially once you get used to the output that they provide.

Join the Network World communities on Facebook and LinkedIn to comment on topics that are top of mind.

Copyright © 2022 IDG Communications, Inc.



Source link