3.6. Debugging Multithreaded Programs with GDB
Debugging multithreaded programs can be tricky because there are multiple streams of execution and because the concurrently executing threads interact with one another. In general, the following practices make debugging multithreaded programs a bit easier:
- When possible, try to debug a version of the program with as few threads as possible.
- When adding debugging printf statements to the code, print out the executing thread’s ID to identify which thread is printing, and end the line with a \n.
- Limit the amount of debug output by having only one of the threads print its information and common information. For example, if each thread stores its logical ID in a local variable named my_tid, a conditional statement on the value of my_tid can be used to limit printing debug output to one thread, as illustrated in the following example:

    if (my_tid == 1) {
        printf("Tid:%d: value of count is %d and my i is %d\n", my_tid, count, i);
        fflush(stdout);
    }
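The conditional print can also be wrapped in a small helper so that every debug line carries the thread ID and is flushed right away. The following is only a minimal sketch: the names format_debug_line and debug_print_if are hypothetical (they do not come from the program discussed in this section), and it assumes that count and i are int values.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): build one debug line
 * tagged with the thread's logical ID. Returns whatever snprintf returns. */
int format_debug_line(char *buf, size_t size, int my_tid, int count, int i) {
    return snprintf(buf, size,
                    "Tid:%d: value of count is %d and my i is %d\n",
                    my_tid, count, i);
}

/* Print the debug line only when my_tid matches print_tid.
 * Returns 1 if a line was printed and 0 otherwise. */
int debug_print_if(int my_tid, int print_tid, int count, int i) {
    char line[128];

    if (my_tid != print_tid) {
        return 0;               /* every other thread stays silent */
    }
    format_debug_line(line, sizeof(line), my_tid, count, i);
    fputs(line, stdout);
    fflush(stdout);             /* push the line out before the thread moves on */
    return 1;
}
```

A thread would call debug_print_if(my_tid, 1, count, i) inside its loop; changing the second argument selects a different thread to trace without touching the rest of the code.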
3.6.1. GDB and Pthreads
The GDB debugger has specific support for debugging threaded programs, including setting breakpoints for individual threads and examining the stacks of individual threads. One thing to note when debugging Pthreads programs in GDB is that there are at least three identifiers for each thread:
- The Pthreads library’s ID for the thread (its pthread_t value).
- The operating system’s lightweight process (LWP) ID value for the thread. This ID is used in part for the OS to keep track of this thread for scheduling purposes.
- The GDB ID for the thread. This is the ID to use when specifying a specific thread in GDB commands.
The specific relationship between thread IDs can differ from one OS and Pthreads library implementation to another, but on most systems there is a one-to-one-to-one correspondence between a Pthreads ID, an LWP ID, and a GDB thread ID.
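To see the first two identifiers from inside a program, a thread can print its own pthread_t value and LWP ID. The sketch below is Linux-specific and rests on stated assumptions: it obtains the LWP ID with syscall(SYS_gettid), it prints pthread_t as an unsigned long (which works with glibc but is not guaranteed by POSIX), and the function names current_lwp_id, report_ids, and lwp_of_new_thread are hypothetical.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Return the calling thread's LWP ID (Linux only). */
long current_lwp_id(void) {
    return (long)syscall(SYS_gettid);
}

/* Thread function: print this thread's Pthreads ID and LWP ID, and store
 * the LWP ID through arg so that the creating thread can inspect it. */
void *report_ids(void *arg) {
    long *lwp_out = arg;

    *lwp_out = current_lwp_id();
    printf("pthread_t %lu  LWP %ld\n",
           (unsigned long)pthread_self(), *lwp_out);
    fflush(stdout);
    return NULL;
}

/* Spawn one thread, wait for it, and return its LWP ID (-1 on failure). */
long lwp_of_new_thread(void) {
    pthread_t tid;
    long lwp = -1;

    if (pthread_create(&tid, NULL, report_ids, &lwp) != 0) {
        return -1;
    }
    pthread_join(tid, NULL);
    return lwp;
}
```

When such a program runs under GDB, the printed LWP value should match the (LWP n) field that GDB reports for the same thread. On Linux with glibc, the decimal pthread_t value is the same number that GDB shows in hexadecimal after the word Thread; for instance, 140737345505024 is 0x7ffff77c4700.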
We present a few GDB basics for debugging threaded programs. See the GDB documentation for more information about debugging threaded programs in GDB.
3.6.2. GDB Thread-Specific Commands:
- Enable printing thread start and exit events:

    set print thread-events

- List all existing threads in the program (the GDB thread number is the first value listed, and the current thread, which is initially the thread that hit the breakpoint, is denoted with an *):

    info threads

- Switch to a specific thread’s execution context (for example, to examine its stack when executing where) by specifying the thread’s GDB thread ID:

    thread <threadno>

    thread 12    # Switch to thread 12's execution context
    where        # Thread 12's stack trace

- Set a breakpoint for just a particular thread. Other threads executing at the point in the code where the breakpoint is set will not trigger the breakpoint to pause the program and print the GDB prompt:

    break <where> thread <threadno>

    break foo thread 12    # Break when thread 12 executes function foo

- Apply a specific GDB command to all threads or to a subset of threads by adding the prefix thread apply <threadno | all> to the command, where threadno refers to the GDB thread ID:

    thread apply <threadno|all> command

  This doesn’t work for every GDB command, setting breakpoints in particular, so use this syntax instead for setting thread-specific breakpoints:

    break <where> thread <threadno>
Upon reaching a breakpoint, by default, GDB pauses all threads until the user types cont. The user can change this behavior to request that GDB pause only the threads that hit a breakpoint, allowing other threads to continue executing.
3.6.3. Examples:
We show some GDB commands and output from a GDB run on a multithreaded executable compiled from the file racecond.c.
This errant program lacks synchronization around accesses to the shared variable count. As a result, different runs of the program produce different final values for count, indicating a race condition. For example, here are two runs of the program with five threads that produce different results:
    ./a.out 5
    hello I'm thread 0 with pthread_id 139673141077760
    hello I'm thread 3 with pthread_id 139673115899648
    hello I'm thread 4 with pthread_id 139673107506944
    hello I'm thread 1 with pthread_id 139673132685056
    hello I'm thread 2 with pthread_id 139673124292352
    count = 159276966

    ./a.out 5
    hello I'm thread 0 with pthread_id 140580986918656
    hello I'm thread 1 with pthread_id 140580978525952
    hello I'm thread 3 with pthread_id 140580961740544
    hello I'm thread 2 with pthread_id 140580970133248
    hello I'm thread 4 with pthread_id 140580953347840
    count = 132356636
The fix is to put accesses to count inside a critical section, using a pthread_mutex_t variable.
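One way such a critical section could look is sketched below. This is not the actual racecond.c: only the loop body matches the listing that appears in the GDB session in this section, while the names count_mutex and run_workers, the long type of count, and its starting value of 0 are assumptions made for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

#define LOOP_ITERS 10000

/* Shared state. The type and initial value of count are assumptions. */
static long count = 0;
static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each worker adds 0..LOOP_ITERS-1 to the shared counter. */
void *worker_loop(void *arg) {
    (void)arg;                              /* unused in this sketch */
    for (int i = 0; i < LOOP_ITERS; i++) {
        pthread_mutex_lock(&count_mutex);   /* enter critical section */
        count += i;
        pthread_mutex_unlock(&count_mutex); /* leave critical section */
    }
    return NULL;
}

/* Hypothetical driver: run nthreads workers and return the final count,
 * or -1 if the threads could not all be created. */
long run_workers(int nthreads) {
    if (nthreads <= 0) {
        return -1;
    }
    pthread_t *tids = malloc((size_t)nthreads * sizeof(*tids));
    if (tids == NULL) {
        return -1;
    }

    count = 0;
    int started = 0;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, worker_loop, NULL) == 0) {
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(tids[t], NULL);        /* wait for every worker */
    }
    free(tids);
    return (started == nthreads) ? count : -1;
}
```

With the lock in place, every run produces the same total: each worker contributes 0 + 1 + ... + 9999 = 49,995,000, so five workers always yield 249,975,000. Locking on every iteration is simple but slow; accumulating into a local variable and adding it to count once per thread under the lock would give the same result with far less contention.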
If the problem is not apparent from examining the C code alone, running the program in GDB and setting breakpoints around accesses to the count variable may help the programmer discover it.
Here are some example commands from a GDB run of this program:
    (gdb) break worker_loop   # Set a breakpoint for all spawned threads
    (gdb) break 77 thread 4   # Set a breakpoint just for thread 4
    (gdb) info threads        # List information about all threads
    (gdb) where               # List stack of thread that hit the breakpoint
    (gdb) print i             # List values of its local variable i
    (gdb) thread 2            # Switch to different thread's (2) context
    (gdb) print i             # List thread 2's local variables i
The example that follows shows partial output from a GDB run of the racecond program with three threads (run 3), illustrating GDB thread commands in the context of a debugging session. The main thread is always GDB thread number 1, and the three spawned threads are GDB threads 2 to 4.
When debugging multithreaded programs, the GDB user must keep track of which threads exist when issuing commands. For example, when the breakpoint in main is hit, only thread 1 (the main thread) exists. As a result, the GDB user must wait until threads are created before setting a breakpoint for only a specific thread (this example shows setting a breakpoint for thread 4 only at line 77 in the program). In viewing this output, note when breakpoints are set and deleted, and note the value of each thread’s local variable i when thread contexts are switched with GDB’s thread command:
    $ gcc -g racecond.c -pthread

    $ gdb ./a.out
    (gdb) break main
    Breakpoint 1 at 0x919: file racecond.c, line 28.
    (gdb) run 3
    Starting program: ...
    [Thread debugging using libthread_db enabled] ...

    Breakpoint 1, main (argc=2, argv=0x7fffffffe388) at racecond.c:28
    28        if (argc != 2) {
    (gdb) list 76
    71        myid = *((int *)arg);
    72
    73        printf("hello I'm thread %d with pthread_id %lu\n",
    74                myid, pthread_self());
    75
    76        for (i = 0; i < 10000; i++) {
    77            count += i;
    78        }
    79
    80        return (void *)0;
    (gdb) break 76
    Breakpoint 2 at 0x555555554b06: file racecond.c, line 76.
    (gdb) cont
    Continuing.
    [New Thread 0x7ffff77c4700 (LWP 5833)]
    hello I'm thread 0 with pthread_id 140737345505024
    [New Thread 0x7ffff6fc3700 (LWP 5834)]
    hello I'm thread 1 with pthread_id 140737337112320
    [New Thread 0x7ffff67c2700 (LWP 5835)]
    [Switching to Thread 0x7ffff77c4700 (LWP 5833)]

    Thread 2 "a.out" hit Breakpoint 2, worker_loop (arg=0x555555757280) at racecond.c:76
    76        for (i = 0; i < 10000; i++) {
    (gdb) delete 2
    (gdb) break 77 thread 4
    Breakpoint 3 at 0x555555554b0f: file racecond.c, line 77.
    (gdb) cont
    Continuing.
    hello I'm thread 2 with pthread_id 140737328719616
    [Switching to Thread 0x7ffff67c2700 (LWP 5835)]

    Thread 4 "a.out" hit Breakpoint 3, worker_loop (arg=0x555555757288) at racecond.c:77
    77            count += i;
    (gdb) print i
    $2 = 0
    (gdb) cont
    Continuing.
    [Switching to Thread 0x7ffff67c2700 (LWP 5835)]

    Thread 4 "a.out" hit Breakpoint 3, worker_loop (arg=0x555555757288) at racecond.c:77
    77            count += i;
    (gdb) print i
    $4 = 1
    (gdb) thread 3
    [Switching to thread 3 (Thread 0x7ffff6fc3700 (LWP 5834))]
    #0  0x0000555555554b12 in worker_loop (arg=0x555555757284) at racecond.c:77
    77            count += i;
    (gdb) print i
    $5 = 0
    (gdb) thread 2
    [Switching to thread 2 (Thread 0x7ffff77c4700 (LWP 5833))]
    #0  worker_loop (arg=0x555555757280) at racecond.c:77
    77            count += i;
    (gdb) print i
    $6 = 1