Lecture 30 - Debugging with GDB

Goals

Learn the basics of working with gdb, the GNU Debugger

Using GDB

One of our primary tools for debugging is print statements, but they are limited

they are just a snapshot -- if we want to see values change we have to write a ton of print statements and then wade through them
if we realize there is something else it would be good to know we need to add more print statements and run the code again
they clutter our code

The debugger is a much more powerful forensic tool for finding problems.

We can operate on live code
we can set variables to be displayed after every step
we can poke around in memory in the middle of a program's execution
we can establish breakpoints at critical points in the code
etc

We will be looking at the GNU Debugger (gdb)

When debugging, we want to compile the code with two new flags

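-g - This flag adds debugging information to the executable (symbol names, types, and a mapping from machine instructions back to source lines) so the debugger can show us the C code that is being executed (generally easier than looking at the assembly...). The source itself is not copied in; gdb reads it from the source files. Just make sure you don't leave -g on for production code -- it doesn't slow the code down, but it does make the executable bigger.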
-O0 - That is a capital letter 'O' and zero. That turns off all compiler optimization. The compiler can make all kinds of optimizations in our code, including eliminating loops and whole functions. This can make for a very confusing experience when we are stepping through the original C code
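
To get a feel for what optimization can do to us, here is a small sketch (sum_to is a made-up function, not part of the lecture code). With optimization turned on, the compiler is allowed to rewrite or remove a loop like this one, so stepping with next may not stop on the lines we expect. With -O0, each statement should stay put as its own stopping point.

```c
/* Hypothetical example: adds the integers 1..n with an explicit loop. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i; /* at -O0 this statement keeps its own instructions */
    }
    return total;
}
```

One way to see the difference is to call this from a small program, build it once with -O0 and once with -O2, and compare what disassemble sum_to shows in each case.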

basic commands

run args
- short name r
- start the program and pass it args
break function name | line number [if condition]
- short name b
- sets a breakpoint in the code at the function or line number. Code execution will pause when it reaches the breakpoint
- the condition can look at variable values to determine when to stop
- remove a breakpoint with clear function name | line number, or with delete breakpoint number
- can use info break to find current breakpoints
continue
- short name c
- resumes execution after a break
next
- short name n
- goes to the next line, stepping over function calls
step
- short name s
- goes to the next line, entering any function calls
print [/f] expression
- short name p
- expression is any C expression that returns a value
- the /f is an optional formatting flag (/x will display in hex)
- many other options, but that will be enough for us
display expression
- works the same as print, but repeats every time the program stops (for example, after each step or next)
- turn off with undisplay n
examine [/Nuf] expression
- short name x
- allows us to inspect memory at the address that expression resolves to
- N is the number of elements
- u the size ((b)yte, (h)alfword, (w)ord, (g)iant)
- f, format (same as print)
list (line number | function )
- shows the lines of code around the specified line, function or where we are if nothing else is specified
backtrace
- short name bt
- show the call stack
help command
- we can ask for help for about anything
- apropos can be used if we can't remember the exact name of a command
info
- query the state of the program or our session
- some options
  - break - show the active breakpoints
  - reg register - show the register state
  - locals - show all of the local variables
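
As a rough mental model for examine, the sketch below does by hand what x /Nxb does: it walks N bytes starting at an address and shows each one in hex. The helper name format_bytes and its signature are made up for illustration. This is not how gdb is implemented, and gdb's real output also includes an address column.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes n bytes starting at addr into out as
   space-separated hex values such as "0x2a 0x00".
   Returns the number of characters written, or -1 if out is too small. */
int format_bytes(const void *addr, size_t n, char *out, size_t out_size) {
    const unsigned char *p = addr;
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* Put a space before every byte except the first. */
        int written = snprintf(out + used, out_size - used, "%s0x%02x",
                               i ? " " : "", (unsigned)p[i]);
        if (written < 0 || (size_t)written >= out_size - used) {
            return -1; /* formatting failed or output was truncated */
        }
        used += (size_t)written;
    }
    return (int)used;
}
```

The N and u parts of examine correspond to the count and the element size here (this sketch is fixed at one byte per element), and f corresponds to the hex formatting.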

example

simple function

We can start by looking in the code we wrote for lecture 24

int sum(int x, int y){
  int result;
  result = x + y;
  return result;

}


int main(int argc, char * argv[]){
  int a,b,c;
  a = 1;
  b = 2;
  c = sum(a,b);
}

compile for debugging and start gdb

$ gcc -g -O0 -o func1 func1.c
$ gdb func1

look at the available functions (this will also print out functions not in the code)

(gdb) info functions
All defined functions:

File func1.c:
10:     int main(int, char **);
2:      int sum(int, int);

set a breakpoint

(gdb) b sum
Breakpoint 1 at 0x401110: file func1.c, line 4.

run the program (I trimmed out the part about debuginfo)

(gdb) r
Starting program: /home/candrews/cs202/s23/debugging/func1

Breakpoint 1, sum (x=1, y=2) at func1.c:4
4         result = x + y;

look at the original source

(gdb) list
1
2       int sum(int x, int y){
3         int result;
4         result = x + y;
5         return result;
6
7       }
8
9
10      int main(int argc, char * argv[]){

broken float

I have a program that prints out all of the floats we can get with an eight bit number. When we run it, the output looks a little strange. The negative numbers are all really big compared to the positive numbers. They should be symmetric.

The problem is the negative numbers, so let's set a breakpoint at the point where the number becomes negative.

b 21 if sign == 1

When we run the code, it will zip through all of the positive numbers and the special ones, then stop at the first negative number.

Let's take a look at the local variables

(gdb) info locals
sign = 1
exponent = 8
mantissa = 0
result = 0

Anyone see a problem?

How can the exponent be 8? It is only 3 bits!

(gdb) list
16        int sign = (f >> 7) & 1;
17        int exponent = (f >> 4);
18        int mantissa = f & 0xF;
19        float result;
20
21        if (exponent == 0){
22          result = mantissa / 16.0f;
23
24          result = result * pow(2, exponent - 3);
25
(gdb) p /x f
$2 = 0x80
(gdb) p f >> 4
$3 = 8

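Why are we getting an 8? After the shift, the sign bit is sitting just above the three exponent bits, and we never mask it off, so it gets included in the exponent. Masking the shifted value with 0x7 keeps only the three bits we want.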
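
A minimal sketch of the fix, assuming f is an 8-bit unsigned value laid out as 1 sign bit, 3 exponent bits, and 4 mantissa bits. The fields8 struct and the unpack_fields function are made up for illustration; the original program keeps these values in local variables.

```c
#include <stdint.h>

/* Hypothetical container for the three fields of the 8-bit float. */
typedef struct {
    int sign;
    int exponent;
    int mantissa;
} fields8;

/* Splits f into its fields. The type of f is an assumption. */
fields8 unpack_fields(uint8_t f) {
    fields8 out;
    out.sign = (f >> 7) & 1;
    out.exponent = (f >> 4) & 0x7; /* mask off the sign bit: keep 3 bits */
    out.mantissa = f & 0xF;
    return out;
}
```

With the mask in place, f = 0x80 unpacks to sign 1, exponent 0, and mantissa 0, instead of the exponent of 8 we saw in the debugger.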

Debugging assembly

What if you are working with code that doesn't have debugging data included? gdb works there as well

More commands

stepi and nexti - just like step and next, but they advance by one machine instruction instead of one source line (short forms si and ni)
disassemble function name
- print out the assembly for the function
info reg
- can name specific registers, with or without a $ in front (e.g., info reg rax); in expressions for print and display the $ is required (e.g., p $rax)
- we can look at the condition codes with info reg eflags
break * - break works the same way, but we need to put an * in front of the name or address to specify an instruction instead of a line number

general strategy

if we know function names we can set breakpoints in the usual way
there is always a main, so you can start there
we can disassemble main to find the function calls

example

I have a mystery program we are trying to figure out. If I run it, I get this:

$ ./mystery
Usage: ./mystery <number>
$ ./mystery 42
3

Curious.

So, let's poke around in gdb to see if we can't figure out what the program does

Start by disassembling main

(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040116b <+0>:     push   %rbp
   0x000000000040116c <+1>:     mov    %rsp,%rbp
   0x000000000040116f <+4>:     sub    $0x20,%rsp
   0x0000000000401173 <+8>:     mov    %edi,-0x14(%rbp)
   0x0000000000401176 <+11>:    mov    %rsi,-0x20(%rbp)
   0x000000000040117a <+15>:    cmpl   $0x2,-0x14(%rbp)
   0x000000000040117e <+19>:    je     0x4011a3 <main+56>
   0x0000000000401180 <+21>:    mov    -0x20(%rbp),%rax
   0x0000000000401184 <+25>:    mov    (%rax),%rax
   0x0000000000401187 <+28>:    mov    %rax,%rsi
   0x000000000040118a <+31>:    mov    $0x402010,%edi
   0x000000000040118f <+36>:    mov    $0x0,%eax
   0x0000000000401194 <+41>:    call   0x401030 <printf@plt>
   0x0000000000401199 <+46>:    mov    $0xffffffff,%edi
   0x000000000040119e <+51>:    call   0x401050 <exit@plt>
   0x00000000004011a3 <+56>:    mov    -0x20(%rbp),%rax
   0x00000000004011a7 <+60>:    add    $0x8,%rax
   0x00000000004011ab <+64>:    mov    (%rax),%rax
   0x00000000004011ae <+67>:    mov    %rax,%rdi
   0x00000000004011b1 <+70>:    call   0x401040 <atoi@plt>
   0x00000000004011b6 <+75>:    mov    %eax,-0x4(%rbp)
   0x00000000004011b9 <+78>:    mov    -0x4(%rbp),%eax
   0x00000000004011bc <+81>:    mov    %eax,%edi
   0x00000000004011be <+83>:    call   0x401146 <calculate>
   0x00000000004011c3 <+88>:    mov    %eax,-0x8(%rbp)
   0x00000000004011c6 <+91>:    mov    -0x8(%rbp),%eax
   0x00000000004011c9 <+94>:    mov    %eax,%esi
   0x00000000004011cb <+96>:    mov    $0x402024,%edi
   0x00000000004011d0 <+101>:   mov    $0x0,%eax
   0x00000000004011d5 <+106>:   call   0x401030 <printf@plt>
   0x00000000004011da <+111>:   mov    $0x0,%eax
   0x00000000004011df <+116>:   leave
   0x00000000004011e0 <+117>:   ret
End of assembler dump.

Start by looking at the call instructions.

There are five function calls

2x printf - so printing things out
exit - we haven't used this yet, but it does what it says -- exits the program
atoi - we haven't used this one either, but it is another standard library function (it converts a string to an int, like a simpler strtol with no error reporting)
calculate - that looks like an actual function in the code

We can also see a conditional jump at main+19. If we trace that back a little we can see that it is checking if argc is 2. The printf and the exit that follow are probably printing the usage message and exiting.

We can check this. Right before the call to printf, we can see it loading the argument registers. One of them (%rsi) is getting the address of the first string in argv.

Let's take a look at the other one, the constant 0x402010 loaded into %edi. One of the formats we can use with x is s, for C strings.

(gdb) x /s 0x402010
0x402010:       "Usage: %s <number>\n"

Confirmed, we are printing the usage message

So, jumping down under that to main+56 we can see it unpacking argv again. It is adding 0x8, so we are probably looking at the second argument now. Then it calls atoi, so it is parsing the number string into a number.
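
The three instructions from main+56 through main+64 can be written out in C as plain pointer arithmetic. This is only a sketch: arg_at_offset is a made-up helper, and we are assuming that the 8 in the assembly is sizeof(char *), which holds on x86-64.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the assembly: start at argv, move forward
   by byte_offset bytes, and load the pointer stored there. */
const char *arg_at_offset(char *argv[], size_t byte_offset) {
    char *base = (char *)argv;                   /* mov -0x20(%rbp),%rax */
    char **slot = (char **)(base + byte_offset); /* add $0x8,%rax        */
    return *slot;                                /* mov (%rax),%rax      */
}
```

Calling arg_at_offset(argv, sizeof(char *)) returns the same pointer as argv[1], which is the string the program then hands to atoi.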

It looks like calculate only takes a single argument, since %edi is the only argument register loaded before the call.

Set a breakpoint right before the call and then start the program

(gdb) b *main+83
Breakpoint 1 at 0x4011be
(gdb) r 42
Starting program: /home/candrews/cs202/s23/debugging/mystery 42
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, 0x00000000004011be in main ()

Check out the argument

(gdb) p $edi
$1 = 42

Okay, so that is confirmed, we are going to call calculate(42)

Let's set a new breakpoint and jump to calculate (yes, we could just step there)

(gdb) b calculate
Breakpoint 2 at 0x40114a
(gdb) c
Continuing.

Breakpoint 2, 0x000000000040114a in calculate ()

Let's take a look around:

(gdb) disassemble
Dump of assembler code for function calculate:
   0x0000000000401146 <+0>:     push   %rbp
   0x0000000000401147 <+1>:     mov    %rsp,%rbp
=> 0x000000000040114a <+4>:     mov    %edi,-0x14(%rbp)
   0x000000000040114d <+7>:     movl   $0x0,-0x4(%rbp)
   0x0000000000401154 <+14>:    mov    -0x14(%rbp),%eax
   0x0000000000401157 <+17>:    and    $0x1,%eax
   0x000000000040115a <+20>:    add    %eax,-0x4(%rbp)
   0x000000000040115d <+23>:    sarl   -0x14(%rbp)
   0x0000000000401160 <+26>:    cmpl   $0x0,-0x14(%rbp)
   0x0000000000401164 <+30>:    jne    0x401154 <calculate+14>
   0x0000000000401166 <+32>:    mov    -0x4(%rbp),%eax
   0x0000000000401169 <+35>:    pop    %rbp
   0x000000000040116a <+36>:    ret
End of assembler dump.

Notice that the breakpoint put us at calculate+4, so we have skipped the stack frame setup steps.

We could just try to walk through this and handle the values in our head, but let's use the debugger

First we will set up some values to watch

(gdb) display $eax
1: $eax = 42
(gdb) display /wd $rbp-0x14
2: x/dw $rbp-0x14  0x7fffffffd74c:      0
(gdb) display /wd $rbp-0x4
3: x/dw $rbp-0x4  0x7fffffffd75c:       32767
(gdb) display /i $pc
4: x/i $pc
=> 0x40114a <calculate+4>:      mov    %edi,-0x14(%rbp)

Notice how we turned the memory accesses into expressions. You should also notice that the current values in memory are garbage...

The last display shows the instruction at the address in the PC, so we can see what the next instruction will be.

(gdb) ni
0x000000000040114d in calculate ()
1: $eax = 42
2: x/dw $rbp-0x14  0x7fffffffd74c:      42
3: x/dw $rbp-0x4  0x7fffffffd75c:       32767
4: x/i $pc
=> 0x40114d <calculate+7>:      movl   $0x0,-0x4(%rbp)
(gdb) ni
0x0000000000401154 in calculate ()
1: $eax = 42
2: x/dw $rbp-0x14  0x7fffffffd74c:      42
3: x/dw $rbp-0x4  0x7fffffffd75c:       0
4: x/i $pc
=> 0x401154 <calculate+14>:     mov    -0x14(%rbp),%eax
(gdb) ni
0x0000000000401157 in calculate ()
1: $eax = 42
2: x/dw $rbp-0x14  0x7fffffffd74c:      42
3: x/dw $rbp-0x4  0x7fffffffd75c:       0
4: x/i $pc
=> 0x401157 <calculate+17>:     and    $0x1,%eax
(gdb)
0x000000000040115a in calculate ()
1: $eax = 0
2: x/dw $rbp-0x14  0x7fffffffd74c:      42
3: x/dw $rbp-0x4  0x7fffffffd75c:       0
4: x/i $pc
=> 0x40115a <calculate+20>:     add    %eax,-0x4(%rbp)
(gdb) ni
0x000000000040115d in calculate ()
1: $eax = 0
2: x/dw $rbp-0x14  0x7fffffffd74c:      42
3: x/dw $rbp-0x4  0x7fffffffd75c:       0
4: x/i $pc
=> 0x40115d <calculate+23>:     sarl   -0x14(%rbp)
(gdb)
0x0000000000401160 in calculate ()
1: $eax = 0
2: x/dw $rbp-0x14  0x7fffffffd74c:      21
3: x/dw $rbp-0x4  0x7fffffffd75c:       0
4: x/i $pc
=> 0x401160 <calculate+26>:     cmpl   $0x0,-0x14(%rbp)

It looks like we are in a loop

AND the value with 1
ADD the result to a second variable
shift the value right 1

So this is... counting the 1s in the binary representation of the number!

(gdb) p /x 42
$1 = 0x2a

42 is 0x2a or 0010 1010, thus the 3
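
Putting the pieces together, here is one way the C source for calculate might have looked. This is a reconstruction from the assembly, not the original code: the names value and count are guesses, and the do-while shape is inferred from the fact that the test comes at the bottom of the loop.

```c
/* Reconstructed sketch: counts the 1 bits in value. */
int calculate(int value) {
    int count = 0;          /* movl $0x0,-0x4(%rbp)                      */
    do {
        count += value & 1; /* and $0x1,%eax ; add %eax,-0x4(%rbp)       */
        value >>= 1;        /* sarl -0x14(%rbp)                          */
    } while (value != 0);   /* cmpl $0x0,-0x14(%rbp) ; jne calculate+14  */
    return count;           /* mov -0x4(%rbp),%eax                       */
}
```

calculate(42) returns 3, matching what the program printed. One thing to notice in the assembly is that sarl is an arithmetic shift, so a negative value would never shift down to zero and the loop would not terminate.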

Mechanical level

vocabulary

Skills

Last updated 05/12/2023