Debugging Tip – Use gdb To Map Assembly With Source Code

With the latest version of gdb (version 7 and above), one can easily map the source code and the assembly listing.  The /m switch when used with the disassemble command gives the assembly listing along with the source code when available.

(gdb) help disassemble
Disassemble a specified section of memory.
Default is the function surrounding the pc of the selected frame.
With a /m modifier, source lines are included (if available).

A sample listing of the disassemble command with the /m switch is pasted below:

(gdb) disassemble /m

11 int y = x +33;
0x004010c1 <+49>: mov -0x4(%ebp),%eax
0x004010c4 <+52>: add $0x21,%eax
0x004010c7 <+55>: mov %eax,-0x8(%ebp)

12 printf("%d\n",y);

0x004010ca <+58>: mov -0x8(%ebp),%eax
0x004010cd <+61>: mov %eax,0x4(%esp)
0x004010d1 <+65>: movl $0x402020,(%esp)
0x004010d8 <+72>: call 0x401198 <printf>

13 return 0;
0x004010dd <+77>: mov $0x0,%eax

14 }
0x004010e2 <+82>: leave
0x004010e3 <+83>: ret 

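Note that each block of assembly corresponds to the source line just above it. Source lines appear only when the program was built with debug information (for example, with the -g option of gcc).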

gdb – Assembly Language Debugging 101

This article introduces the basic commands required to debug assembly language in gdb.

Note: If you would like to understand an assembly language listing, jump to this article first.

Disassemble

The disassemble command provides the assembly language listing of a program and works even when a program is not running. The command works for a function name / an address / an address range.

disassemble function_name disassembles the function called function_name.

disassemble main

Dump of assembler code for function main:
0x00401180 <main+0>:    lea    0x4(%esp),%ecx
0x00401184 <main+4>:    and    $0xfffffff0,%esp
0x00401187 <main+7>:    pushl  -0x4(%ecx)
0x0040118a <main+10>:   push   %ebp
0x0040118b <main+11>:   mov    %esp,%ebp
0x0040118d <main+13>:   push   %esi
0x0040118e <main+14>:   push   %ebx
------8<---------<snip>---------8<---------------

The disassemble command can also be used for a specific address.

disassemble 0x0040120f

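At times the disassembly listing of a function can get very long. To limit it, an address range can be provided as shown below. Older gdb releases accept the two addresses separated by a space; gdb 7 and later expects a comma between them (for example, disassemble main,main+20).

disassemble <from_address1> <to_address2>

disassemble main main+20

disassemble 0x004011ce 0x004011f7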

Developers who have debugged in assembly on various debuggers on the Windows platform may prefer the Intel syntax instead of the AT&T syntax, which is the default in gdb.  The listing can be switched to the Intel syntax by setting the disassembly-flavor.

set disassembly-flavor intel

disassemble

Dump of assembler code for function main:
0x00401180 <main+0>:    lea    ecx,[esp+0x4]
0x00401184 <main+4>:    and    esp,0xfffffff0
0x00401187 <main+7>:    push   DWORD PTR [ecx-0x4]
0x0040118a <main+10>:   push   ebp
0x0040118b <main+11>:   mov    ebp,esp
0x0040118d <main+13>:   push   esi
0x0040118e <main+14>:   push   ebx
-----8<---------<snip>---------8<--------------

Controlling The Flow Of Program Execution

An instruction breakpoint can be set at a particular address using the break command.

break *0x0040118d

Take note of the asterisk just before the address above.

To step into the assembly language one instruction at a time use the command:

stepi

Note that this will step into function calls that the program encounters.

To step over a function call, one can use the command:

nexti

To return from a function call that one is currently stepping through, use the command:

finish

Gathering Information

To know about the values of the registers of the program being debugged use the following command:

info registers

$pc holds the program counter and it can also be used to find the instruction that will be executed next.

x/i $pc

0x40124e <main+206>:    mov    eax,ebx

Similar to regular debugging, the backtrace command prints the callstack.

bt

To get a list of the shared libraries that are loaded for the current program being debugged the following command is handy to use:

info sharedlibrary

From        To          Syms Read   Shared Object Library
0x779e1000  0x77b1bc3c  Yes         /cygdrive/c/Windows/system32/ntdll.dll
0x77901000  0x779d30c0  Yes         /cygdrive/c/Windows/system32/kernel32.dll
0x75c41000  0x75c895d0  Yes         /cygdrive/c/Windows/system32/KernelBase.dll
0x61001000  0x61450000  Yes         /usr/bin/cygwin1.dll
-----8<---------<snip>---------8<--------------

 

Note: A handy tip on using the display command during assembly language debugging with gdb was shared in a previous blog entry here.

 

Tips For Investigating Performance Issues

Performance issues usually take the longest to resolve and require a lot of patience, luck and testing iterations before a satisfactory outcome is reached.  Performance tuning is a part of most development projects that have a substantial code base or do a very CPU intensive job.

This article provides a few basic tips that development and testing teams can follow to reduce the time required to resolve such issues.

Reproduce The Performance Issue

Well this is true for any bug but more so for performance issues.    Time reported in an original bug report may not necessarily match with that on a tester’s machine which in turn won’t necessarily match with that on a developer’s machine.   These numbers depend on the state of a machine, the hardware used and the version of the software being used to report the problem.

Therefore it is important to recalibrate the numbers with the latest sources and see whether the percentage of degradation reported matches with the original bug report or not.
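As a rough illustration of the recalibration step, the sketch below computes the percentage of degradation from two timings taken on the same machine. The function name and the use of seconds as the unit are assumptions made for this example.

```cpp
#include <cassert>

// Percentage by which `current_seconds` is slower than `baseline_seconds`.
// A negative result means the newer build is faster.
double percent_degradation(double baseline_seconds, double current_seconds) {
    assert(baseline_seconds > 0.0);  // a zero baseline makes the ratio meaningless
    return (current_seconds - baseline_seconds) / baseline_seconds * 100.0;
}
```

If the figure computed with the latest sources is far from the one in the bug report, the difference itself is worth investigating before any profiling starts.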

Find A Build Or Change That Introduced The Issue

A performance issue does not necessarily require intense profiling to arrive at the cause of the issue.   At times it is important to be able to identify the build number or the change in source code that has caused the degradation.   Knowing a range of changes or the exact change that caused the degradation amounts to solving half the problem.

Most of the time spent resolving a performance problem goes into finding the bottleneck; the rest goes into fixing it.    A profiler is not the only means, nor always the quickest, of finding the bottleneck.   The testing team can usually identify the build where the problem was first introduced.  Likewise, the development team should try to narrow down the exact source change that introduced the problem.
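Narrowing down the offending build can be done as a binary search over build numbers. The sketch below is one way to express that, assuming builds are numbered consecutively, that the first build is known to be fast and the last known to be slow, and that the slowdown persists once introduced. The names first_bad_build and is_slow are hypothetical; is_slow stands for whatever timing check the team runs on a given build.

```cpp
#include <cassert>
#include <functional>

// Returns the first build for which `is_slow` reports true.
// Precondition: `good` is a build known to be fast, `bad` one known to be slow.
int first_bad_build(int good, int bad, const std::function<bool(int)>& is_slow) {
    assert(good < bad);
    while (bad - good > 1) {
        const int mid = good + (bad - good) / 2;
        if (is_slow(mid)) {
            bad = mid;   // regression is at mid or earlier
        } else {
            good = mid;  // regression was introduced after mid
        }
    }
    return bad;
}
```

With a hundred builds in the range, this needs about seven timing runs instead of a hundred.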

Use Release Builds Only

Do not compute performance results or perform investigations on a debug build. With optimizations turned off and extra debug code, numbers from debug builds are neither accurate nor reliable.

The reason many teams use the debug build is because the profiles generated using debug symbols look more meaningful to them as programmers. Use a release build with debug symbols instead.

Use The Same Machine

If you are comparing performance between two different versions, use the same machine to compute the numbers. This ensures that differences in processor speed, state of the machine, etc do not play a role in the difference in timings.

Catch Issues Early

The testing team should regularly run performance tests and report issues immediately. This is a huge time saver and helps narrow down the problematic code early.   All results and binaries used during testing should be preserved so that if an issue is found late in the cycle, it is still possible to narrow down the problematic build number.

Performance suites take time to evolve but the time spent in putting one together is worth the dividends it pays in the long run.

Profile And Know Your Profiler

Though it may sound simple to narrow down the source and solve a performance issue, it is pretty common practice to use a profiler to determine the bottlenecks in code.   The idea is not to jump to the step of profiling without trying to narrow down problems through other means.  Once you know you require a profiler, make sure you understand the technique your profiler uses behind the scenes.

Some profilers “instrument” code and insert extra code to collect the timing information. Profilers of this category are accurate for relative comparison but may crash if they do a bad job at instrumenting the code.

Some profilers “sample” the state of the program by collecting the call stack of the program at regular intervals. These profilers indicate the area of the code where the program spends the maximum time without modifying the running program. Here the sample interval and duration of sampling determine the usefulness of the generated profile.

It is important to understand how your profiler collects the data which further aids in its better interpretation.
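To make the "instrumenting" approach concrete, here is a minimal hand-written version of what such a profiler inserts automatically: a timer that starts on scope entry and adds the elapsed time to a running total on scope exit. ScopedTimer and busy_work are names invented for this sketch and do not correspond to any particular profiler.

```cpp
#include <cassert>
#include <chrono>

// Adds the lifetime of this object, in seconds, to `total_seconds`.
class ScopedTimer {
public:
    explicit ScopedTimer(double& total_seconds)
        : total_(total_seconds), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        total_ += std::chrono::duration<double>(elapsed).count();
    }

private:
    double& total_;
    std::chrono::steady_clock::time_point start_;
};

// Example of an "instrumented" function: the timer is the inserted code.
long long busy_work(int n, double& total_seconds) {
    assert(n >= 0);
    ScopedTimer timer(total_seconds);
    long long sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += i;
    }
    return sum;
}
```

The timer itself costs time on every call, which is why instrumented numbers are best used for relative comparison rather than as absolute timings.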

Conclusion

Performance and profiling do go together, but it helps to follow a few steps and to set up some processes in the development cycle that quickly and accurately narrow down the source of the problem.

Debugging – Modifying Code At Runtime

Introduction

The build and debug cycle can be tedious especially when you are unsure whether the change you have in mind solves your problem or not. Sometimes it is good to be able to tweak the logic of your code during a debugging session. This article explains how to make minor adjustments to code without having to rebuild your application.

The article requires tweaking the assembly and dealing with opcodes. If you are not very familiar with debugging in assembly, you may want to read this article first. The tips shared here are also useful while debugging libraries where you either do not have the source code or the build environment is unavailable.

Debuggers work on the common principle of launching the executable as a child program and requesting special permissions from the operating system to access and alter the state of the program. This allows the debugger to show and/or modify the memory, registers, code, etc. in a debugging session. If a debugger attaches itself to another program already loaded in memory, it requests special permission from the operating system to take control of the program for the purpose of debugging. This also explains why two debuggers cannot attach to the same program in memory at the same time. The operating system grants the request on a first-come, first-served basis and refuses permission to the second debugger.

The debugger’s access to a program’s memory and registers not only allows users to view the state of the program but also to set software breakpoints. Software breakpoints are set by altering and restoring the instructions in memory and have already been covered in depth in an earlier article.
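The "alter and restore" idea can be simulated on a plain byte buffer. The sketch below assumes an x86 target, where debuggers commonly plant the one-byte int3 instruction (opcode 0xCC); the buffer stands in for the program's code, and the type and function names are made up for this example rather than taken from any debugger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kInt3 = 0xCC;  // x86 one-byte breakpoint instruction

// Remembers what was overwritten so it can be put back later.
struct SoftBreakpoint {
    std::size_t offset;
    std::uint8_t saved_byte;
};

// Overwrite one byte of "code" with int3 and return the saved original.
SoftBreakpoint insert_breakpoint(std::vector<std::uint8_t>& code, std::size_t offset) {
    assert(offset < code.size());
    const SoftBreakpoint bp{offset, code[offset]};
    code[offset] = kInt3;
    return bp;
}

// Restore the original byte, which removes the breakpoint.
void remove_breakpoint(std::vector<std::uint8_t>& code, const SoftBreakpoint& bp) {
    assert(bp.offset < code.size());
    code[bp.offset] = bp.saved_byte;
}
```

A real debugger performs the same two writes through the operating system's debugging interface instead of on a local vector.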

How to modify code at runtime

Altering instructions without compilation of code during debugging works on a similar logic. If one uses the debugger to alter instructions of the program which it controls, then one can essentially change the logic compiled into the executable that is currently loaded in memory. It is worth noting that altering a program in memory is temporary. When the program is restarted and reloaded in memory, all changes are lost and no real harm has been done. As this requires directly tweaking the opcodes in memory, one needs to know how to deal with assembly and this cannot be accomplished with knowledge of a high level language alone.
Independent of the operating system and debugger, the steps needed to change the logic from within the debugger are as follows -

  1. Launch the program in a debugger.
  2. Set a breakpoint in the code at the location where you want to alter execution.
  3. Execute the program and drive the program so that your breakpoint is hit.
  4. Request the debugger to display the disassembly of the code.
  5. Get comfortable with the source-assembly mapping.
  6. Identify the address of the assembly line you would like to alter. Altering usually means changing an if statement to if NOT or “jump if zero” to “jump if not zero”.
  7. Take the address and dump the memory contents at that address. You will see opcodes corresponding to the assembly you just saw.
  8. Modify the memory location with new opcodes (more on this later).
  9. Cross your fingers so that your change won’t cause a crash :-)
  10. Disable the breakpoint so that execution doesn’t stop too many times at this breakpoint as it makes debugging distracting.
  11. Continue execution and now the program will respond to the new logic in the program.
  12. To undo the effect of change, restart the program.

Example

Let us look at the following simple snippet of code below.


bool flag = true;
if( !flag )
    cout << "flag is false" ;
else
    cout << "flag is true" ;


The code when compiled and run normally would print “flag is true” and this is obvious from the code above. Say we at runtime would like to alter the logic by replacing if(!flag) with if(flag) so that the statement “flag is false” is printed instead.
Let us see how to do this with the two most commonly available debuggers.

GDB

The steps below are in the same order as the generic steps explained earlier.

  • gdb <myprogram>
  • break <line_number_of_if_statement>
  • run   Code will stop at the breakpoint set above.
  • disassemble. The dump on my machine is as follows -

disassemble

0x00401170 <main+32>:   call   0x40f260 <_alloca>
0x00401175 <main+37>:   call   0x410410 <__main>
0x0040117a <main+42>:   movb   $0x1,-0x1(%ebp)
0x0040117e <main+46>:   cmpb   $0x0,-0x1(%ebp)
0x00401182 <main+50>:   jne    0x40119a <main+74>
0x00401184 <main+52>:   movl   $0x443000,0x4(%esp)
0x0040118c <main+60>:   movl   $0x4463a0,(%esp)
0x00401193 <main+67>:   call   0x43e720 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc>
0x00401198 <main+72>:   jmp    0x4011ae <main+94>
0x0040119a <main+74>:   movl   $0x44300e,0x4(%esp)
0x004011a2 <main+82>:   movl   $0x4463a0,(%esp)
0x004011a9 <main+89>:   call   0x43e720 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc>
0x004011ae <main+94>:   mov    $0x0,%eax
  • Note that main+67 and main+89 are the calls that write to cout. We have now identified the two important statements, and the if check should be just above them. It usually helps to correlate the assembly with the source code in this manner.
  • main+42 stores true in flag. The if check is the compare at main+46 followed by the jne at main+50, which jumps to the else branch when flag is non-zero. If we change jne to je, we negate the if condition and thus alter the logic of the code. The address of the jne instruction above is 0x00401182.
  • x/6xb 0x00401182 will dump six bytes (opcodes and operands) in hexadecimal; the b suffix asks for byte-sized units. In the dump below, 0x75 is the opcode for jne and the operand is 0x16. For details, refer to the Intel instruction set.
0x401182 <main+50>: 0x75 0x16 0xc7 0x44 0x24 0x04
  • set *(char*)0x00401182=0x74 where 0x74 is the opcode for je. Effectively “jump if not equal” has been changed to “jump if equal” thereby inverting the logic of the if statement.
  • cross your fingers.
  • disable 1. Disable breakpoint set earlier.
  • continue

You will observe that as we have modified the runtime code using gdb, the output of the program after the above steps would be “flag is false”.
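The byte-level change made above can be reproduced on a copy of the dumped bytes. The following sketch assumes the short (two-byte) forms of je and jne on x86, with opcodes 0x74 and 0x75 followed by a signed 8-bit displacement; the function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kJe = 0x74;   // short "jump if equal"
constexpr std::uint8_t kJne = 0x75;  // short "jump if not equal"

// Target of a two-byte short jump located at `address`: the displacement is
// relative to the address of the next instruction (address + 2).
std::uint32_t short_jump_target(std::uint32_t address, std::uint8_t rel8) {
    return address + 2 + static_cast<std::int8_t>(rel8);
}

// Swap je <-> jne at `offset`. Returns false if the byte is neither opcode.
bool invert_short_je_jne(std::vector<std::uint8_t>& code, std::size_t offset) {
    assert(offset < code.size());
    if (code[offset] == kJne) {
        code[offset] = kJe;
        return true;
    }
    if (code[offset] == kJe) {
        code[offset] = kJne;
        return true;
    }
    return false;
}
```

For the jne at 0x00401182 with operand 0x16, short_jump_target gives 0x0040119a, which matches the jump destination main+74 in the disassembly. Only the opcode byte changes; the displacement is left alone, so the jump still lands on the same address.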

Visual Studio

Assembly Screen Shot in Visual Studio
The above screen shot shows the debugger displaying the assembly after hitting the breakpoint at the if statement. Do make sure that when you right click in the assembly, “Show Code Bytes” and “Show Address” are checked.
From the screen shot above, the address of the jne instruction is 0x003314B8 and its opcode is 0x75. If we modify the opcode at location 0x003314B8 and change it to 0x74 (opcode for je), we effectively negate the if statement at runtime.

To do this, open the memory window (Debug->Windows->Memory->Memory 1) and enter 0x003314B8 in the Address field. The first byte shows as 0x75. Put the cursor on 0x75 and type 74 to change the opcode. You can confirm this by seeing the effect in the disassembly window.

This modifies the code in memory and now when the program execution is continued, the output would be “flag is false”.

Conclusion

Another useful tip to keep in mind is that a lot of code can be disabled at runtime if all corresponding opcodes and operands are replaced with NOP, which has the opcode 0x90 on an Intel machine.  Similarly useful results can also be achieved by altering the operands instead of the opcodes.
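As a sketch of the NOP technique, the helper below overwrites a run of bytes in a copy of the code with 0x90. It assumes the caller has already worked out where the instructions begin and end, since overwriting only part of an instruction leaves garbage behind; nop_out is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // one-byte NOP on x86

// Replace `length` bytes starting at `offset` with NOPs.
// The range must cover whole instructions (opcode plus operands).
void nop_out(std::vector<std::uint8_t>& code, std::size_t offset, std::size_t length) {
    assert(offset <= code.size() && length <= code.size() - offset);
    std::fill_n(code.begin() + static_cast<std::ptrdiff_t>(offset), length, kNop);
}
```

Applied to the two bytes of the jne in the gdb example (0x75 0x16), this removes the jump entirely, so execution always falls through to the instruction that follows it.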

The ability to change code at runtime is an effective tool for quick debugging though it requires a good understanding of the code at the assembly level.   This technique is meant for small tweaks because it is faster to recompile in case complex changes in assembly are needed.   Moreover, when the program finishes execution, the modification needs to be redone as this technique does not modify the image on the disk.

Assembly And The Art Of Debugging

This article is an introduction to understanding Intel’s assembly language while debugging a computer program.

Debugging in assembly can be a daunting task and not every developer likes to make sense of the mnemonics that appear on the screen.

While debugging, one is normally used to viewing local variables, parameters  and syntax highlighted code in the most user friendly manner.  This is the primary reason why many developers dislike debugging in assembly initially.

Why Debug In Assembly?

Debugging in assembly is not an optional skill to have.   Every developer encounters a situation where there is no alternative other than cracking open the assembly code.  Here are a few reasons why a developer should get their hands dirty with this skill -

  1. You have a binary and its corresponding code but the build environment is not functional.  This could be because of missing tools, unavailable compilers or the lack of understanding of complex scripts and steps needed to recreate the binary.
  2. You are dealing with a release build only bug and the debug symbols are unavailable or you are dealing with a third party library and don’t have access to its source code.
  3. You like to have fun :-)

What You Need?

  1. A debugger that can display assembly language (Visual Studio, WinDbg, gdb, XCode are capable debuggers).
  2. A ready reference to the Intel instruction set.  Google works just fine.
  3. Patience and maybe a link to this article.  :-)

The key to debugging in assembly is to understand how functions are called, how parameters are passed and how local variables are accessed.  The code flow and logic can be understood by looking up the Intel instruction set.

The diagram below shows roughly what a stack looks like when a function is called.  Mapping the assembly code to this diagram should simplify the debugging experience.  Note that the stack grows from top to bottom.

assembly_stack

Passing Parameters To A Function

Parameters are the first to be pushed onto the stack.  The caller of the function pushes them from right to left.

Cleanup Of Parameters

Parameter cleanup may happen in the function that was called (e.g. stdcall calling convention) or by the caller of the function (e.g. cdecl).  Various kinds of calling conventions of the x86 architecture are explained in detail in this wiki article: http://en.wikipedia.org/wiki/X86_calling_conventions   Stack cleanup code right after the function call gives a good idea of the calling convention being used.

Take note that if cleanup happens in the function being called, a variable number of arguments cannot be passed to it.  In contrast, a variable number of arguments can be supported if the parameter cleanup happens outside the function.  This essentially is the main difference between the stdcall and cdecl calling conventions on an x86 architecture.
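A variadic function shows why caller cleanup matters: the function body is compiled once, while the number of arguments pushed differs at every call site, so only the caller knows how many bytes to discard afterwards. The sketch below is an ordinary C-style variadic function; sum_ints is a name made up for this example.

```cpp
#include <cassert>
#include <cstdarg>

// Adds up `count` int arguments. The function only learns the argument
// count at runtime, through its first parameter.
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

Calls such as sum_ints(2, 1, 2) and sum_ints(3, 10, 20, 12) push different amounts of data, and on a 32-bit cdecl build one would expect each call to be followed by its own stack adjustment in the caller.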

The Function Call

Before control is passed to the function, the return address (where the caller is supposed to resume after the function completes execution) is pushed on the stack.

To recap, parameters are pushed first, followed by the return address.  Later depending on the calling convention of the function, you may see cleanup happening right after the function call or within the function itself.

In the stack diagram above, as the stack grows downwards, the parameters are at the top followed by the return address.

Example

Take a look at the snippet and the equivalent assembly below :

[1]int a = 4;
[2]mov         dword ptr [a],4
[3]
[4]char c = 0;
[5]mov         byte ptr [c],0
[6]
[7]c = f(a, 22);
[8]push        16h
[9]mov         eax,dword ptr [a]
[10]push        eax
[11]call        f (4111D1h)
[12]add         esp,8
[13]mov         byte ptr [c],al
  • In line [8], the constant value 22 (16 hex) is being pushed on the stack.
  • In line [9-10] the variable ‘a’ is pushed.
  • Line [11] is the function call to f() which implicitly pushes the return address.
  • The return value of the function is passed in a register and copied in variable ‘c’ (line[13]).
  • In line [12] one can see that 8 bytes are being discarded from the stack as f() uses the cdecl calling convention and cleanup needs to happen in the caller function.

The Function

The function usually starts with a prolog and ends with an epilog.

The prolog is either a simple ENTER instruction or more commonly, it saves the ebp register and copies the stack pointer value in it so as to use it as frame pointer.

push        ebp
mov         ebp,esp

The epilog just reverses what the prolog has done.  It is implemented by a LEAVE instruction or as follows:

mov         esp,ebp
pop         ebp

The ret instruction passes control from the function to the return address stored in the stack.

Compilers provide a switch that allows one to omit the frame pointer (fpo or frame pointer omission) which effectively removes the prolog and epilog and uses the ebp register for other optimizations.  It is easy to demarcate function boundaries where the frame pointer is present but one should be aware that the entry and exit points may be absent.

Allocation Of Local Variables

Local variables are allocated on the stack.  The total size of the local variables is computed at compile time and at runtime those many bytes are reserved on the stack.

004113A0  push        ebp
004113A1  mov         ebp,esp
004113A3  sub         esp,0E8h

Note that in the example above, 232 (0xE8) bytes are being reserved for local variables.

The above code will also help in understanding why local variable allocation is much faster than requesting memory from heap.   Allocating local variables requires moving the stack pointer whereas memory heap management is much more complex.

As the return address and the local variable area are very close to each other, a buffer overflow occurs when data is written past the local variable area, which can then overwrite the return address.   When the return address is forced to point to shell code (or to some other code placed intentionally by an attacker), such a buffer overflow becomes an exploit.
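One common defence is to check the size before copying into a local buffer, so that a write can never run past the local variable area. The helper below is a minimal sketch of that idea; copy_bounded is a hypothetical name, not a standard library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies the NUL-terminated string `src` into `dst` only if it fits,
// terminator included. Returns false and leaves `dst` untouched otherwise.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    assert(dst != nullptr && src != nullptr);
    const std::size_t needed = std::strlen(src) + 1;  // include the terminator
    if (needed > dst_size) {
        return false;
    }
    std::memcpy(dst, src, needed);
    return true;
}
```

An unchecked copy into the same buffer would instead keep writing upwards in memory, towards the saved frame pointer and the return address.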

Local Variables And Parameters In Assembly

The most important part in assembly is to be able to identify access to locals and parameters.  The frame pointer that is set in the prolog, acts as the reference pointer using which all variables can be accessed.  If you add to the frame pointer (Remember : P for Plus and P for Parameters), you will be able to access parameters.  If you subtract from the frame pointer, you will be able to access local variables.

mov         byte ptr [ebp-20h],3
mov         byte ptr [ebp+20h],5

The first line above accesses a local variable whereas the second accesses a parameter.

In the case of frame pointer omission, everything is calculated with respect to the stack pointer.  Therefore [esp + 20h] might refer to a local or a parameter depending on where the stack pointer currently points to.  And if say a register is pushed on the stack, the same variable will now be referred using [esp + 24h].  Debugging functions that have optimized out the frame pointer is not that easy as the changes made to the stack pointer need to be constantly tracked.
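The two addressing rules can be written down as small helpers. This sketch assumes a 32-bit x86 function with the standard prolog (push ebp / mov ebp,esp), so the saved ebp sits at [ebp], the return address at [ebp+4] and the parameters from [ebp+8] upwards; the names are made up for this example.

```cpp
#include <cassert>

enum class FrameSlot { Local, SavedFramePointer, ReturnAddress, Parameter };

// What an [ebp + offset] operand refers to in a standard 32-bit frame.
FrameSlot classify_ebp_offset(int offset) {
    if (offset < 0) return FrameSlot::Local;              // below the frame pointer
    if (offset < 4) return FrameSlot::SavedFramePointer;  // [ebp]
    if (offset < 8) return FrameSlot::ReturnAddress;      // [ebp+4]
    return FrameSlot::Parameter;                          // [ebp+8] and above
}

// With the frame pointer omitted, every 4-byte push moves esp down, so the
// same variable is found at a larger offset from esp afterwards.
int esp_offset_after_pushes(int offset, int pushes) {
    assert(pushes >= 0);
    return offset + 4 * pushes;
}
```

For example, the variable at [esp + 20h] is found at [esp + 24h] after one push, which is the bookkeeping described in the paragraph above.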

Conclusion

Debugging in assembly is not only fun but a useful tool to debug difficult problems.  Different debuggers provide different interfaces to interact with the assembly code but under the hood, all programs work alike.   Understanding this is the key to debugging in assembly with ease.

Cross-platform Debugging Cheat Sheet

If you work on multiple platforms and use different debuggers, you are expected to know the debugger’s user interfaces well enough.  At times this gets confusing especially if you have one primary  platform and you work on other platforms rather infrequently.

I have compiled a list of my favorite features in a debugger and how to invoke them on different debuggers (Visual Studio, XCode, gdb and Windbg).

This is not a substitute for the debugger’s documentation but helpful for quickly switching to an unfamiliar debugging environment.  Click the image below for viewing the table or download the PDF version as a ready reference.

Debugger Cheatsheet

Debugger Cheat Sheet (Download PDF)

Please note :

  • The commands (especially for gdb) are not necessarily complete and the debugger’s help should be consulted for detailed usage.
  • The list is not comprehensive and I have only put in my favorite commands that I use while debugging.
  • A square bracket [ ] denotes a keyboard shortcut.

Debugging – Types Of Data Breakpoints In GDB

Data breakpoints are now becoming a part of common breakpoint vocabulary. They help in detecting heap corruption, inadvertent data overwrites and writing past buffer boundaries.

Most programmers restrict the definition of data breakpoints to breakpoints that halt the execution of code in the debugger when memory is written to. This is the kind of breakpoint that helps in catching most of the corruption bugs.

GDB provides data breakpoints (at least on Intel platforms) that do slightly more than that. As data breakpoints are implemented through hardware assistance, it is the hardware platform that provides the different kinds of data breakpoints and the debuggers provide the required interface.

watch

watch is gdb’s way of setting data breakpoints which will halt the execution of a program if memory changes at the specified location.

watch breakpoints can either be set on the variable name or any address location.

watch my_variable
watch *0x12345678
where 0x12345678 is a valid address.

Usually a crash because of heap corruption or invalid outcome due to buffer overruns shows up in the debugger when it is too late to figure out what went wrong. The watch or write data breakpoints can be used to find when a memory location has changed. The debugger at that very instant shows the reason for the inadvertent change.

The cause usually is double deletion of memory, writing to deleted memory, writing past the buffer boundary, etc. In order to fix such issues, it is more important to know when the corruption happens than to know what happens when the corruption has taken place.

Another interesting use of write data breakpoints is to find out the cause of memory leaks in reference counted objects by monitoring the increase and decrease in the reference count of the objects.

rwatch

rwatch (read-watch) breakpoints break the execution of code when the program tries to read from a variable or memory location.

rwatch iWasAccessed
rwatch *0x12345678
where 0x12345678 is a valid address.

For example, say you are new to your project and would like to figure out where exactly your encryption routines are in the code. As a start, you can search your codebase, but you may hit a few false positives. If you know the memory location of your password and you set your rwatch breakpoints correctly, it will not be long before the debugger breaks execution right in your encryption algorithms, which have to read the password in order to perform their function. Getting evil ideas already?

Another use of read data breakpoints is finding out the code that is reading from memory that has already been deleted and is using this corrupt information later.

awatch

awatch (access watch) breakpoints break execution of the program if a variable or memory location is either written to or read from. In short, an awatch is a watch and an rwatch in one. It is a handy way of creating one breakpoint rather than two separate ones.

awatch *0x12345678
where 0x12345678 is a valid address.

The problem with data breakpoints, like any other breakpoint, is that they can be triggered far too many times and the programmer may lose track of the problem being debugged. To ensure the breakpoints are hit as few times as possible, the data being worked upon should be minimal and the breakpoint should be set as late as possible.

Software Breakpoints

Line breakpoints can be implemented either in hardware or software.  This article discusses the latter in detail.

It is very useful to be able to break execution of code at a line number of your choice.  Breakpoints are provided in debuggers to do exactly that.  It is fun getting to the root of the problem by setting breakpoints in a debug session.  It is even more fun to know how breakpoints work in the first place.

Software breakpoints work by inserting a special instruction in the program being debugged.  This special instruction on the Intel platform is “int 3”.  When executed, it transfers control to the debugger’s exception handler.

Example

Let us look at a very simple example that inserts a breakpoint in a program at compile time and not through a debugger.  The code uses the Intel instruction “int 3” and you may need to figure out the equivalent instruction for a non-Intel platform.

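// The code below works well with Visual Studio
// (32-bit x86 builds; MSVC has no inline assembly for x64).
#include <stdio.h>

int main(void)
{
    __asm int 3;              // software breakpoint, opcode 0xCC
    printf("Hello World\n");
    return 0;
}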

// The code below works well with gcc + gdb
#include <stdio.h>

int main(void)
{
    asm("int $3");            // software breakpoint, opcode 0xCC
    printf("Hello World\n");
    return 0;
}


If you run this program in Visual Studio, you get a dialog saying “helloworld.exe has triggered a breakpoint”.

In gdb you get the message “Program received signal SIGTRAP, Trace/breakpoint trap.”

In the example above, executing “int 3” invokes the debugger’s exception handler.

It is also interesting to note the assembly instructions generated for the program above.

In Visual Studio, right click on the code and click on “Show Disassembly”. Also ensure that “Show Code Bytes” is on in the same context menu.

Visual Studio 2008 Disassembly

In gdb, type disassemble at the gdb prompt.

disassemble

0x0040107a <main+42>:   int3

Now obtain the opcode of the int3 instruction using the x (examine memory) command, asking for a single byte in hex:

(gdb) x/xb main+42

0x40107a <main+42>:     0xcc

As seen above, the breakpoint opcode we inserted during compilation is 0xCC.

How Do Debuggers Insert Breakpoints?

For practical reasons, it is unwise to ask for a recompilation whenever a breakpoint is added or deleted.  Debuggers change the loaded image of the executable in memory and insert the “int 3” instruction at runtime.  The common steps a debugger performs to provide the functionality of a line breakpoint to a user are as follows:

  1. When a user inserts a breakpoint in a line of code, the debugger saves the opcode at that given location and replaces it with 0xCC (int 3).
  2. When the program is run and it executes the “int 3” instruction, control is passed to the debugger’s exception handler.
  3. The debugger notifies the user that a breakpoint has been hit. Say that the user instructs the debugger to resume execution of the program.
  4. The debugger replaces the opcode 0xCC with the one it had saved earlier.  This is done to restore the instructions to their original state.
  5. The debugger then single steps the program.
  6. It then re-inserts the opcode 0xCC at the breakpoint location, keeping the original opcode saved.  If this step were not done, the breakpoint would be lost.  Temporary breakpoints, on the other hand, skip this step.
  7. The debugger then resumes execution of the program.

Hardware breakpoints are limited in number but debuggers are able to provide unlimited breakpoints by implementing them through software.

Knowing what goes on behind the scenes makes debugging a bit easier.  A debugger may defer setting a breakpoint if the module is not loaded in memory yet.  It needs to replace some opcode with 0xCC and that can happen only when the module is in memory.  Likewise, a mismatch between a binary, its sources and its debug symbols (or the lack of symbols) may cause breakpoints to be hit at unexpected locations because the debugger is not able to correctly map the source line to the opcode that it needs to replace with 0xCC.  At times debuggers complain about the mismatch and refuse to set the breakpoints.

Many of the setup issues with breakpoints become obvious once we know how they work internally.  And when all else fails and release build breakpoints adamantly refuse to work, you always have the option of compiling an “int 3” breakpoint right into your code.

Profiling with procexp (process explorer)

While running your program in memory intensive workflows, you may often run into a situation where the low memory condition starts to thrash the system.  Such a program usually exhibits a performance problem as it has consumed most of its available virtual memory.  Developers often use their favorite profilers to figure out the performance bottlenecks, though profiling such programs is difficult and sometimes impractical.  A lot of good profilers crash when collecting data in such situations.  They need additional memory and resources to collect and consolidate the data, which makes the situation even worse.

The good thing about a program that is thrashing a system is that it tends to be in the slow portion of the code for a long period of time.  So when it is bringing down the system, it is mostly executing the code that is responsible for the situation and all we want is to take a peek at the call stack at that very moment.  An obvious choice is to use a debugger or a profiler but given the low memory condition of the system, one may get little help from such tools.  When debugging or profiling becomes painfully slow, people may get evil ideas of reformatting their system or may start comparing their state of the art machines with ones they had five years back :).  This article describes a lightweight profiling trick where one can get the call stack of the unresponsive program without really loading the system further.

Procexp (Process Explorer) is a tool from Sysinternals (now part of Microsoft); both the download and the documentation are available on Microsoft’s site.

The tool allows one to view the call stack of any running program on the system.  Below are the steps needed to display a call stack of any running program.

  1. Launch procexp.
  2. In the process tree, find your process that is thrashing the system.
  3. Right click the process, choose “Properties” and select the “Threads” tab.

    procexp threads tab

  4. Sort the “Cycles Delta” (on Vista) or “CSwitch Delta” (on Windows XP) column in descending order and select the topmost thread.  For some programs there might be just one thread.
  5. For the selected thread click on the stack button to see the current call stack of the program.  Do note that this is a snapshot of the call stack and does not change dynamically.      

    procexp call stack

This call stack can provide good insight into the area of the code that is causing the system to stall.  In the example above functiondothis() is where the thread is spending the most time.  Take more than one sample to reconfirm the findings.  This is a very non-intrusive and lightweight method of getting a call stack of a running program.  The same trick can be used to debug a hang, but there a debugger works equally well.

Sometimes you don’t need heavy debugging tools and sometimes you just can’t use them.  Procexp is a nifty debugging utility (in addition to being a process explorer) that a developer should download and keep handy for times when nothing else works.