Tag Archives: windbg

Assembly And The Art Of Debugging

This article is an introduction to understanding Intel’s assembly language while debugging a computer program.

Debugging in assembly can be a daunting task and not every developer likes to make sense of the mnemonics that appear on the screen.

While debugging, one is normally used to viewing local variables, parameters  and syntax highlighted code in the most user friendly manner.  This is the primary reason why many developers dislike debugging in assembly initially.

Why Debug In Assembly?

Debugging in assembly is not an optional skill.  Every developer eventually encounters a situation where there is no alternative to cracking open the assembly code.  Here are a few reasons why a developer should get their hands dirty with this skill:

  1. You have a binary and its corresponding code but the build environment is not functional.  This could be because of missing tools, unavailable compilers or the lack of understanding of complex scripts and steps needed to recreate the binary.
  2. You are dealing with a release build only bug and the debug symbols are unavailable or you are dealing with a third party library and don’t have access to its source code.
  3. You like to have fun :-)

What You Need?

  1. A debugger that can display assembly language (Visual Studio, WinDbg, gdb, XCode are capable debuggers).
  2. A ready reference to the Intel instruction set.  Google works just fine.
  3. Patience and maybe a link to this article.  :-)

The key to debugging in assembly is to understand how functions are called, how parameters are passed and how local variables are accessed.  The code flow and logic can then be understood by looking up individual instructions in the Intel instruction set reference.

The diagram below shows roughly how the stack should look when a function is called.  Mapping the assembly code to this diagram should simplify the debugging experience.  Note that the stack grows from top to bottom, that is, towards lower addresses.

[Figure: assembly_stack]

Passing Parameters To A Function

Parameters are the first to be pushed onto the stack.  The caller of the function pushes them from right to left.

Cleanup Of Parameters

Parameter cleanup may happen in the function that was called (e.g. the stdcall calling convention) or in the caller of the function (e.g. cdecl).  The various calling conventions of the x86 architecture are explained in detail in this wiki article: http://en.wikipedia.org/wiki/X86_calling_conventions  The presence or absence of stack cleanup code right after the function call gives a good idea of the calling convention being used.

Take note that if cleanup happens in the function being called, a variable number of arguments cannot be passed to it, because the callee cannot know at compile time how many bytes to remove.  In contrast, a variable number of arguments can be supported if the parameter cleanup happens in the caller.  This is essentially the main difference between the stdcall and cdecl calling conventions on an x86 architecture.
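
To make the variable-argument point concrete, here is a minimal C++ sketch.  The function name sum_ints is made up for illustration.  Only the caller knows how many arguments it pushed for a given call, so only the caller can emit the matching cleanup (for example add esp,N); that is why variadic functions are typically compiled with caller cleanup on 32-bit x86.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments.
// The callee cannot know at compile time how many bytes were pushed,
// so on 32-bit x86 a variadic function like this is typically compiled
// with caller cleanup (cdecl style) rather than callee cleanup (stdcall).
int sum_ints(int count, ...)
{
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   // read the next pushed argument
    }
    va_end(args);
    return total;
}
```

Calling sum_ints(3, 1, 2, 3) and sum_ints(5, 10, 20, 30, 40, 50) from the same program and comparing the instruction after each call in the disassembly is a quick way to see the caller adjusting esp by a different amount each time.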

The Function Call

Before control is passed to the function, the return address (where the caller is supposed to resume after the function completes execution) is pushed on the stack.

To recap, parameters are pushed first, followed by the return address.  Later, depending on the calling convention of the function, you may see cleanup happening right after the function call or within the function itself.

In the stack diagram above, as the stack grows downwards, the parameters are at the top followed by the return address.

Example

Take a look at the snippet and the equivalent assembly below :

[1]int a = 4;
[2]mov         dword ptr [a],4
[3]
[4]char c = 0;
[5]mov         byte ptr [c],0
[6]
[7]c = f(a, 22);
[8]push        16h
[9]mov         eax,dword ptr [a]
[10]push        eax
[11]call        f (4111D1h)
[12]add         esp,8
[13]mov         byte ptr [c],al
  • In line [8], the constant value 22 (16 hex) is pushed on the stack.  It is the rightmost argument, so it is pushed first.
  • In lines [9-10], the variable ‘a’ is pushed.
  • Line [11] is the call to f(), which implicitly pushes the return address.
  • In line [12], 8 bytes are discarded from the stack: f() uses the cdecl calling convention, so cleanup happens in the caller.
  • The return value of the function is passed in the al register and copied into the variable ‘c’ (line [13]).
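
The bookkeeping in the listing above can be replayed with a toy model.  The sketch below is not real machine state; ToyStack and replay_cdecl_call are hypothetical names, the return address value is made up, and every stack slot is assumed to be 4 bytes wide.  It only shows that after the pushes, the call, the return and the add esp,8, the stack pointer is back where it started.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack: just bookkeeping, not real machine state.
struct ToyStack {
    std::uint32_t esp;                   // simulated stack pointer
    std::vector<std::uint32_t> slots;    // pushed values, most recent last

    explicit ToyStack(std::uint32_t initial_esp) : esp(initial_esp) {}

    // The stack grows towards lower addresses, so a push subtracts 4.
    void push(std::uint32_t value) { esp -= 4; slots.push_back(value); }

    std::uint32_t pop() {
        assert(!slots.empty());
        std::uint32_t value = slots.back();
        slots.pop_back();
        esp += 4;
        return value;
    }

    // Models "add esp,N": discards N bytes of arguments without reading them.
    void discard(std::uint32_t bytes) {
        assert(bytes % 4 == 0 && bytes / 4 <= slots.size());
        slots.resize(slots.size() - bytes / 4);
        esp += bytes;
    }
};

// Replays lines [8]-[12] of the listing and returns the final esp.
std::uint32_t replay_cdecl_call(std::uint32_t initial_esp, std::uint32_t a)
{
    ToyStack stack(initial_esp);
    stack.push(0x16);         // [8]  push 16h    (second argument, 22)
    stack.push(a);            // [10] push eax    (first argument, a)
    stack.push(0xDEADBEEF);   // [11] call f      (pushes a return address; value is made up)
    stack.pop();              //      ret in f()  (pops the return address)
    stack.discard(8);         // [12] add esp,8   (caller removes both arguments)
    return stack.esp;
}
```

With a stdcall callee, the discard step would instead happen inside the function as part of its return, and nothing would follow the call instruction in the caller.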

The Function

The function usually starts with a prolog and ends with an epilog.

The prolog is either a simple ENTER instruction or more commonly, it saves the ebp register and copies the stack pointer value in it so as to use it as frame pointer.

push        ebp
mov         ebp,esp

The epilog just reverses what the prolog did.  It is implemented either by a LEAVE instruction or as follows:

mov         esp,ebp
pop         ebp

The ret instruction passes control from the function to the return address stored in the stack.

Compilers provide a switch that allows one to omit the frame pointer (FPO, or frame pointer omission), which effectively removes this prolog and epilog and frees the ebp register for other uses.  It is easy to demarcate function boundaries when the frame pointer is present, but one should be aware that these entry and exit markers may be absent.

Allocation Of Local Variables

Local variables are allocated on the stack.  The total size of the local variables is computed at compile time, and at runtime that many bytes are reserved on the stack by subtracting from the stack pointer.

004113A0  push        ebp
004113A1  mov         ebp,esp
004113A3  sub          esp,0E8h

Note that in the example above, 232 (0xE8) bytes are being reserved for local variables.

The above code also helps in understanding why local variable allocation is much faster than requesting memory from the heap.  Allocating local variables only requires moving the stack pointer, whereas heap management is much more complex.

As the return address and the local variable area are very close to each other, a buffer overflow occurs if data is written past the end of the local variable area, and this can overwrite the return address.  When the return address is forced to point to shell code (or to some other code placed intentionally by an attacker), such a buffer overflow is termed an exploit.

Local Variables And Parameters In Assembly

The most important part in assembly is to be able to identify accesses to locals and parameters.  The frame pointer that is set in the prolog acts as the reference point from which all variables can be accessed.  If you add to the frame pointer (remember: P for Plus and P for Parameters), you access parameters.  If you subtract from the frame pointer, you access local variables.

mov         byte ptr [ebp-20h],3
mov         byte ptr [ebp+20h],5

The first line above accesses a local variable whereas the second accesses a parameter.

In the case of frame pointer omission, everything is calculated with respect to the stack pointer.  Therefore [esp + 20h] might refer to a local or a parameter, depending on where the stack pointer currently points.  And if, say, a register is then pushed on the stack, the same variable will be referred to as [esp + 24h].  Debugging functions whose frame pointer has been optimized out is harder, as the changes made to the stack pointer need to be tracked constantly.
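
The offset arithmetic can be written down as a small sketch.  The helper names below are hypothetical, and the code assumes the standard push ebp / mov ebp,esp prolog on 32-bit x86 with every parameter 4 bytes wide, so that the saved ebp sits at [ebp], the return address at [ebp+4] and the first parameter at [ebp+8].

```cpp
#include <cassert>

// With the standard prolog (push ebp / mov ebp,esp) on 32-bit x86:
//   [ebp]     saved ebp of the caller
//   [ebp+4]   return address
//   [ebp+8]   first parameter, and so on upwards
// Returns the ebp-relative offset of a 4-byte parameter (0-based index).
int param_offset_from_ebp(int index)
{
    assert(index >= 0);
    return 8 + 4 * index;
}

// With the frame pointer omitted, offsets are esp-relative and shift
// by 4 for every 4-byte push made after the offset was computed.
int esp_offset_after_pushes(int offset_before, int pushes_of_4_bytes)
{
    assert(pushes_of_4_bytes >= 0);
    return offset_before + 4 * pushes_of_4_bytes;
}
```

For instance, esp_offset_after_pushes(0x20, 1) gives 0x24, which is the [esp + 20h] to [esp + 24h] shift described above.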

Conclusion

Debugging in assembly is not only fun but a useful tool to debug difficult problems.  Different debuggers provide different interfaces to interact with the assembly code but under the hood, all programs work alike.   Understanding this is the key to debugging in assembly with ease.

Cross-platform Debugging Cheat Sheet

If you work on multiple platforms and use different debuggers, you are expected to know the debugger’s user interfaces well enough.  At times this gets confusing especially if you have one primary  platform and you work on other platforms rather infrequently.

I have compiled a list of my favorite features in a debugger and how to invoke them on different debuggers (Visual Studio, XCode, gdb and Windbg).

This is not a substitute for the debugger’s documentation but helpful for quickly switching to an unfamiliar debugging environment.  Click the image below for viewing the table or download the PDF version as a ready reference.

Debugger Cheatsheet

Debugger Cheat Sheet (Download PDF)

Please note :

  • The commands (especially for gdb) are not necessarily complete and the debugger’s help should be consulted for detailed usage.
  • The list is not comprehensive and I have only put in my favorite commands that I use while debugging.
  • A square bracket [ ] denotes a keyboard shortcut.

Temporary Breakpoint – Now You See It, Now You Don’t

Have you faced the problem of breakpoint clutter where breakpoints keep piling up only to hinder the debugging session?  It is then that one realizes that there are some breakpoints that can be deleted and others disabled.

A useful feature in a debugger is a temporary breakpoint that automagically gets deleted when hit thereby reducing the clutter of unnecessary breakpoints.  These breakpoints are useful when you wish to stop at a code location only once and do not require the execution to stop at that location ever again.

For example, say you are trying to determine whether a particular test scenario reaches a specific line of code.  A temporary breakpoint fits well here, because the breakpoint is no longer useful once it has been hit at least once.

Below are steps on how to set temporary breakpoints in various debuggers.

gdb

Use the tb command to set a temporary breakpoint in gdb.  It is similar to the break command but the breakpoint will automatically be deleted when hit.

(gdb)help tb
Set a temporary breakpoint.
Like “break” except the breakpoint is only temporary,
so it will be deleted when hit.  Equivalent to “break” followed
by using “enable delete” on the breakpoint number.

Windbg

In Windbg, temporary breakpoints can be created in the Command window using the bp /1 command.  The /1 tells Windbg that the breakpoint should be deleted when hit.

In Windbg temporary breakpoints are also known as “one shot breakpoints”.

Visual Studio

I found it a bit painful to create temporary breakpoints in Visual Studio.  The only way I could create one was by setting a breakpoint and then setting its hit count condition so that it breaks when the hit count is equal to 1.  The hit count article further down this page explains how to set a hit count in Visual Studio.

The amount of work involved to do this sometimes doesn’t make temporary breakpoints worthwhile to set.  Moreover the breakpoint lingers on and doesn’t actually get deleted when hit.

Debugging – Using Breakpoint Hit Count For Fun And Profit.

If you are already familiar with hit count breakpoints, you may want to jump straight to the advanced tricks in the section “Using Hit Count For Fun And Profit”.

What is the hit count of a breakpoint?

A debugger allows users to set a breakpoint at a specific line in code.  When the execution reaches that line, the breakpoint is said to have been *hit* and the execution of the program being debugged is suspended.

Internally the debugger also keeps a count of the number of times the breakpoint has been hit.  This is called the hit count of a breakpoint.  Debuggers allow users to set conditions based on the hit count of the breakpoint.  For example, you can specify that the execution of the program should only be suspended when the hit count is greater than or equal to 250. To put it in other words, the breakpoint will be skipped for the first 249 times it is hit.
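
The "greater than or equal to 250" example can be modelled in a few lines.  This is only a toy sketch of the bookkeeping, not how any particular debugger is implemented; HitCountBreakpoint and skipped_before_first_stop are hypothetical names.

```cpp
#include <cassert>

// Toy model of a hit count breakpoint (illustrative only).
struct HitCountBreakpoint {
    unsigned long hits = 0;       // times execution reached the breakpoint
    unsigned long break_at = 1;   // suspend when hits >= break_at

    // Called each time execution reaches the breakpoint's line.
    // Returns true if the debugger would suspend the program.
    bool on_reached() {
        ++hits;                   // the count is tracked even when skipping
        return hits >= break_at;
    }
};

// Returns how many times the breakpoint is skipped before the first stop,
// assuming the line is reached at least `break_at` times.
unsigned long skipped_before_first_stop(unsigned long break_at)
{
    assert(break_at >= 1);
    HitCountBreakpoint bp;
    bp.break_at = break_at;
    unsigned long skipped = 0;
    while (!bp.on_reached()) {
        ++skipped;
    }
    return skipped;
}
```

With a threshold of 250, the model skips the breakpoint 249 times and stops on the 250th, matching the description above.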

The advantage of being able to set a condition with the hit count of a breakpoint is to make the process of debugging faster.

 

How can hit count based breakpoints be set?

Debuggers today have either a command line or a graphical user interface.  Almost all debuggers provide a means to set hit count based breakpoints.  Below are steps for setting such breakpoints in some of the debuggers I have used.

Microsoft Visual Studio 2005

  1. Set a breakpoint at a line in your code.
  2. Right click the breakpoint and then click on “Hit Count”.  You can also go to Debug -> Windows -> Breakpoints  and right click on the breakpoint that was just created and select “Hit Count”.
  3. In the dialog that pops up, you can choose from four ways of controlling the breakpoint based on its hit count.  The default is to ignore the hit count and suspend the program always when the breakpoint is hit. It is good to take note of the other three options.
Hit Count Window In Visual Studio 2005

When the program is in suspended mode, one can see the current hit count of the breakpoint in the breakpoint window.  In the image below, the “Hit Count” column shows the current hit count of the breakpoints.

Visual Studio Breakpoint Window

gdb

In gdb, the command continue is used to resume execution of the suspended program.  When it is followed by a number N, the breakpoint at which the program is currently stopped is ignored until the Nth time it is reached.

(gdb) help continue
Continue program being debugged, after signal or breakpoint.
If proceeding from breakpoint, a number N may be used as an argument,
which means to set the ignore count of that breakpoint to N – 1 (so that
the breakpoint won’t break until the Nth time it is reached).
(gdb) continue 20

gdb is available on Mac OSX, Linux, AIX, Solaris, HPUX, Cygwin on Windows and more, so this is one command worth learning by heart.

On Mac OSX, the XCode IDE uses gdb internally and allows access to it through the menu (Debug -> Console Log).  Through the command line interface continue can be used as described above.

In gdb the info breakpoints command can be used to view the current hit count of all breakpoints.

(gdb) info breakpoints
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x0040118a in main at try.cpp:6
breakpoint already hit 246 times
3   breakpoint     keep y   0x004011a5 in main at try.cpp:8

Visual Studio and gdb differ slightly in terminology.  One allows setting breakpoints with a hit count and the other allows skipping a breakpoint a certain number of times.  However, they are essentially the same feature: both give the programmer the option of not stopping every time a breakpoint is reached.  In the subsequent sections, the term “set a hit count breakpoint” is used instead of “skip the breakpoint n times”.  It should be trivial to interpret the tricks in terms of skipping a breakpoint.
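
The "skip" view can be sketched the same way.  The toy model below follows the wording of the gdb help text quoted earlier (continue N sets the ignore count to N - 1); it is not gdb's actual implementation, and the names are hypothetical.

```cpp
#include <cassert>

// Toy model of gdb-style "ignore count" semantics (illustrative only).
struct IgnoreCountBreakpoint {
    unsigned long ignore_count = 0;   // crossings still to be skipped

    // Returns true if execution would stop at this crossing.
    bool on_reached() {
        if (ignore_count > 0) {
            --ignore_count;
            return false;
        }
        return true;
    }
};

// "continue N" sets the ignore count to N - 1.  Returns the 1-based
// crossing at which execution next stops.
unsigned long crossing_where_continue_stops(unsigned long n)
{
    assert(n >= 1);
    IgnoreCountBreakpoint bp;
    bp.ignore_count = n - 1;
    unsigned long crossing = 1;
    while (!bp.on_reached()) {
        ++crossing;
    }
    return crossing;
}
```

So "skip the breakpoint 19 times" and "stop on the 20th hit" describe the same behaviour, which is the translation between the two vocabularies.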

 

WinDbg

Here is how you set a hit count breakpoint in WinDbg.

  1. Go to the source view and set a breakpoint in the source code.  The shortcut F9 can be used to toggle a breakpoint.
  2. In the command window (alt + 1), list all breakpoints using the bl command.
  3. Take note of the breakpoint that you just set and copy the location of the breakpoint which is listed in the format of <module_name>!<function_name>+<offset>.  See example below.
  4. Now redefine the breakpoint with the bp command.  After the bp command paste the location that you copied in the previous step followed by the hit count.

0:000> bl

1 e x86 00000000`004113b2     0001 (0001)  0:**** test_project!wmain+0x42
0:000> bp test_project!wmain+0x42 2300
breakpoint 1 redefined
0:000> bl
0 e x86 00000000`004113b2     2300 (2300) 0:**** test_project!wmain+0x42

The hit count in the above example is set to 2300.  The current hit count shown above is decremented each time the breakpoint is reached, and execution stops only when this number is equal to 1.  The number within the parentheses denotes the hit count that was originally set by the user.

Using Hit Count For Fun And Profit

Many developers set breakpoints without hit count conditions.  There are a lot of nifty ways in which a hit count breakpoint can be used.

Below are some scenarios which developers will find useful while using hit count breakpoints:

Break In A Loop More Conveniently.

Setting an unconditional breakpoint in a loop (e.g. for, while, do-while) may break execution more often than needed.  If you know the iteration of the loop when you want to suspend execution of the program, you can set a hit count breakpoint.

For example, in the while loop below, if the intention is to break in the 21st iteration, a hit count based breakpoint will be more useful and simpler than a conditional one.  Do note that in the loop below, the variable i does not increment by one.

<code>int i = 1;  /* must start non-zero, otherwise doubling never changes it */
while( !flag && i < N )
{
/* some code */
i *= 2;
}</code>
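
To see why the hit count is the simpler choice here, consider what a conditional breakpoint would have to test.  The sketch below assumes the loop starts at i = 1 and doubles i on every pass; value_at_iteration is a hypothetical helper that computes the value of i at the start of a given iteration.

```cpp
#include <cassert>
#include <cstdint>

// Returns the value of i at the start of the given (1-based) iteration
// of a loop that starts at i = 1 and doubles i each time round.
std::uint64_t value_at_iteration(unsigned iteration)
{
    assert(iteration >= 1 && iteration <= 64);
    std::uint64_t i = 1;
    for (unsigned n = 1; n < iteration; ++n) {
        i *= 2;
    }
    return i;
}
```

Under these assumptions the 21st iteration corresponds to i == 1048576, so a conditional breakpoint needs that value worked out in advance, whereas a hit count breakpoint only needs the number 21.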

Likewise, the for-loop below traverses an int vector using an iterator.  If the intent is to break when the 10th element in the vector is being processed in the loop, then a hit count breakpoint will be more useful and easier to set.

<code>std::vector<int>::iterator iter;
for( iter = vec.begin() ; iter != vec.end() ; ++iter )
{
/* some code */
}</code>

 

Create A Quick And Dirty Profiler And Much More.

 

Part I

Profilers that instrument code log the time taken by a function and the number of times it is called.  It is for the latter that hit count breakpoints are very useful.  The greatest advantage of tracking the number of times a function is called this way is that you don’t have to run the code through a profiler, yet you get the call count with the same accuracy.  Moreover, profilers may crash at times, whereas debuggers are pretty stable when it comes to debugging code.

The trick here is to set a hit count breakpoint that will never be reached.  For example, set the hit count to an impractically large value (say 1000000) and set a second breakpoint at program termination (for example at the end of the main() function).

When the program is run, due to the large hit count, the first breakpoint will never suspend the program and only the breakpoint at the end of the program will.  The debugger, however, has no knowledge that the hit count is too large to ever be reached, and therefore updates the count whenever execution reaches the breakpoint.

At program termination, when the program gets suspended due to your second breakpoint, the debugger is waiting to tell you the current hit count of the first breakpoint.  In other words, it tells you how many times that line of code was reached before the program terminated.  That is exactly the kind of information that a profiler would have given you.  Voila, you have that quick and dirty profiler ready for use :-).

Maybe someday I will write about how a breakpoint works internally, and then you can see what a debugger and a code-instrumenting profiler have in common.

The above trick is explained in the C code snippet below.

<code>
void profile_me()
{
/* set hit count breakpoint here with a very large hit count */
/* function code */
}</code>

<code>
int main()
{
profile_me();
/* Set the second breakpoint here and when this is hit,*/
/* observe the hit count of the breakpoint set above */
return 0;
}</code>
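
If you want to convince yourself that the debugger's number is right, a hand-written counter gives something to compare against.  This is only a cross-check sketch: g_profile_me_calls and run_and_count are hypothetical names, and the whole point of the trick is that the debugger provides the same figure without modifying the code.

```cpp
#include <cassert>

// Hand-written equivalent of what the debugger's hit count gives for free:
// a counter bumped every time the function body runs.
static unsigned long g_profile_me_calls = 0;   // hypothetical name

void profile_me()
{
    ++g_profile_me_calls;   // the "never hit" breakpoint would sit on this line
    /* function code */
}

// Calls profile_me() `times` times and returns the counter, which should
// equal the hit count the debugger shows at the final breakpoint.
unsigned long run_and_count(unsigned long times)
{
    g_profile_me_calls = 0;
    for (unsigned long n = 0; n < times; ++n) {
        profile_me();
    }
    assert(g_profile_me_calls == times);   // sanity check on the bookkeeping
    return g_profile_me_calls;
}
```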

Part II – Smart Breakpoints

Another use of hit count breakpoints is very similar to the quick and dirty profiler trick.  At times, when one encounters a crash in a loop or in a repeated function call, it may make more sense to debug a few iterations prior to when the crash actually happens.  For example, say a loop is processing tokens and a crash happens while processing the 2520th token.  The crash itself may not make much sense once it has occurred, but it may help to know what happened 5 iterations prior to the crash.  That way, the programmer can collect data for the prior iterations and then reach the crash condition.  This will equip the programmer with the relevant data needed to solve the crash at hand.

<code>while( (token = get_token()) )  /* assignment is intentional */
{
/* some code */
switch( token )
{
case token_1: /* do code */ break;
case token_2: /* do code */ break;
/* more case statements */
}
}</code>

The trick here again is to set a very large hit count so that the breakpoint is never hit.  Once the crash occurs, the hit count of the breakpoint is noted.  Then the hit count of the breakpoint is reset to the hit count obtained when the crash occurred minus 5 (2515 in the example above).  From now on, whenever the hit count condition is met and the breakpoint is hit, the programmer will know that a crash is expected in 5 iterations.  The data collected for those 5 iterations may be essential for resolving the crash.
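
The arithmetic is small enough to write as a one-line helper.  The function name below is hypothetical; it assumes the breakpoint sits where it is reached once per iteration, so that the hit count noted at the crash equals the number of the crashing iteration.

```cpp
#include <cassert>

// Given the hit count observed at the crash and how many iterations of
// lead time are wanted, returns the hit count to set for the next run.
unsigned long hit_count_before_crash(unsigned long hits_at_crash,
                                     unsigned long lead)
{
    assert(lead < hits_at_crash);   // cannot stop before the first hit
    return hits_at_crash - lead;
}
```

For the token example, hit_count_before_crash(2520, 5) gives 2515, so the next run stops while processing the 2515th token.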

 

Part III – Matching calls.

Hit count breakpoints have yet another use in debugging: matching the call count for a pair of functions.  For example, for every malloc call a free call should be made in order to have zero memory leaks.  Similarly, a constructor (for now assume there is only one) and the destructor of a class should be called an equal number of times.  These calls have opposite effects, and their counts should match to ensure that resources don’t leak.

<code>
C::C()
{
/* constructor */
}</code>

<code>C::~C()
{
/* destructor */
}
</code>

The trick is to set two hit count breakpoints with very large values that will never be reached in both the constructor and destructor above.  Also a breakpoint should be set at the point of program termination (for example at the end of function main() ).  The two breakpoints in the constructor and destructor will not be reached due to the very large values.  When the program’s execution is suspended at program termination due to the final breakpoint, the hit count of the two breakpoints set in the constructor and destructor should be checked and hopefully their hit counts should match.  Here I am assuming the class C was not involved in creating global or static objects.  A mismatch of hit counts may suggest that not all objects of class C were destroyed and a possible resource leak should be looked into.

In summary, if there are two calls that should be made an equal number of times during the life span of a program, then this trick can be used to check that their hit counts do indeed match.
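
The same check can be expressed in code, which is handy for comparing against what the debugger reports.  In the sketch below the class Tracked is a hypothetical stand-in for class C, and the two static counters play the role of the two never-hit breakpoints.  It also shows why the single-constructor assumption matters: a copy constructor is another construction path and would need its own breakpoint.

```cpp
#include <cassert>

// Counts constructions and destructions of a class.
class Tracked {            // hypothetical stand-in for class C
public:
    Tracked() { ++constructed; }
    Tracked(const Tracked&) { ++constructed; }   // copies are constructions too
    ~Tracked() { ++destroyed; }

    static unsigned long constructed;
    static unsigned long destroyed;
};

unsigned long Tracked::constructed = 0;
unsigned long Tracked::destroyed = 0;

// Creates and destroys 2 * n objects, then reports whether the counts match.
bool counts_match_after(unsigned long n)
{
    for (unsigned long i = 0; i < n; ++i) {
        Tracked t;          // default constructed
        Tracked copy(t);    // copy constructed
        (void)copy;         // silence unused-variable warnings
    }                       // both objects are destroyed at the end of each pass
    assert(Tracked::constructed >= Tracked::destroyed);
    return Tracked::constructed == Tracked::destroyed;
}
```

If the copy constructor did not bump the counter, the destructor count would come out twice as large as the constructor count, which is the kind of false alarm a missing breakpoint would produce in the debugger.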

 

Final Note

Hit count is a slightly underused feature of a debugger, but it can be used in many innovative ways to gain better control over debugging.  It is not a replacement for a profiler but a great tool when you do not have one at hand.  The infinite-hit-count breakpoints are useful for keeping track of code workflow, as these breakpoints are set with the intention of never stopping at them.  The information that such breakpoints provide can nevertheless be pretty useful and accurate.