8.5. Functions in Assembly
In the last several sections, we traced through simple functions in assembly. In this section, we discuss the interaction between multiple functions in assembly in the context of a larger program. We also introduce some new instructions involved with function management.
Let’s begin with a refresher on how the call stack is managed. Recall that %esp
is the stack pointer and always points to the top of the stack. The register %ebp
represents the base pointer (also known as the frame pointer) and points to the base
of the current stack frame. The stack frame (also known as the activation frame or
the activation record) refers to the portion of the stack allocated to a single function
call. The currently executing function is always at the top of the stack, and its stack
frame is referred to as the active frame. The active frame is bounded by the stack pointer
(at the top of stack) and frame pointer (at the bottom of the frame). The activation record
typically holds local variables and parameters for a function.
Figure 1 shows the stack frames for main() and a function it calls named fname().
Going forward, we will refer to the main() function as the caller function and fname() as
the callee function.

In Figure 1, the current active frame belongs to the callee function (fname()). The memory between
the stack pointer and the frame pointer is used for local variables. The stack pointer moves as local
values are pushed onto and popped from the stack. In contrast, the frame pointer remains relatively constant,
pointing to the beginning (i.e., the bottom) of the current stack frame. As a result, compilers like gcc commonly reference
values on the stack relative to the frame pointer. In Figure 1, the active frame is bounded below by the base pointer of fname(),
which contains stack address 0x418. The value stored at this address is the "saved" %ebp value (0x42c),
which itself indicates the bottom of the activation frame for the main() function. The top of the activation frame of main()
is bounded by the return address, which indicates the program address at which main()
resumes execution once the callee function exits.
The return address points to program memory, not stack memory
Recall that the call stack region (stack memory) of a program is different from its code region (code memory).
The return address saved on the stack is therefore an address in code memory, not stack memory.
Figure 2. The parts of a program’s address space.
Table 1 contains several additional instructions that the compiler uses for basic function management:
Instruction | Translation |
---|---|
leave | Prepares the stack for leaving a function. Equivalent to: mov %ebp, %esp followed by pop %ebp |
call addr | Switches the active frame to the callee function. Equivalent to: push %eip followed by mov addr, %eip |
ret | Restores the active frame to the caller function. Equivalent to: pop %eip |
For example, the leave
instruction is a shorthand that the compiler uses to restore the
stack and frame pointers as it prepares to leave a function. When the callee function
finishes execution, leave
ensures that the frame pointer is restored to its previous value.
The call
and ret
instructions play a prominent role in the process where one function calls another.
Both instructions modify the instruction pointer (register %eip
). When the caller function executes
the call
instruction, the current value of %eip
is saved on the stack to represent the return address,
or the program address at which the caller resumes executing once the callee function finishes.
The call
instruction also replaces the value of %eip
with the address of the callee function.
The ret
instruction restores the value of %eip
to the value saved on the stack, ensuring that the program
resumes execution at the program address specified in the caller function. Any value returned by the callee is
stored in %eax
. The ret
instruction is usually the last instruction that executes in any function.
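The equivalences in Table 1 can be sketched as a toy model in C. This is a minimal sketch under stated assumptions, not a description of how the hardware is built: the Machine struct, the function names, the stack size, and the simulated address range are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 32-bit machine: three registers and a small simulated stack.
 * All names and sizes here are illustrative assumptions. */
#define STACK_BASE  0x100u  /* lowest simulated stack address */
#define STACK_WORDS 64u     /* 64 four-byte slots: 0x100..0x1fc */

typedef struct {
    uint32_t eip, esp, ebp;
    uint32_t mem[STACK_WORDS];
} Machine;

/* Map a stack address to its slot; it must be aligned and in range. */
uint32_t *slot(Machine *m, uint32_t addr) {
    assert(addr % 4 == 0);
    assert(addr >= STACK_BASE && addr < STACK_BASE + 4 * STACK_WORDS);
    return &m->mem[(addr - STACK_BASE) / 4];
}

void push(Machine *m, uint32_t value) {
    m->esp -= 4;                 /* the stack grows toward lower addresses */
    *slot(m, m->esp) = value;
}

uint32_t pop(Machine *m) {
    uint32_t value = *slot(m, m->esp);
    m->esp += 4;
    return value;
}

/* call addr: push %eip (the return address), then mov addr, %eip.
 * Assumes m->eip already holds the address of the instruction after the call. */
void do_call(Machine *m, uint32_t addr) {
    push(m, m->eip);
    m->eip = addr;
}

/* leave: mov %ebp, %esp followed by pop %ebp. */
void do_leave(Machine *m) {
    m->esp = m->ebp;
    m->ebp = pop(m);
}

/* ret: pop %eip. */
void do_ret(Machine *m) {
    m->eip = pop(m);
}

/* Walk through one call/return cycle and report whether the caller's
 * registers come back unchanged. */
int call_cycle_restores_caller(Machine *m, uint32_t callee, uint32_t locals_bytes) {
    uint32_t eip0 = m->eip, esp0 = m->esp, ebp0 = m->ebp;
    do_call(m, callee);
    push(m, m->ebp);             /* callee prologue: push %ebp      */
    m->ebp = m->esp;             /*                  mov %esp, %ebp */
    m->esp -= locals_bytes;      /*                  sub $N, %esp   */
    do_leave(m);
    do_ret(m);
    return m->eip == eip0 && m->esp == esp0 && m->ebp == ebp0;
}
```

To try it, initialize a Machine with stack addresses inside the simulated range and call call_cycle_restores_caller() from a main() function. The model stores only stack words, which reflects the point that the return address is a value kept on the stack that refers to code elsewhere.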
8.5.1. Tracing through an Example
Using our knowledge of function management, let’s trace through the code example first introduced at the beginning of this chapter:
#include <stdio.h>

int assign(){
    int y = 40;
    return y;
}

int adder(){
    int a;
    return a + 2;
}

int main(){
    int x;
    assign();
    x = adder();
    printf("x is: %d\n", x);
    return 0;
}
Compile the code with the -m32 flag and use objdump -d to view the underlying
assembly. The latter command produces a large amount of output, most of which
we don’t need. Use less and its search functionality to locate the
adder, assign, and main functions:
0804840d <assign>:
 804840d:  55                      push   %ebp
 804840e:  89 e5                   mov    %esp,%ebp
 8048410:  83 ec 10                sub    $0x10,%esp
 8048413:  c7 45 fc 28 00 00 00    movl   $0x28,-0x4(%ebp)
 804841a:  8b 45 fc                mov    -0x4(%ebp),%eax
 804841d:  c9                      leave
 804841e:  c3                      ret

0804841f <adder>:
 804841f:  55                      push   %ebp
 8048420:  89 e5                   mov    %esp,%ebp
 8048422:  83 ec 10                sub    $0x10,%esp
 8048425:  8b 45 fc                mov    -0x4(%ebp),%eax
 8048428:  83 c0 02                add    $0x2,%eax
 804842b:  c9                      leave
 804842c:  c3                      ret

0804842d <main>:
 804842d:  55                      push   %ebp
 804842e:  89 e5                   mov    %esp,%ebp
 8048433:  83 ec 20                sub    $0x14,%esp
 8048436:  e8 d2 ff ff ff          call   804840d <assign>
 804843b:  e8 df ff ff ff          call   804841f <adder>
 8048440:  89 44 24 1c             mov    %eax,0xc(%esp)
 8048444:  8b 44 24 1c             mov    0xc(%esp),%eax
 8048448:  89 44 24 04             mov    %eax,0x4(%esp)
 804844c:  c7 04 24 f4 84 04 08    movl   $0x80484f4,(%esp)
 8048453:  e8 88 fe ff ff          call   80482e0 <printf@plt>
 8048458:  b8 00 00 00 00          mov    $0x0,%eax
 804845d:  c9                      leave
 804845e:  c3                      ret
Each function begins with a symbolic label that corresponds to its declared name in the
program. For example, <main>:
is the symbolic label for the main()
function.
The address of a function label is also the address of the first instruction in that function.
To save space in the figures below, we truncate addresses to the lower 12 bits.
So, program address 0x804842d
is shown as 0x42d
.
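The truncation used in the figures amounts to masking off everything except the low 12 bits (three hex digits) of an address. The following is a small sketch of that arithmetic; the helper names low12 and check_low12 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 12 bits (three hex digits) of a program address. */
uint32_t low12(uint32_t addr) {
    return addr & 0xfffu;
}

/* Self-check against the function addresses that appear in the listing. */
void check_low12(void) {
    assert(low12(0x804842du) == 0x42du);  /* main   */
    assert(low12(0x804840du) == 0x40du);  /* assign */
    assert(low12(0x804841fu) == 0x41fu);  /* adder  */
}
```

Because every instruction in the listing shares the same upper bits (0x8048...), the three remaining hex digits are enough to tell the instructions apart in the figures.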
8.5.2. Tracing through main
Figure 3 shows the execution stack immediately prior to the execution of main()
:

Recall that the stack grows toward lower addresses. In this example, %ebp
is address 0x140
, and %esp
is address 0x130
(both of these values are made up
for this example). Registers %eax
and %edx
initially contain junk values. The arrow
indicates the currently executing instruction. Initially, %eip contains address 0x42d,
which is the program memory address of the first instruction in the main() function. Let’s trace
through the program’s execution together.

The first instruction pushes the value of %ebp onto the stack, saving address 0x140. Since
the stack grows toward lower addresses, the stack pointer %esp updates to 0x12c, which is 4 bytes less than 0x130.
Register %eip advances to the next instruction in sequence.

The next instruction (mov %esp, %ebp
) updates the value of %ebp
to be the same as %esp
.
The frame pointer (%ebp
) now points to the start of the stack frame for the
main()
function. %eip
advances to the next instruction in sequence.

The sub
instruction subtracts 0x14
from the address of our stack
pointer, "growing" the stack by 20 bytes. Register %eip
advances to the next instruction, which is
the first call
instruction.

The call <assign>
instruction pushes the value inside register
%eip
(which denotes the address of the next instruction to execute) onto the
stack. Since the next instruction after call <assign>
has an address
of 0x43b
, that value is pushed onto the stack as the return address.
Recall that the return address indicates the program address where execution should
resume when program execution returns to main()
.
Next, the call instruction moves the address of the assign() function (0x40d) into
register %eip, signifying that program execution should continue
in the callee function assign() rather than at the next instruction in program sequence.
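The machine code bytes in the listing show where both addresses come from. A near call of this form is encoded as the opcode 0xe8 followed by a 32-bit little-endian displacement, and the target is the address of the next instruction plus that displacement. The sketch below works through that arithmetic for the calls in main(); the function names are hypothetical, and it assumes the 5-byte 0xe8 encoding shown in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the instruction that follows a 5-byte call located at `at`.
 * This is the value that call pushes as the return address. */
uint32_t call_return_address(uint32_t at) {
    return at + 5;
}

/* Decode the target of a call encoded as 0xe8 plus a little-endian
 * 32-bit displacement. */
uint32_t call_target(uint32_t at, const uint8_t bytes[5]) {
    assert(bytes[0] == 0xe8);
    uint32_t disp = (uint32_t)bytes[1]
                  | (uint32_t)bytes[2] << 8
                  | (uint32_t)bytes[3] << 16
                  | (uint32_t)bytes[4] << 24;
    /* Unsigned wraparound gives the same result as adding a signed displacement. */
    return call_return_address(at) + disp;
}
```

For the first call, located at 0x8048436 with bytes e8 d2 ff ff ff, the return address is 0x804843b and the displacement 0xffffffd2 is -46, which gives the target 0x804840d, the address of assign().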

The first two instructions that execute in the assign()
function are the usual book-keeping
that every function performs. The first instruction pushes the value stored in %ebp
(memory
address 0x12c
) onto the stack. Recall that this address points to the beginning of the
stack frame for main()
. %eip
advances to the second instruction in assign()
.

The next instruction (mov %esp, %ebp
) updates %ebp
to point to the top of the stack,
marking the beginning of the stack frame for assign()
. The instruction pointer (%eip
) advances to
the next instruction in the assign()
function.

The sub
instruction at address 0x410
grows the stack by 16 bytes,
creating extra space on the stack frame to store local values and updating
%esp
. The instruction pointer again advances to the next instruction in the assign()
function.

The mov instruction at address 0x413 moves the value $0x28 (or 40) onto the stack
at location -0x4(%ebp), which is 4 bytes lower in memory than the address held in the
frame pointer (address 0x10c, since %ebp holds 0x110).
Recall that the frame pointer is commonly used to reference locations
on the stack. %eip advances to the next instruction in the assign() function.

The mov instruction at address 0x41a places the value $0x28 into register %eax, which
holds the return value of the function. %eip advances to the leave
instruction in the assign() function.

At this point, the assign() function has almost completed execution.
The next instruction that executes is the leave instruction, which
prepares the stack for returning from the function call. Recall that leave
is analogous to the following pair of instructions:
mov %ebp, %esp
pop %ebp
In other words, the CPU first overwrites the stack pointer with the frame
pointer. In our example, the stack pointer is updated from 0x100 to 0x110.
Next, the CPU executes pop %ebp, which takes the
value located at 0x110 (in our example, the address 0x12c) and places
it in %ebp. Recall that 0x12c is the start of the stack frame for main().
%esp becomes 0x114, and %eip points to the ret instruction in the
assign() function.

The last instruction in assign()
is a ret
instruction. When ret
executes, the return address is popped off the stack into register %eip
.
In our example, %eip
now advances to the call to the adder()
function.
Some important things to notice at this juncture:
- The stack pointer and frame pointer have been restored to their values prior to the call to assign(), reflecting that the stack frame for main() is once again the active frame.
- The old values on the stack from the prior active stack frame are not removed. They still exist on the call stack.

The call to adder() overwrites the old return address on the stack
with a new return address (0x440). This return address points to the next
instruction to be executed after adder() returns, or mov %eax, 0xc(%esp).
%eip reflects the first instruction to execute in adder(), which is at
address 0x41f.

The first instruction in the adder()
function saves the caller’s frame pointer
(%ebp
of main()
) on the stack.

The next instruction updates %ebp
with the current value of %esp
, or address 0x110
.
Together, these last two instructions establish the beginning of the stack frame for adder()
.

The sub
instruction at address 0x422
"grows" the stack by 16 bytes. Notice
again that growing the stack does not impact any previously created values
on the stack. Again, old values will litter the stack until they are
overwritten.

Pay close attention to the next instruction that executes: mov -0x4(%ebp), %eax.
This instruction moves an old value that is on the stack into register %eax!
This is a direct result of the fact that the programmer forgot to initialize a
in the function adder().

The add instruction at address 0x428 adds 2 to register %eax. Recall that IA32 passes
the return value through register %eax. Together, the last two instructions are
equivalent to the following code in adder():
int a;
return a + 2;
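Reading an uninitialized local variable is not something a C program can rely on, so the sketch below models the stack explicitly instead of reproducing the bug. It replays the frame setup and teardown for assign() and adder() on a simulated stack and shows that both functions use the same slot for -0x4(%ebp). The Sim struct, the helper names, and the simulated address range are assumptions made for illustration; the starting register values are the ones used in this trace (%esp is 0x12c - 0x14 = 0x118 in main() before the first call).

```c
#include <assert.h>
#include <stdint.h>

/* A tiny simulated stack: 4-byte slots covering addresses 0x100..0x13c.
 * The layout and helper names are illustrative assumptions. */
#define SIM_BASE  0x100u
#define SIM_WORDS 16u

typedef struct {
    uint32_t esp, ebp, eax;
    uint32_t mem[SIM_WORDS];
} Sim;

uint32_t *sim_slot(Sim *s, uint32_t addr) {
    assert(addr % 4 == 0 && addr >= SIM_BASE && addr < SIM_BASE + 4 * SIM_WORDS);
    return &s->mem[(addr - SIM_BASE) / 4];
}

/* The push performed by call, followed by the callee's prologue. */
void sim_enter(Sim *s, uint32_t return_addr) {
    s->esp -= 4; *sim_slot(s, s->esp) = return_addr;  /* call: push %eip  */
    s->esp -= 4; *sim_slot(s, s->esp) = s->ebp;       /* push %ebp        */
    s->ebp = s->esp;                                  /* mov %esp, %ebp   */
    s->esp -= 0x10;                                   /* sub $0x10, %esp  */
}

/* leave followed by ret; nothing is erased from the old frame. */
void sim_exit(Sim *s) {
    s->esp = s->ebp;                                  /* mov %ebp, %esp   */
    s->ebp = *sim_slot(s, s->esp); s->esp += 4;       /* pop %ebp         */
    s->esp += 4;                                      /* ret: pop %eip    */
}

void sim_assign(Sim *s) {
    sim_enter(s, 0x43b);
    *sim_slot(s, s->ebp - 4) = 0x28;                  /* movl $0x28, -0x4(%ebp) */
    s->eax = *sim_slot(s, s->ebp - 4);                /* mov -0x4(%ebp), %eax   */
    sim_exit(s);
}

void sim_adder(Sim *s) {
    sim_enter(s, 0x440);
    s->eax = *sim_slot(s, s->ebp - 4);                /* reads the stale slot   */
    s->eax += 2;                                      /* add $0x2, %eax         */
    sim_exit(s);
}

/* Run main's two calls and return what adder() leaves in %eax. */
uint32_t sim_run(void) {
    Sim s = {0};
    s.esp = 0x118;  /* %esp in main() just before the first call */
    s.ebp = 0x12c;  /* %ebp for main()'s frame */
    sim_assign(&s);
    sim_adder(&s);
    return s.eax;
}
```

Calling sim_run() from a main() function returns 42 (0x2A): both callees start from the same %esp, so their frames overlap exactly and adder() reads the 40 that assign() left behind at address 0x10c.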

After leave
executes, the frame pointer once again points to the beginning of the stack
frame for main()
, or address 0x12c
. The stack pointer now stores the address 0x114
.

The execution of ret pops the return address off of the stack, restoring the instruction
pointer back to 0x440, or the address of the next instruction to execute in main().
The stack pointer %esp now holds address 0x118.

The mov %eax, 0xc(%esp) instruction places the value in %eax at location %esp+12
(address 0x124), which is 12 bytes (three 4-byte slots) away from the top of the stack.

Skipping ahead a little, the mov instructions at addresses 0x444 and 0x448 set
%eax to the value saved at location %esp+12 (or 0x2A) and place 0x2A one spot
below the top of the stack (address %esp+4, or 0x11c).

The next instruction (movl $0x80484f4, (%esp)) copies a constant value that is a memory address
to the top of the stack. This particular memory address, 0x80484f4, contains the string "x is: %d\n".
The instruction pointer advances to the call to the printf() function (which is denoted with
the label <printf@plt>).

For the sake of brevity, we will not trace the printf() function, which is declared in stdio.h. However, we know from
the manual page (man -s3 printf) that printf() has the following format:
int printf(const char *format, ...);
In other words, the first argument is a pointer to a string
specifying the format, and the second argument onward specify
the values that are used in that format. The instructions
at addresses 0x444 through 0x453 correspond to the following
line in the main() function:
printf("x is: %d\n", x);
When the printf() function is called:
- A return address specifying the instruction that executes after the call to printf() is pushed onto the stack.
- The value of %ebp is pushed onto the stack, and %ebp is updated to point to the top of the stack, indicating the beginning of the stack frame for printf().
At some point, printf() references its arguments, which are
the string "x is: %d\n" and the value 0x2A. Recall that the return address is located directly
below %ebp at location %ebp+4. The first argument is thus located at %ebp+8 (i.e.,
right below the return address), and the second argument is located at %ebp+12.
For any function with n arguments, gcc places the first argument at location %ebp+8,
the second at %ebp+12, and the nth argument at location (%ebp+8) + (4*(n-1)).
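The offset formula, and the reason the values main() stored at (%esp) and 0x4(%esp) show up at %ebp+8 and %ebp+12 in the callee, can be checked with a little arithmetic. The sketch below assumes 4-byte arguments and the standard prologue described above; the function names arg_offset and callee_ebp are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offset from %ebp of the nth argument (n counts from 1), assuming
 * 4-byte arguments as in the text: 8 + 4*(n-1). */
uint32_t arg_offset(uint32_t n) {
    assert(n >= 1);
    return 8 + 4 * (n - 1);
}

/* Given the caller's %esp just before the call, return the callee's %ebp
 * once its prologue (push %ebp; mov %esp, %ebp) has run. */
uint32_t callee_ebp(uint32_t caller_esp) {
    uint32_t esp = caller_esp;
    esp -= 4;    /* call pushes the return address */
    esp -= 4;    /* push %ebp                      */
    return esp;  /* mov %esp, %ebp                 */
}
```

With the caller's %esp at 0x118, callee_ebp() gives 0x110, so the first argument at %ebp+8 is address 0x118 (where main() stored the format string's address) and the second at %ebp+12 is address 0x11c (where main() stored 0x2A).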
After the call to printf(), the value 0x2A is output to the user
in integer format. Thus, the value 42 is printed to the screen!

After the call to printf()
, the last few instructions clean up the stack and prepare a clean
exit from the main()
function. First the value 0x0
is placed in register %eax
, signifying that
the value 0 is returned from main()
. Recall that a program returns 0
to indicate correct termination.

Once leave
and ret
are executed, the stack and frame pointers revert to their original values prior
to the execution of main()
. With 0x0
in the return register %eax
, the program returns 0.
If you have carefully read through this section, you should understand why our
program prints out the value 42
. In essence, the program inadvertently uses old values
on the stack to cause it to behave in a way that we didn’t expect. While this example was pretty
harmless, we discuss in future sections how hackers have misused function calls to make programs
misbehave in truly malicious ways.