Laboratory 12 - Theory
Multi-module programming (asm+C)
Motivation:
- high execution speed in resolving tasks with minimal resource consumption;
Call code
Entry code
Return code
- Restoring nonvolatile altered resources;
- Removing local variables of the function;
- Destroying the stack frame;
- Returning to the calling code and removing the parameters.
Declaring extern symbols:
- To access a function written in assembly language from a C program, the function must be declared global in the assembly program, and its name must be prefixed with the character '_'.
- If the function will be called from the C program as fun(), then the asm program will contain the following:
    global _fun
    segment code public code use32
    _fun:
Keeping the value of some registers untainted
High-level languages require that certain registers have the same value after a function call as before the function call. For this purpose, if the subprogram defined in assembly language changes some of these registers, their values at the entry point need to be saved (for example on the stack). These values are restored before returning from the procedure.
- PUSHAD and POPAD can be used for storing and restoring the values of the 8 general-purpose registers.
Passing parameters to the function
- Parameters are passed using the stack, which offers a greater flexibility than passing parameters using registers (regarding the number of parameters);
Establishing the stack frame
- When entering the function we set the register EBP←ESP. Before exiting the function we will restore this value.
Because ESP changes whenever something is pushed on the stack, the best way to access the values of the parameters is through a base or an index register. EBP is the most suitable for this purpose, because addressing through EBP automatically refers to the stack segment.
The sequence that prepares the stack access is:
    push ebp
    mov ebp, esp
Reserving memory space for local defined data
Sometimes the procedure needs local data. If their values do not need to be preserved between two consecutive function calls, they are volatile data and are stored on the stack. Otherwise they are static data and are stored in a segment other than the stack segment, for instance in the data segment. Reserving n bytes (n being a multiple of 4) for local data is done by decreasing ESP; the reserved space is then accessed relative to EBP:
    sub esp, n
Hence:
- the EBP register will be used to access parameters (for example [EBP+8] accesses the first parameter represented on 32 bits);
- the first parameter accessible from the stack is the last parameter added on the stack by the caller program;
- we reserve space on the stack for local variables, for example:
sub esp,4*1
- this method simplifies the way of accessing parameters, especially for functions with a variable number of parameters;
- it is the responsibility of the caller (the programmer, when the call is written in assembly) to remove the parameters from the stack after the call.
Returning values from the function
- if the function returns an integer, then this will be returned in EAX;
- if the function returns a string, then its address will be returned in EAX;
- under the CDECL convention, it is assumed that the registers EBX, ESI, EDI, EBP and ESP have the same values after the function call as before it.
Returning from the procedure
When returning from the procedure the following steps are necessary:
- restoring the values of the registers (see section Keeping the value of some registers untainted);
- restoring the stack so that it contains the return address on top:
    mov esp, ebp
    pop ebp
Structure of a function:
    global _fun
    segment code public code use32
    _fun:
        push ebp
        mov ebp, esp
        pushad
        ; ... code of the function ...
        popad
        mov eax, returned_value    ; set after popad, which also restores EAX
        mov esp, ebp
        pop ebp
        ret
Using procedures defined in assembly within a C program
Example 1
In an assembly program we define a procedure called hello_world that does not have any parameters and does not return anything. The procedure prints the message "Hello World!" on the screen.
hello_world.asm:
    bits 32
    extern _printf
    global _hello_world

    segment data public data use32
        mesaj db 'Hello world!', 0

    segment code public code use32
    _hello_world:
        push ebp
        mov ebp, esp
        push dword mesaj
        call _printf
        add esp, 4*1
        pop ebp
        ret

hello_world.c:
    #include <stdio.h>

    void hello_world();

    int main()
    {
        hello_world();
        printf("This program just prints something on the screen!");
        return 0;
    }
Observe the keyword extern, which tells the assembler (or, in C, the compiler) that the function or variable is defined in a different file, not in the current one. It is the linker's job to connect this declaration of the function or variable with its definition.
Example 2
In an assembly program we define a procedure called return_10, which has no parameters and returns an integer.
return_10.asm:
    bits 32
    global _return_10

    segment data public data use32

    segment code public code use32
    _return_10:
        mov eax, 10
        ret

return_10.c:
    #include <stdio.h>

    int return_10();

    int main()
    {
        printf("The program returns the value %d!", return_10());
        return 0;
    }
Example 3
In an assembly program we define a procedure called sum which has two integer parameters and returns their sum (an integer).
sum.asm:
    bits 32
    global _sum

    segment data public data use32

    segment code public code use32
    _sum:
        push ebp
        mov ebp, esp
        mov eax, [ebp+8]
        add eax, [ebp+12]
        mov esp, ebp
        pop ebp
        ret

sum.c:
    #include <stdio.h>

    int sum(int, int);

    int main()
    {
        printf("%d\n", sum(2, 3));
        return 0;
    }
Example 4
In an assembly program we define a procedure called factorial which has a positive integer as parameter and returns its factorial (a positive integer).
factorial.asm:
    bits 32
    global _factorial

    segment data public data use32

    segment code public code use32
    _factorial:
        push ebp
        mov ebp, esp
        sub esp, 4              ; reserve one local variable, m, at [ebp-4]
        mov eax, [ebp+8]        ; eax = n
        cmp eax, 2
        jbe .trivial            ; for n <= 2 the result is n itself
    .recursiv:
        dec eax
        push eax
        call _factorial         ; eax = (n-1)!
        add esp, 4              ; remove the parameter
        mov [ebp-4], eax        ; m = (n-1)!
        mov eax, [ebp+8]        ; n
        mul dword [ebp-4]       ; edx:eax ← n * m
        jmp .final
    .trivial:
        xor edx, edx
    .final:
        add esp, 4              ; release the local (mov esp, ebp would also do it)
        mov esp, ebp
        pop ebp
        ret

factorial.c:
    #include <stdio.h>

    int factorial(int);

    int main()
    {
        int n, f;
        printf("n = ");
        scanf("%d", &n);
        f = factorial(n);
        printf("factorial(%d) = %d\n", n, f);
        return 0;
    }
Multi-module programming (asm+C) in Visual Studio
The following tutorial is based on Visual Studio 2015; it is assumed that you have a version of Visual Studio installed on your computer.
For more details, open the Files section of the General channel in MS TEAMS and follow the steps presented in the document procedura_instalare.doc.
The following example shows how to compile, run and debug the program from the Example section.
We use the command line for compiling/assembling the modules
The steps used for compiling the main.c program are:
- Open the Visual Studio command line: in the Windows Start menu, navigate to Visual Studio and choose the option VS2015 x86 Native Tools Command Prompt, as in the figure below.
- In the terminal window navigate to the directory where the program sources are located. In the following example the sources are in the tmp folder; the dir command lists the content of the current directory. Besides the source files of the program, the tmp directory also contains the executable nasm.exe, used for assembling modulAsm.asm.
- In the first step we assemble modulAsm.asm using the command:
    nasm modulAsm.asm -fwin32 -o modulAsm.obj
  (see the figure below). The result is the file modulAsm.obj.
- Using the Visual Studio compiler (cl.exe) we compile main.c. This step must also include linking, so we pass the /link option followed by the file modulAsm.obj (cl main.c /link modulAsm.obj). The result is the program main.exe.
- The program can be executed from command line using main.exe:
We can debug the program using Ollydbg, but in order to do this we must specify it in the assembly/compile step:
> nasm modulAsm.asm -fwin32 -g -o modulAsm.obj
> cl /Z7 main.c /link modulAsm.obj
From Ollydbg, File -> Open we open main.exe.
In Visual C if we wish to include debugging information the options are /Z{7|i|I} (see https://msdn.microsoft.com/en-us/library/958x11bc.aspx).