Jul 6, 2014

C Programming #37: Function - Stack

This article will discuss about how the function call is achieved in C language.

Function call uses a kind of data structure called as stack. We are really interested how stack helps us to 

  1. To setup local variable of a function ?
  2. How are actual parameters passed to function ?
  3. How are formal parameters setup of a function ?
Hence understanding stack is very critical in understanding Function call (And in future recursion also). 


Stack is type of data structure (memory[data] managed and handled in particular way). Lets take Stack with 10 items. We are assuming 64 bit machine.
Hence address will be 64 bit and each stack item is 8 bytes in length. Say Stack starts at address 0x20000000 it would look as follow. Since there are no items in the stack at the moment Top of Stack is pointing to address 0x2000000



There are two operation with respect to Stack 


  1.  Push -> item is pushed into head.
  2.  Pop  -> item is poped out of head.


Lets see how the Stack dynamically changes with operation push(0x1000), push(0x2000), pop().




See how top of stack moves, observe that, last item that was pushed 0x2000 was first to pop out. Hence operation in Stack is Last In First Out (LIFO).

Lets see how does the function call is realized using Stack. Lets take following example



#include <stdio.h>
void function_x(int);
int main()
{
   int a = 10, b = 20;
   
   printf("Executing main(before) a = %d, b = %d\n",a ,b); 
   function_x(a);
   printf("Executing main(after) a = %d, b = %d\n",a ,b); 

   return 0;
}

void function_x(int x)
{
   int c = 30, d = 40;
   printf("Executing function_x c = %d, d = %d, x = %d\n",c ,d, x); 
   return;
}

Setting up of stack can be better understood when we see the equivalent assembly. Lets go to assembly with following command. Output sample.s will be generated.

gcc -S sample.c

If you are unaware of the assembly of x86-64 then please go though this article. sample.s generated in my Linux box is as follows.

   .file "sample.c"
              
   .section .rodata
   .align 8
.LC0:
   .string  "Executing main(before) a = %d, b = %d\n"
   .align 8
.LC1:
   .string  "Executing main(after) a = %d, b = %d\n"
   .text
   .globl   main
   .type main, @function
main:
.LFB0:
   .cfi_startproc
   pushq %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq  %rsp, %rbp
   .cfi_def_cfa_register 6
   subq  $16, %rsp
   movl  $10, -8(%rbp)
   movl  $20, -4(%rbp)
   movl  $.LC0, %eax
   movl  -4(%rbp), %edx
   movl  -8(%rbp), %ecx
   movl  %ecx, %esi
   movq  %rax, %rdi
   movl  $0, %eax
   call  printf
   movl  -8(%rbp), %eax
   movl  %eax, %edi
   call  function_x
   movl  $.LC1, %eax
   movl  -4(%rbp), %edx
   movl  -8(%rbp), %ecx
   movl  %ecx, %esi
   movq  %rax, %rdi
   movl  $0, %eax
   call  printf
   movl  $0, %eax
   leave
   .cfi_def_cfa 7, 8
   ret
   .cfi_endproc
.LFE0:
   .size main, .-main
   .section .rodata
   .align 8
.LC2:
   .string  "Executing function_x c = %d, d = %d, x = %d\n"
.text
   .globl   function_x
   .type function_x, @function
function_x:
.LFB1:
   .cfi_startproc
   pushq %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq  %rsp, %rbp
   subq  $32, %rsp
   movl  %edi, -20(%rbp)
   movl  $30, -8(%rbp)
   movl  $40, -4(%rbp)
   movl  $.LC2, %eax
   movl  -20(%rbp), %ecx
   movl  -4(%rbp), %edx
   movl  -8(%rbp), %esi
   movq  %rax, %rdi
   movl  $0, %eax
   call  printf
   leave
   .cfi_def_cfa 7, 8
   ret
   .cfi_endproc
.LFE1:
   .size function_x, .-function_x
   .ident   "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
   .section .note.GNU-stack,"",@progbits

Lets not go line by line of assembly have a look at some chunk of code

How is local variable setup in main ?

In main we have two local variable a and b having value 10 and 20 in them.
In assembly this stack is modified to make room of local variables
   pushq %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq  %rsp, %rbp
   .cfi_def_cfa_register 6
   subq  $16, %rsp
   movl  $10, -8(%rbp)
   movl  $20, -4(%rbp)

In assembly the stack accessed and control by two registers


  1. Stack pointer - rsp (r tells it is 64 bit register)
  2. Base pointer - rbp 

Stack pointer always points to top of stack. Base pointer always points to start of stack from where local variables/ parameters of function are stored. The memory between BP and SP is often called frame of a function.  Always rbp is initialized which will help referring to all local variables. Now, lets go through each line by line 

   pushq %rbp                   -> Old rbp is saved in Stack by pushing it.
   .cfi_def_cfa_offset 16       -> Ignore it
   .cfi_offset 6, -16           -> Ignore it
   movq  %rsp, %rbp             -> Top of stack in the new rbp (here is where main frame starts)
.cfi_def_cfa_register 6         -> Ignore it
   subq  $16, %rsp              -> Top of stack is moved by 16 bytes to accomadate local variables.
   movl  $10, -8(%rbp)          -> rbp - 8, will contain variable a, and it will contain intial value 10
   movl  $20, -4(%rbp)          -> rbp - 4, will contain variable b, and it will contain intial value 20 
   
How are actual-parameters passed to function_x ?
In x86-64 the formal- arguments are passed via registers and stack. First argument is passed via rdi then rdi, rdx, rcx, r8 and r9. If there is 7th argument then it is passed directly pushing to stack.

   movl  -8(%rbp), %eax   -> Copies the value of variable a (rbp - 8) to eax first
   movl  %eax, %edi       -> same value (variable a) is copied to edi (first argument)
   call  function_x       -> Then the function_x is called.

How are formal-parameters received in function_x ?
See the assembly code after label function_x, this time i have removed the unwanted code

   pushq %rbp             -> Old rbp is pushed to stack to be restored later
   movq  %rsp, %rbp       -> New rbp is set (Our frame of function_x starts here)
   subq  $32, %rsp        -> New Frame of 32 bytes is allocated.
   movl  %edi, -20(%rbp)  -> First actual arugment is copied to rbp - 20 (variable x) formal argument
   movl  $30, -8(%rbp)    -> Local variable c is set up at rbp - 8
   movl  $40, -4(%rbp)    -> Local variable d is set up at rbp - 4

How will C come to know where to return after executing the function ?
call function_x will push the PC value to rsp implicitly. And the instruction ret will pop it out back to PC to return to place after the function call.

How is Frame restored when it returns from the function ?

There is instruction called leave at the end of the function which is equivalent to 

   mov rsp,rbp
   pop rbp

mov rsp,rbp, here rsp is moved back to start of the frame, by this entire frame is released. Then poping rbp will copy the old rbp in stack back to rsp, (hence setting up the old frame).

If you want one more perspective and more detailed explanation then visit here


Links

Next Article - C Programming #38: Function - recursion
Previous Article - C Programming #36: Parameterised Macro
All Article - C Programming

No comments :

Post a Comment