Lecture 25 - Functions II
Goals
- Learn about the roles of all of the ARM32 registers
- Learn what happens to LR when we have to call a second function
- Learn what happens when we have more parameters than registers
Registers
I was asked about r12, since we have identified a couple of special registers:
- r11: FP
- r13: SP
- r14: LR
- r15: PC
For the others:
- r0-r3: arguments and result (r0)
- r4-r10: general purpose registers (callee-saved)
- r12: intra-procedure scratch register (IP); rarely used, except by linker-generated stubs for calls into linked-in libraries
Importantly, r13-r15 are all hardware-level special-purpose registers. They are tied to specific instructions in the ISA, so they have to be wired specially. r11 and r12 are also special, but only by convention: their roles are set by the calling convention the compiler follows, and a different compiler could make a different choice.
Saving registers across calls
I listed r4-r10 as “callee saved”. What does that mean?
The way we think about calling a function is that we expect the system to go off, perform some computation, and then return us to exactly where we were, with the result. It should not affect our local environment at all. All of the code we have been looking at has been unoptimized: all of our local variables are stored on the stack and fetched from the stack every time they are needed. More optimized code would make better use of our registers, which are faster than main memory.
The downside of leaving variables in registers, however, is that the function we are calling may want to use the same registers. To preserve the values, they will need to be written into memory. There are a variety of strategies we can employ for this.
In our unoptimized approach, the source of truth for every variable is already in memory, so we don’t worry about it.
We can make it the responsibility of the caller to save any register values it wants to still have when the function call completes. We already see this with r0-r3: since they are used to pass arguments and return a value, the caller knows that whatever is in them at the start of the call may not be there when it comes back. The general pattern with caller-saved registers is that the calling function saves the value on the stack before making the call, and then reads the value back into the register after the call returns.
The other policy is to make registers callee-saved. With these registers, the calling function can load a value into one of them and expect that the value will still be there when the call returns. If the callee (the function being called) wants to use a callee-saved register, it must preserve the value in it. The pattern is to push the register’s value onto the stack at the start of the function, use the register however it wants, and then load the value back into the register just before the function returns. To the caller, it will appear as if the register was never touched.
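The two policies can be sketched with a toy model in C. This is a simulation, not real ARM code: the machine struct and the push, pop, caller, and callee functions are made-up names for illustration, and the register choices (r2 as a caller-saved value, r4 as a callee-saved one) are just examples.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64u
enum { SP = 13 };

/* Toy machine: 16 registers and a small descending stack (illustrative only). */
struct machine {
    uint32_t r[16];
    uint32_t stack[STACK_WORDS];
};

void machine_init(struct machine *m)
{
    for (int i = 0; i < 16; i++)
        m->r[i] = 0;
    m->r[SP] = STACK_WORDS;          /* sp is a word index; stack starts empty */
}

void push(struct machine *m, int reg)
{
    assert(m->r[SP] > 0);            /* guard against overflow */
    m->r[SP]--;
    m->stack[m->r[SP]] = m->r[reg];
}

void pop(struct machine *m, int reg)
{
    assert(m->r[SP] < STACK_WORDS);  /* guard against underflow */
    m->r[reg] = m->stack[m->r[SP]];
    m->r[SP]++;
}

/* The callee wants r4 as scratch, and also clobbers r2 and r0. */
void callee(struct machine *m)
{
    push(m, 4);                      /* callee-saved: preserve r4 first */
    m->r[4] = 1234;                  /* now use r4 freely */
    m->r[2] = 99;                    /* r0-r3 may be overwritten without saving */
    m->r[0] = m->r[4] + 1;           /* result goes in r0 */
    pop(m, 4);                       /* restore r4 before returning */
}

/* The caller keeps one value in r2 (caller-saved) and one in r4 (callee-saved). */
uint32_t caller(struct machine *m)
{
    m->r[2] = 10;
    m->r[4] = 20;
    push(m, 2);                      /* caller-saved: save r2 before the call */
    callee(m);
    pop(m, 2);                       /* and read it back afterwards */
    return m->r[2] + m->r[4];        /* both values survived the call */
}
```

After machine_init, calling caller returns 30: r2 survives because the caller saved it, and r4 survives because the callee did. Removing either push/pop pair changes the result.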
Multiple return addresses
If the return address is in a register, don’t we have a problem if we call a second function from the first?
Yes, we certainly do. What should we do?
Save the return address on the stack
Let’s take a look at some code with multiple functions
int sum(int x, int y){
int result;
result = x + y;
return result;
}
int do_math(int x, int y){
return 2 * sum(x,y);
}
int main(int argc, char * argv[]){
int a,b,c;
a = 1;
b = 2;
c = do_math(a,b);
}
After we have produced the object code
$ arm-none-eabi-gcc -c func2.c
$ arm-none-eabi-objdump -d -j .text func2.o > func2.s
we get this code:
func2.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <sum>:
0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
4: e28db000 add fp, sp, #0
8: e24dd014 sub sp, sp, #20
c: e50b0010 str r0, [fp, #-16]
10: e50b1014 str r1, [fp, #-20] ; 0xffffffec
14: e51b2010 ldr r2, [fp, #-16]
18: e51b3014 ldr r3, [fp, #-20] ; 0xffffffec
1c: e0823003 add r3, r2, r3
20: e50b3008 str r3, [fp, #-8]
24: e51b3008 ldr r3, [fp, #-8]
28: e1a00003 mov r0, r3
2c: e28bd000 add sp, fp, #0
30: e49db004 pop {fp} ; (ldr fp, [sp], #4)
34: e12fff1e bx lr
00000038 <do_math>:
38: e92d4800 push {fp, lr}
3c: e28db004 add fp, sp, #4
40: e24dd008 sub sp, sp, #8
44: e50b0008 str r0, [fp, #-8]
48: e50b100c str r1, [fp, #-12]
4c: e51b100c ldr r1, [fp, #-12]
50: e51b0008 ldr r0, [fp, #-8]
54: ebfffffe bl 0 <sum>
58: e1a03000 mov r3, r0
5c: e1a03083 lsl r3, r3, #1
60: e1a00003 mov r0, r3
64: e24bd004 sub sp, fp, #4
68: e8bd4800 pop {fp, lr}
6c: e12fff1e bx lr
00000070 <main>:
70: e92d4800 push {fp, lr}
74: e28db004 add fp, sp, #4
78: e24dd018 sub sp, sp, #24
7c: e50b0018 str r0, [fp, #-24] ; 0xffffffe8
80: e50b101c str r1, [fp, #-28] ; 0xffffffe4
84: e3a03001 mov r3, #1
88: e50b3008 str r3, [fp, #-8]
8c: e3a03002 mov r3, #2
90: e50b300c str r3, [fp, #-12]
94: e51b100c ldr r1, [fp, #-12]
98: e51b0008 ldr r0, [fp, #-8]
9c: ebfffffe bl 38 <do_math>
a0: e50b0010 str r0, [fp, #-16]
a4: e3a03000 mov r3, #0
a8: e1a00003 mov r0, r3
ac: e24bd004 sub sp, fp, #4
b0: e8bd4800 pop {fp, lr}
b4: e12fff1e bx lr
I want to take a look at the start of the do_math function
push {fp, lr}
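We already saw push used to put the fp on the stack. Here it pushes both fp and lr. Notice how we pop both at the end as well. sum makes no calls of its own, so its lr is never overwritten and it only needs to push fp. do_math calls sum, and that bl overwrites lr, so do_math has to save its own return address first.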
Why do we sometimes pass the value in a register, and sometimes store it on the stack?
Registers are fast. We don’t have to go out to memory, so we have a gain there, and the storage that registers are built from is itself much faster to read and write than main memory.
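A toy model in C can make the lr problem concrete. This is only a sketch: the cpu struct and the bl, leaf, and non_leaf functions are hypothetical names, and the save_lr flag exists purely to show the failure case. The addresses 0xa0 and 0x58 are the instructions following the two bl instructions in the listing above.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Toy CPU state: just the link register and a small descending stack. */
struct cpu {
    uint32_t lr;
    uint32_t sp;                     /* index of the top word */
    uint32_t stack[STACK_WORDS];
};

void cpu_init(struct cpu *c)
{
    c->lr = 0;
    c->sp = STACK_WORDS;             /* empty stack */
}

/* bl: remember where to come back to, overwriting the old lr. */
void bl(struct cpu *c, uint32_t return_addr)
{
    c->lr = return_addr;
}

/* Like sum: a leaf function. It makes no calls, so lr is untouched.
   Returns the address that "bx lr" would jump to. */
uint32_t leaf(const struct cpu *c)
{
    return c->lr;
}

/* Like do_math: calls another function. With save_lr == 0 the push/pop
   of lr is skipped, to show what goes wrong. */
uint32_t non_leaf(struct cpu *c, int save_lr)
{
    if (save_lr) {
        assert(c->sp > 0);
        c->stack[--c->sp] = c->lr;   /* push {lr} */
    }

    bl(c, 0x58);                     /* bl sum: lr now points into do_math */
    uint32_t back = leaf(c);         /* sum returns to 0x58 */
    assert(back == 0x58);
    (void)back;

    if (save_lr) {
        assert(c->sp < STACK_WORDS);
        c->lr = c->stack[c->sp++];   /* pop {lr} */
    }
    return c->lr;                    /* bx lr */
}
```

If main does bl(&c, 0xa0) and then runs non_leaf with save_lr set, the function returns to 0xa0 as intended. With save_lr cleared, it "returns" to 0x58, which is inside do_math itself.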
Parameters revisited
We have a similar issue with the parameters
We saw earlier that the parameters were being passed in registers, but what happens when we have more parameters than registers?
int f(int a, int b, int c, int d, int e, int f, int g, int h)
{
int result = a + b + c + d + e + f + g + h;
return result;
}
int main(int argc, char *argv[])
{
int c = f(0, 1, 2, 3, 4, 5, 6, 7);
return c;
}
00000000 <f>:
0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
4: e28db000 add fp, sp, #0
8: e24dd014 sub sp, sp, #20
c: e50b0008 str r0, [fp, #-8]
10: e50b100c str r1, [fp, #-12]
14: e50b2010 str r2, [fp, #-16]
18: e50b3014 str r3, [fp, #-20] ; 0xffffffec
1c: e51b2008 ldr r2, [fp, #-8]
20: e51b300c ldr r3, [fp, #-12]
24: e0822003 add r2, r2, r3
28: e51b3010 ldr r3, [fp, #-16]
2c: e0822003 add r2, r2, r3
30: e51b3014 ldr r3, [fp, #-20] ; 0xffffffec
34: e0822003 add r2, r2, r3
38: e59b3004 ldr r3, [fp, #4]
3c: e0822003 add r2, r2, r3
40: e59b3008 ldr r3, [fp, #8]
44: e0822003 add r2, r2, r3
48: e59b300c ldr r3, [fp, #12]
4c: e0822003 add r2, r2, r3
50: e59b3010 ldr r3, [fp, #16]
54: e0823003 add r3, r2, r3
58: e1a00003 mov r0, r3
5c: e28bd000 add sp, fp, #0
60: e49db004 pop {fp} ; (ldr fp, [sp], #4)
64: e12fff1e bx lr
00000068 <main>:
68: e92d4800 push {fp, lr}
6c: e28db004 add fp, sp, #4
70: e24dd020 sub sp, sp, #32
74: e50b0010 str r0, [fp, #-16]
78: e50b1014 str r1, [fp, #-20] ; 0xffffffec
7c: e3a03007 mov r3, #7
80: e58d300c str r3, [sp, #12]
84: e3a03006 mov r3, #6
88: e58d3008 str r3, [sp, #8]
8c: e3a03005 mov r3, #5
90: e58d3004 str r3, [sp, #4]
94: e3a03004 mov r3, #4
98: e58d3000 str r3, [sp]
9c: e3a03003 mov r3, #3
a0: e3a02002 mov r2, #2
a4: e3a01001 mov r1, #1
a8: e3a00000 mov r0, #0
ac: ebfffffe bl 0 <f>
b0: e50b0008 str r0, [fp, #-8]
b4: e3a03000 mov r3, #0
b8: e1a00003 mov r0, r3
bc: e24bd004 sub sp, fp, #4
c0: e8bd4800 pop {fp, lr}
c4: e12fff1e bx lr
We can see that r0-r3 were used for the first four values. The remaining four are stored on the stack, at [sp] through [sp, #12], before the call.
What’s more, these are being stored within main’s stack frame. At the beginning you can see that we reserve a good-sized chunk (32 bytes) for main on the stack.
The other interesting thing is that there is no “unpacking” of the values in memory. Instead, the function reaches in and accesses the values directly with a positive offset from the fp ([fp, #4] through [fp, #16]). Because f pushes only fp before copying sp into fp, main’s [sp] becomes f’s [fp, #4]. The compiler is fulfilling a contract that the last items in the caller’s frame will be the values we want access to.
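The offset arithmetic can be checked with a small simulation in C. This is a sketch under simplifying assumptions: the cpu struct, load_word, store_word, callee_f, and caller_main are invented names, memory is a 256-byte array, and main’s own push {fp, lr} is left out so that only the argument area is modeled.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 256u

/* Toy CPU: r0-r3, sp, fp, and a small byte-addressed memory of words. */
struct cpu {
    uint32_t r[4];
    uint32_t sp, fp;                 /* byte addresses into mem */
    uint32_t mem[MEM_BYTES / 4];
};

static void store_word(struct cpu *c, uint32_t addr, uint32_t v)
{
    assert(addr % 4 == 0 && addr < MEM_BYTES);
    c->mem[addr / 4] = v;
}

static uint32_t load_word(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < MEM_BYTES);
    return c->mem[addr / 4];
}

/* Mirrors f: push {fp}; add fp, sp, #0; ... ; pop {fp}. */
uint32_t callee_f(struct cpu *c)
{
    c->sp -= 4;
    store_word(c, c->sp, c->fp);     /* push {fp} */
    c->fp = c->sp;                   /* add fp, sp, #0 */

    uint32_t sum = c->r[0] + c->r[1] + c->r[2] + c->r[3];
    for (uint32_t off = 4; off <= 16; off += 4)
        sum += load_word(c, c->fp + off);  /* args 5-8 sit in the caller's frame */

    c->sp = c->fp;                   /* add sp, fp, #0 */
    c->fp = load_word(c, c->sp);     /* pop {fp} */
    c->sp += 4;
    return sum;
}

/* Mirrors main's call site for f(0, 1, 2, 3, 4, 5, 6, 7). */
uint32_t caller_main(struct cpu *c)
{
    c->sp = MEM_BYTES;               /* simplified: main's push {fp, lr} omitted */
    c->fp = MEM_BYTES;
    c->sp -= 32;                     /* sub sp, sp, #32 */
    for (uint32_t i = 0; i < 4; i++)
        store_word(c, c->sp + 4 * i, 4 + i);  /* 4..7 at [sp] .. [sp, #12] */
    for (uint32_t i = 0; i < 4; i++)
        c->r[i] = i;                 /* 0..3 in r0-r3 */
    return callee_f(c);
}
```

Running caller_main on a cpu struct returns 28, the sum of 0 through 7. The callee never copies the stack arguments; it reads them in place at fp + 4 through fp + 16.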
Mechanical level
vocabulary
Skills