

Lecture 19: assembly programs and toolchain

Goals


given the ARM reference I wrote, do you think you could, given time, write an assembler?

do you think you could write a disassembler?

(I won’t ask you to do either of these)

sure, it’s a straightforward, totally mechanizable process

and you’re practicing it on the assignments, so I’m not going to bother you with it anymore

let’s look at some very small programs written in ARM32 assembly


Program The First

e3a0200c    mov r2, #12
e3530000    cmp r3, #0
aa000001    bge #1
e2633000    rsb r3, r3, #0
e0822003    add r2, r2, r3

let’s take it instruction by instruction

and keep a record of what the registers contain

after "mov r2, #12", register r2 contains 12 (decimal)

after "cmp r3, #0", the CC regs reflect the comparison between the contents of r3 and 0

now the tricky part: the branch, which does different things based on what’s in the CC regs

if the CC regs say "greater than or equal to" (ie, the contents of r3 >= 0), we skip one instruction

(one way to think of positive offsets to branch instructions is the number of instructions to skip)

so if r3 >= 0, we skip the rsb instruction

(yes, that’s a new instruction, I’ll get to it in a second)

therefore, only when r3 < 0 do we execute the rsb instruction

rsb is pretty simple: it’s a sub with reversed operands

so "rsb r3, r3, #0" means "r3 <- 0 - r3"

and finally "add r2, r2, r3" adds the contents of r2 and r3 and puts the sum in r2


we’ve gotten our hands really dirty here

and sometimes when you do that, it’s tough to see the bigger picture

so let’s step back and rewrite this at a slightly higher level of abstraction (ie, worrying more about effects than about method)

put #12 into r2

compare r3 with 0

if the comparison shows that r3 < 0, we negate r3

then we add r2 and r3

in a nutshell, we’ve performed this operation: r2 <- 12 + absolute-value(r3)
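
to double-check that summary, here is a minimal C sketch that mirrors the five instructions one for one. the function name program_the_first and the idea of passing r3 in as a parameter are assumptions made for illustration (the assembly never sets r3, so something earlier must have put a value there)

```c
#include <stdint.h>

/* Hypothetical helper: runs "Program The First" on a given starting
 * value of r3 and returns the final contents of r2.
 * (The most negative 32-bit value is out of scope for this sketch.) */
int32_t program_the_first(int32_t r3)
{
    int32_t r2 = 12;     /* mov r2, #12                               */
    if (r3 < 0) {        /* cmp r3, #0 ; bge #1 skips the rsb if >= 0 */
        r3 = 0 - r3;     /* rsb r3, r3, #0                            */
    }
    r2 = r2 + r3;        /* add r2, r2, r3                            */
    return r2;
}
```

calling it with 5 and with -5 gives 17 both times, which is what 12 + absolute-value(r3) predicts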

being able to see some chunk of code, deduce its function, and describe it concisely at a higher level of abstraction is important


so we’re going to practice it

here’s a chunk of code

you’ve got 15 minutes, you are encouraged to talk amongst yourselves

e3a0000a    mov r0, #10
e3a01000    mov r1, #0
e3a02005    mov r2, #5
ea000002    b #2
e0811000    add r1, r1, r0
e2422001    sub r2, r2, #1
e3520000    cmp r2, #0
cafffffc    bgt #-4

the tricky part is the branch instruction: remember that it works relative to the PC + 4 (ie, the address of the subsequent instruction)
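
the offset arithmetic can be sketched in C. this assumes the convention used in these notes (the offset counts instructions, starting from the instruction after the branch) and that the low 24 bits of a branch word hold the signed offset; the function names branch_offset and branch_target are hypothetical

```c
#include <stdint.h>

/* Pull the signed 24-bit offset out of a branch instruction word. */
int32_t branch_offset(uint32_t word)
{
    int32_t offset = (int32_t)(word & 0x00ffffffu);  /* low 24 bits */
    if (offset & 0x00800000) {     /* bit 23 set: the offset is negative */
        offset -= 0x01000000;      /* subtract 2^24 to sign-extend       */
    }
    return offset;
}

/* Address the branch jumps to, under the convention in these notes
 * (an assumption of this sketch): start at the instruction after the
 * branch, then move 4 bytes per instruction of offset. */
uint32_t branch_target(uint32_t branch_addr, uint32_t word)
{
    return branch_addr + 4u + 4u * (uint32_t)branch_offset(word);
}
```

if the program above is loaded at address 0, the bgt sits at address 28, and branch_target(28, 0xcafffffc) comes out to 16, the address of "add r1, r1, r0"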


answer: calculates product of 5 and 10, puts result in r1

and it uses a loop to do it!

if we were to write Java- or C-like code, it might look like this:

r0 = 10;
r1 = 0;
r2 = 5;

while(r2 > 0) {
    r1 = r1 + r0;
    r2 = r2 - 1;    
}
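
to see why the two branches add up to a while loop, here is a rough C sketch that steps through the eight instructions with an explicit program counter. it is hand-translated, not a real ARM simulator: the pc is just an index from 0 to 7, a single flag stands in for the CC regs, and the name run_loop_program is made up for this example

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace helper: executes the eight instructions one at a
 * time and returns the final r1. If steps is non-NULL, it receives the
 * number of instructions executed. */
int32_t run_loop_program(int *steps)
{
    int32_t r0 = 0, r1 = 0, r2 = 0;
    int gt = 0;       /* stand-in for the CC regs: last cmp saw r2 > 0 */
    int pc = 0;       /* index of the instruction about to execute     */
    int count = 0;

    while (pc < 8) {
        int next = pc + 1;                           /* default: fall through */
        switch (pc) {
        case 0: r0 = 10;                     break;  /* mov r0, #10     */
        case 1: r1 = 0;                      break;  /* mov r1, #0      */
        case 2: r2 = 5;                      break;  /* mov r2, #5      */
        case 3: next = (pc + 1) + 2;         break;  /* b #2            */
        case 4: r1 = r1 + r0;                break;  /* add r1, r1, r0  */
        case 5: r2 = r2 - 1;                 break;  /* sub r2, r2, #1  */
        case 6: gt = (r2 > 0);               break;  /* cmp r2, #0      */
        case 7: if (gt) next = (pc + 1) - 4; break;  /* bgt #-4         */
        }
        pc = next;
        count++;
    }
    if (steps != NULL) {
        *steps = count;
    }
    return r1;
}
```

the unconditional "b #2" jumps straight to the cmp, so the test runs before the body ever does; that is what makes this a while loop and not a do-while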

how did I get all this assembly language and machine code?

certainly I didn’t compose it myself: I’m far too lazy for that

which means it’s time to review the chain of tools that get us from source code to machine code

these programs are collectively referred to as the "toolchain"

no, this is not a coincidence

in reviewing the toolchain, I’m going to start tying these vague vocabulary words to actual programs we will use for these purposes over the rest of the semester


compiler translates source code to assembly (we’ll be using gcc)

assembler translates assembly to machine code (gcc for this, too)

disassembler translates machine code to assembly (objdump)

you’ll note that these are inextricably tied to a particular ISA

recall, the ISA is (among other things) the set of instructions supported by a given CPU

since the set of instructions is particular to an ISA, the assembly language itself is going to be particular to an ISA

and therefore the compiler, assembler, and disassembler are also going to be particular to an ISA
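
as a rough sketch of how the three tools line up, here is a small C file with the commands in its header comment. the file name loop.c and the function repeat_add are made up for this example, and the commands assume builds of gcc and objdump that target ARM32 (on another machine the same commands produce that machine's assembly, which is the ISA point above)

```c
/* loop.c (hypothetical file name): the multiply-by-repeated-addition
 * loop from earlier in the lecture, written as a C function.
 *
 *   gcc -S loop.c        compiler:     C source     -> assembly (loop.s)
 *   gcc -c loop.s        assembler:    assembly     -> machine code (loop.o)
 *   objdump -d loop.o    disassembler: machine code -> assembly listing
 */
int repeat_add(int value, int count)
{
    int total = 0;
    while (count > 0) {
        total = total + value;   /* like add r1, r1, r0 */
        count = count - 1;       /* like sub r2, r2, #1 */
    }
    return total;
}
```

don't expect the compiler's output to match the hand-written listing instruction for instruction; it makes its own choices about registers and instruction order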


there are a few other programs that fall under the broad heading of toolchain

one is the program that lets you step through your code step by step, examining the state of the machine at each point

this is called a debugger and the one we’ll use is gdb

we may use valgrind later on when we see the heap

I’m hoping to show you what the linker does, but not for a few weeks


let’s talk C, because that’s what’s next

you guys are comfortable with the concepts of, eg, variables, conditionals, loops, classes, and functions in Java and/or Python

my goal is to simultaneously teach you C and show how higher-level programming languages translate to assembly code

so I’m going to write up very simple C programs that demonstrate these concepts

and show the assembly that is produced when we feed them to a compiler

when we get to writing larger programs in C, it’s on you to take the high-level notions you already have of variables, conditionals, loops, etc and translate those ideas to C
