Lecture 17: simple assembly program
Goals
- interpret the meaning of a series of assembly instructions as a whole (ie, a program)
- perform bitwise operations
- explain Harvard and von Neumann style architectures and their tradeoffs
- describe the purpose of the program counter (instruction pointer)
loads and stores in x86
c7 45 cc 00 00 00 00 mov DWORD PTR [rbp-0x34],0x0 Meaning: Mem[rbp-0x34] <- 0
8b 45 cc mov eax,DWORD PTR [rbp-0x34] Meaning: eax <- Mem[rbp-0x34]
89 54 85 d0 mov DWORD PTR [rbp+rax*4-0x30],edx Meaning: Mem[rbp+rax*4-0x30] <- edx
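the load/store semantics above can be sketched as a tiny Python model (the register values and base address below are made up for illustration; registers and memory are just dicts):

```python
# toy model of the three x86 moves above
regs = {"rbp": 0x1000, "rax": 0, "eax": 0, "edx": 7}  # example starting values
mem = {}  # memory: address -> 32-bit value

# mov DWORD PTR [rbp-0x34], 0x0   ->  Mem[rbp-0x34] <- 0
mem[regs["rbp"] - 0x34] = 0

# mov eax, DWORD PTR [rbp-0x34]   ->  eax <- Mem[rbp-0x34]
regs["eax"] = mem[regs["rbp"] - 0x34]

# mov DWORD PTR [rbp+rax*4-0x30], edx  ->  Mem[rbp+rax*4-0x30] <- edx
mem[regs["rbp"] + regs["rax"] * 4 - 0x30] = regs["edx"]

print(regs["eax"], mem[regs["rbp"] - 0x30])  # 0 7
```

the point is just that a load copies memory into a register and a store copies a register into memory, with the address computed from register contents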
so we’ve got all these instructions now
we can perform operations like addition and subtraction on values in registers
and when the registers don’t give us enough working space, we can resort to main memory by using the load and store instructions
keeping in mind that we also know all the abstractions and hardware that actually make these things work
now we can start to write programs!
that is, sequences of instructions that solve larger problems
so what does a program written in assembly language look like?
mov r0, #180
mov r1, #42
add r2, r0, r1
eor r0, r0, r0
str r2, [r0]
here we’re adding two numbers together and storing their result at memory location 0
(no, we haven’t officially seen the eor instruction yet, but it has predictable effects—it’s also a common method to set a register to zero)
one thing you need to know how to do is interpret the behavior of an assembly program like the one above
by which I mean understand the individual steps it performs and the higher purpose it fulfills (the latter is abstraction!)
to the first question, this program is similar to Python or Java programs you’ve written in that it proceeds from one instruction to the next in sequence, performing the operation indicated at each step
one way to deduce a program’s behavior is to keep track of how its state changes throughout execution
therefore, I start with a diagram of a register file whose contents we do not yet know and then evaluate the instructions in turn
(I simulate this lack of knowledge by leaving entries in the register file blank—registers cannot be floating, so they will have a value, we just can’t know it, and therefore shouldn’t assume anything about it)
the first instruction is "mov r0, #180"
this is the "immediate" variant of the MOV instruction
meaning that it takes the value 180 and puts it into register r0
therefore after the first instruction executes, r0 contains 180 and the rest of the registers are still unknown
by the same token, after the second instruction executes, r1 contains 42
and after the third instruction executes, r2 contains 222, which is the sum of the contents of r0 and r1
a brief digression back to finite state machines
the contents of the registers at any given instant are the state
an instruction causes the transition from one state (ie, particular contents of registers) to another state (ie, a different set of contents in registers)
the fourth instruction is new and different in a couple ways: eor r0, r0, r0
we haven’t seen the eor instruction before, nor have we seen an instruction whose source and destination registers are all the same
furthermore, the value in r0 is 32 bits, and we’ve only seen what xor does in the context of single-bit values, so how does it extend to wider inputs and outputs?
the answer is that this is bitwise xor, meaning that the zeroth bit of the output is the result of xor’ing the zeroth bit of the first operand with the zeroth bit of the second operand
likewise for the other 31 bits
it just xor’s each corresponding pair of bits
so if I was performing bitwise xor on 10101 and 11011, I would get:
    10101
xor 11011
---------
    01110
because I apply the single-bit xor operation by columns
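you can check the column-by-column reasoning in Python, whose ^ operator is exactly this bitwise xor:

```python
a = 0b10101
b = 0b11011
result = a ^ b  # bitwise xor: each output bit xors the corresponding input bits
print(format(result, "05b"))  # 01110
```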
what, therefore, is the effect of "eor r0, r0, r0" ?
since the two operands are the same, every pair of bits xor’ed together will be the same
and the result of xor’ing a pair of identical bits is always zero
therefore the result of this operation will always be zero
this is a popular way of getting the value zero into a register
(there are other, perhaps more intuitive ways, but this one is sufficiently popular that you need to be able to recognize it)
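the zeroing trick holds for any value, since every bit gets xor’ed with an identical bit:

```python
# x ^ x is 0 for every x, which is why "eor r0, r0, r0" always zeroes r0
for x in (0, 1, 42, 180, 2**32 - 1):
    assert x ^ x == 0
print("all zero")
```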
so the upshot is that r0 now contains zero, r1 contains 42, and r2 contains 222
finally: "str r2, [r0]"
this says "take the contents of register r0, use those contents as a memory address, and store the contents of r2 at that address"
given the current state of the register file, this means "store 222 at address 0"
because r2 contains 222 and r0 contains 0
thus the ultimate effect of the program is to store 222 at address 0
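here’s a minimal Python simulation of the whole five-instruction program (register file and memory as dicts) confirming that end state:

```python
regs = {}  # register file: name -> value
mem = {}   # main memory: address -> value

regs["r0"] = 180                      # mov r0, #180
regs["r1"] = 42                       # mov r1, #42
regs["r2"] = regs["r0"] + regs["r1"]  # add r2, r0, r1
regs["r0"] = regs["r0"] ^ regs["r0"]  # eor r0, r0, r0
mem[regs["r0"]] = regs["r2"]          # str r2, [r0]

print(regs, mem)  # r0=0, r1=42, r2=222; mem[0] = 222
```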
but even this simple program raises some questions: where are the instructions themselves stored and how do we achieve the "execute them one by one" effect?
first: where are these instructions stored?
(because this will lead to answering the other question)
they don’t just appear out of the ether, to be executed by the hardware you’re building for this week’s assignment
so where do they live?
where might they live?
well we’ve already got all this memory stuff to store bits, what if instructions live there, too?
good idea, let’s do that
now another question: we’re already going to use memory to store data (ie, values we’re operating on), do we want to intermingle this data with instructions?
that is, do we use the same 32-bit address space to simultaneously store instructions and data?
there are two possibilities: "yes" and "no"
over the decades, different computers have picked different answers
and the respective approaches, unsurprisingly, have been given names
a "Harvard architecture" is one in which data and instructions are segregated, often physically
in contrast, a "Von Neumann architecture" has data and instructions inhabit the same address space
this means that in a Von Neumann architecture, you perform loads and stores against a single pool of memory
the bits you read might be an instruction or they might be data
(and recall that bits have no type! there is no way to look at bits in memory and know it is or is not an instruction, just like you can’t know whether bits represent an integer or a floating point number or a string of ASCII)
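a quick illustration of "bits have no type": the same 32 bits reinterpreted as an integer and as an IEEE 754 float (Python’s struct module does the reinterpretation; the bit pattern is an arbitrary example):

```python
import struct

bits = 0x42DE0000  # some 32 bits sitting in memory
as_int = bits
as_float = struct.unpack("<f", struct.pack("<I", bits))[0]
print(as_int, as_float)  # 1121845248 111.0
```

nothing about the bits themselves tells you which interpretation is "right"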
in a Harvard architecture, you have two separate pools: one exclusively for instructions and the other exclusively for data
there are tradeoffs
in a Harvard architecture, you need twice the hardware to perform memory operations
but in the Von Neumann architecture, one could imagine writing a program that overwrites its own instructions
(perhaps surprisingly, this may be simultaneously a bug and a feature)
there are other considerations that we’ll discover as we explore computer systems over the next few weeks
for now, though, know that most general-purpose computers are Von Neumann-ish
and many special-purpose computers are Harvard-ish
I say "ish" because, like most things in this course, there are exceptions
and, in fact, the organization of modern Intel processors (which I use as the standard example) is inspired by both; we’ll see this near the end of the semester
okay, on to the second question: how do we achieve the "execute one instruction after another" effect?
but first, given that instructions are stored in memory, how do we get the "execute just one instruction" effect?
first we have to grab the instruction from memory
when we get around to discussing the phases of instruction execution, we’ll call this step "instruction fetch"
it means we need to have the address of the instruction to execute
and in fact we save this address in a register
called the "program counter"
a register that stores the address of the instruction about to execute
sometimes also called the "instruction pointer" because it points to the instruction we want to fetch and execute (ie, indicates its location)
abbreviated PC or IP
interestingly, in ARM, the PC is just another register
you can actually use it as an operand to many instructions: it’s r15
given we’ve got this register that stores the address of the instruction to execute, what does the process of executing an instruction look like?
well, you’re building it (though you’re not dealing with the PC for this assignment: that’s for hw6)
going back to the code I showed at the beginning of class today, what should happen to the PC after every instruction executes?
that is, what needs to happen so that the PC contains the correct address?
increment it by the size of an instruction (4 bytes)
side-note: here the uniform instruction length of RISC architectures makes life easier: we always increment the PC by the same amount—this is not the case for CISC machines
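the fetch-then-advance behavior can be sketched as a loop (the "encodings" here are just the mnemonic strings and the addresses are illustrative, not real ARM):

```python
# instruction memory: address -> instruction, 4 bytes apart
imem = {0:  "mov r0, #180",
        4:  "mov r1, #42",
        8:  "add r2, r0, r1",
        12: "eor r0, r0, r0",
        16: "str r2, [r0]"}

pc = 0
trace = []
while pc in imem:
    instr = imem[pc]   # instruction fetch: read Mem[pc]
    trace.append(instr)
    pc += 4            # uniform RISC instruction size: always advance by 4
print(pc, len(trace))  # 20 5
```

each trip around the loop is one fetch; bumping the PC by the fixed instruction size is what produces sequential execution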
we’ve now answered the questions we posed ourselves: where are instructions stored and how do we get the sequential execution of instructions to happen?
instructions are stored in memory, sometimes in the same memory pool as data (Von Neumann) and sometimes not (Harvard)
and we keep the address of the next instruction in a register, which we call the Program Counter, so that we can fetch it and execute it
we’re that much closer to being able to write real programs
but real programs feature some operations that the instructions we’ve looked at just can’t do
loops, conditionals, function calls
collectively referred to as "control structures"
because they control which instructions get executed