CS 202 Lecture 18 – control structures

Lecture 18: control structures

Goals

define control structure
describe the purpose and behavior of the branch instruction
describe the purpose and behavior of the compare instruction

in the previous lecture, we resolved two important issues

firstly, instructions are stored in memory

in a von Neumann-style computer (which describes most computers we directly interact with), instructions and data both reside in the same memory

the alternative, the Harvard-style architecture, has separate memories for data and instructions

we discussed trade-offs and I said that real computer processors these days often incorporate elements of both

then we talked about how to achieve sequential execution of a program written in assembly language

since instructions are stored in memory, I presented the idea of a register that contains the address of the currently-executing instruction

I called this the program counter (PC) or instruction pointer (IP)

we concluded that, to cause sequential execution, we just needed to increment the PC by 4 after an instruction finishes executing

(four because each instruction is 4 bytes long and we’re working with byte-addressable memory)

let’s revisit the example sequence of instructions from Friday, but I’m going to give them memory addresses (the left-most column)

96:  mov r0, #180
100: mov r1, #42
104: add r2, r1, r0
108: xor r0, r0, r0
112: str r2, [r0]

additionally, beyond the three registers explicitly mentioned in the instructions above, we’ll start keeping track of the program counter (PC)

it’ll have an initial value of 96, indicating that the instruction at that address (the first mov) is the instruction about to be executed

therefore, the sequence of events from Friday plays out like so:

read the program counter (96) and fetch the instruction at that address (mov r0, #180)
execute that instruction: r0 gets the value 180
increment the program counter by 4: PC now has the value 100
read the program counter (100) and fetch the instruction at that address (mov r1, #42)
execute that instruction: r1 gets the value 42
increment the program counter by 4: PC now has the value 104
read the program counter (104) and fetch the instruction at that address (add r2, r1, r0)
execute that instruction: r2 gets 222
increment the program counter by 4: PC now has the value 108
read the program counter (108) and fetch the instruction at that address (xor r0, r0, r0)
execute that instruction: r0 gets 0
increment the program counter by 4: PC now has the value 112
read the program counter (112) and fetch the instruction at that address (str r2, [r0])
execute that instruction: store the value 222 in main memory at address 0
increment the program counter by 4: PC now has the value 116

and now there are no more instructions, so we can imagine the program is done (this is an oversimplification, but will do for the purposes of this course)
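
the walk-through above can be sketched as a tiny simulator. this is only an illustration: the tuple encoding of instructions, the `memory`/`data` dictionaries, and the "stop when there is no instruction at the PC" rule are assumptions made for the sketch, not part of the real machine.

```python
# Toy fetch-execute loop for the five-instruction example.
# Instructions are stored as tuples keyed by address (a hypothetical
# stand-in for the real 4-byte binary encoding).
memory = {
    96:  ("mov", "r0", 180),
    100: ("mov", "r1", 42),
    104: ("add", "r2", "r1", "r0"),
    108: ("xor", "r0", "r0", "r0"),
    112: ("str", "r2", "r0"),
}
data = {}                              # stands in for main memory written by str
regs = {"r0": 0, "r1": 0, "r2": 0}
pc = 96                                # initial value of the program counter
trace = []                             # addresses of the instructions executed

while pc in memory:                    # stop once there is no instruction at PC
    instr = memory[pc]                 # fetch
    op = instr[0]
    if op == "mov":                    # execute
        regs[instr[1]] = instr[2]
    elif op == "add":
        regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
    elif op == "xor":
        regs[instr[1]] = regs[instr[2]] ^ regs[instr[3]]
    elif op == "str":
        data[regs[instr[2]]] = regs[instr[1]]
    trace.append(pc)
    pc += 4                            # PC <- PC + 4

print(trace)   # [96, 100, 104, 108, 112]
print(regs)    # {'r0': 0, 'r1': 42, 'r2': 222}
print(data)    # {0: 222}
print(pc)      # 116
```

note that the loop body always ends with `pc += 4`, which is exactly the behavior the rest of this lecture needs a way to override.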

this is great, but there are a huge number of programs we can’t write

we need some way to implement the equivalent of conditionals, loops, and functions—collectively referred to as "control structures"

because they alter the "flow of control"

by which I mean "which instruction executes next"

by default, the flow of control is such that instructions execute sequentially, by incrementing the program counter by 4

but conditionals, loops, and functions all require that the program counter behave differently

concretely, what we need is some way to counteract this inexorable march of PC <- PC + 4

that is, sometimes we want the PC to advance to the next instruction and sometimes we want it to take on a completely different value

let’s consider the simplest control structure you’ve seen: the "if" statement

here’s an example written in Java (though it also happens to be valid C):

if(x > y) {
    x = x - y;
}

y = x;

in general, it says: if some condition is true, execute some block of code

in assembly language terms, that "block of code" is just a bunch of instructions

the condition is a bit less straightforward

let’s make it stupid-simple and imagine the condition is a register: if this register contains zero, the condition is false; if the register contains non-zero, the condition is true

so we want something to go right before the block of assembly instructions

if the condition is true, what do we want? we want the PC to be incremented by 4 as normal

but if the condition is false, we want to set the PC such that the instruction following the block is executed next

this is potentially counter-intuitive

when the condition is false, we want something special to happen

only when the condition is true do we want the normal thing (i.e., PC <- PC + 4) to happen

unsurprisingly, there is an instruction to do just this

that instruction is called branch and it is given the mnemonic "b"

here is what it looks like

 3   2                   1                   0
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+---------------------------------------------------------------+
| cond  |1 0 1 0|                    offset                     |
+---------------------------------------------------------------+

first thing to note: 4-bit opcode!

second thing to note: we’re actually using the first four bits of the instruction!

third thing to note: that’s a biiiiiig offset!

the way it works is that the "cond" field specifies a particular condition

if that condition evaluates to false, PC <- PC + 4 as normal

if it evaluates to true, PC <- PC + 4 + sign-extend(offset) * 4

note that, because the offset is sign extended, we can both add to the PC and subtract from it

adding jumps to larger addresses (later instructions) and subtracting jumps to smaller addresses (earlier instructions—this is useful for loops!)
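
as a rough sketch of the PC update rule just described, the following computes the next PC for a branch. the function names `sign_extend` and `next_pc` are made up for this example, and the condition is passed in as a plain boolean rather than derived from the "cond" field.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1           # keep only the field's bits
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def next_pc(pc, offset_field, condition_true):
    """Apply the branch rule: PC + 4, plus 4 * offset only when the condition holds."""
    if not condition_true:
        return pc + 4
    return pc + 4 + sign_extend(offset_field, 24) * 4


print(next_pc(100, 1, False))          # 104: condition false, fall through
print(next_pc(100, 1, True))           # 108: skips one instruction
print(next_pc(100, 0xFFFFFD, True))    # 92: the 24-bit pattern for -3 jumps backward
```

the last call shows why sign extension matters: the same 24-bit field that encodes small forward jumps also encodes backward jumps, which is what a loop needs.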

here are the values for the "cond" field

0 0 0 0     equal
0 0 0 1     not-equal
1 0 1 0     greater-than-or-equal
1 0 1 1     less-than
1 1 0 0     greater-than
1 1 0 1     less-than-or-equal
1 1 1 0     always
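
to make the layout concrete, here is a small sketch that pulls the three fields out of a 32-bit branch instruction word, using the bit positions from the diagram and the table above. `decode_branch` and `COND_NAMES` are hypothetical names for this example, and the offset is returned as the raw 24-bit field (not yet sign-extended).

```python
# cond field values, copied from the table in these notes
COND_NAMES = {
    0b0000: "equal",
    0b0001: "not-equal",
    0b1010: "greater-than-or-equal",
    0b1011: "less-than",
    0b1100: "greater-than",
    0b1101: "less-than-or-equal",
    0b1110: "always",
}


def decode_branch(word):
    """Split a 32-bit word into (condition name, raw 24-bit offset)."""
    cond = (word >> 28) & 0xF          # bits 31..28
    opcode = (word >> 24) & 0xF        # bits 27..24
    offset = word & 0xFFFFFF           # bits 23..0
    if opcode != 0b1010:
        raise ValueError("not a branch instruction")
    return COND_NAMES.get(cond, "unknown"), offset


# build the word for "ble #1" by hand: cond=1101, opcode=1010, offset=1
word = (0b1101 << 28) | (0b1010 << 24) | 1
print(hex(word))                       # 0xda000001
print(decode_branch(word))             # ('less-than-or-equal', 1)
```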

but how does this even work?

"equal" is all well and good, but we need two things to compare to see if they’re equal!

there’s no room in the branch instruction to specify these things!

enter the CMP instruction

unsurprisingly, there’s both an immediate version and a register version

here’s the immediate version

 3   2                   1                   0
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+---------------------------------------------------------------+
|       |0 0 1 1 0|     |  Rn   |       |         imm12         |
+---------------------------------------------------------------+

so "cmp r2, #42" will compare the contents of register r2 with the sign-extended immediate value held in imm12 (here, 42) and set the condition code registers accordingly

the register version looks predictably similar

 3   2                   1                   0
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+---------------------------------------------------------------+
|       |0 0 1 1 0|     |  Rn   |                       |  Rm   |
+---------------------------------------------------------------+

so "cmp r2, r1" will compare the contents of register r2 with the contents of register r1 and set the condition code registers accordingly

(there are versions with the shift as we saw in the load/store instructions, but those are an unnecessary complication right now)

revisiting the Java/C example from earlier, we could imagine the following sequence of assembly instructions:

(assume that r0 holds the value of x and r1 holds the value of y)

if(x > y) {                     cmp r0, r1
                                ble #1
    x = x - y;                  sub r0, r0, r1
}
y = x;                          mov r1, r0

note that the if statement is actually represented by two instructions, whereas the assignments are each represented by a single instruction
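
to check that the four instructions really behave like the "if" statement, here is a toy simulation. it is a sketch under loud assumptions: instructions are tuples placed at made-up addresses starting at 0, and cmp simply remembers its two operands so that ble can compare them directly. real hardware sets condition code registers instead; that mechanism is deliberately not modeled here.

```python
import operator

# hypothetical shortcut: each branch mnemonic maps to a Python comparison
# applied to the operands remembered by the most recent cmp
CONDITIONS = {"ble": operator.le, "b": lambda a, b: True}


def run(program, regs):
    """Execute `program` (address -> instruction tuple) and return the registers."""
    pc = 0
    last_cmp = (0, 0)
    while pc in program:
        op, *args = program[pc]
        next_pc = pc + 4                           # default: sequential execution
        if op == "cmp":
            last_cmp = (regs[args[0]], regs[args[1]])
        elif op in CONDITIONS:
            if CONDITIONS[op](*last_cmp):          # branch taken
                next_pc = pc + 4 + args[0] * 4
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "mov":
            regs[args[0]] = regs[args[1]]
        else:
            raise ValueError(f"unknown instruction {op}")
        pc = next_pc
    return regs


IF_PROGRAM = {
    0:  ("cmp", "r0", "r1"),
    4:  ("ble", 1),                 # skip the sub when x <= y
    8:  ("sub", "r0", "r0", "r1"),
    12: ("mov", "r1", "r0"),
}

print(run(IF_PROGRAM, {"r0": 10, "r1": 3}))   # {'r0': 7, 'r1': 7}
print(run(IF_PROGRAM, {"r0": 3, "r1": 10}))   # {'r0': 3, 'r1': 3}
```

in the first run x > y, so the branch is not taken and the sub executes; in the second run the branch is taken and the sub is skipped, matching the Java/C code.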

why branch "always" ?

consider this modification to the previous Java/C code:

if(x > y) {
    x = x - y;
} else {
    x = x + y;
}
y = x;

suppose the "if" condition evaluates to true, in which case the body is executed ("x = x - y")

what, then, do we want to happen when we finish executing the assembly instructions that implement "x = x - y" ?

we need to jump over the instructions that implement the "else" body ("x = x + y")

thus, if we reach that point, we must always branch

once again, the Java/C code from above could be written using assembly like the following:

if(x > y) {                     cmp r0, r1
                                ble #2
    x = x - y;                  sub r0, r0, r1
                                b #1
} else {
    x = x + y;                  add r0, r0, r1
}
y = x;                          mov r1, r0

note that the end of the "if" block is marked by an unconditional jump over the "else" block

also note that the offset to the ble instruction is now 2 because we need to jump over both the sub and the b instructions to get to the "else" case
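
the two offsets can be double-checked by inverting the branch rule PC <- PC + 4 + offset * 4. the sketch below assumes, purely for illustration, that the six instructions sit at addresses 96 through 116; `branch_offset` is a made-up helper name.

```python
def branch_offset(branch_addr, target_addr):
    """Offset to encode so a taken branch at branch_addr lands on target_addr."""
    delta = target_addr - (branch_addr + 4)    # distance from the fall-through address
    if delta % 4 != 0:
        raise ValueError("target is not instruction-aligned")
    return delta // 4


# assumed layout: cmp=96, ble=100, sub=104, b=108, add=112, mov=116
print(branch_offset(100, 112))   # 2: ble jumps over sub and b to reach add
print(branch_offset(108, 116))   # 1: b jumps over add to reach mov
print(branch_offset(112, 96))    # -5: a backward target gives a negative offset
```

the third call is not part of the if/else example; it only shows that the same arithmetic yields the negative offsets a loop would use.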

now the question is: our condition code registers are N, Z, and P

but the branch conditions are things like "equal" and "greater-than" and so on

how do we implement the latter with the former?

that is indeed a good question, which you will be figuring out in a future assignment

Definitions

The following definitions introduced in this lecture are fair game for future quizzes. You will be expected to give the exact definition as provided in these lecture notes.

control structure