CS 202 Lecture 12 – assembly language

Lecture 12: assembly language

Goals

identify the components of a CPU instruction
define machine code, assembly language, assembler, and disassembly
compare and contrast DRAM and SRAM
describe why computers usually include both kinds of memory
enumerate some differences between RISC and CISC architectures

last time, we started to get a glimpse of how all the circuitry we’ve been working with can behave like a computer

we have some combinational logic to perform mathematical operations on data—we call this component the ALU

then we have another chunk of storage elements to hold the operands and results for our computations

the latter is a group of registers, which we call a register file

then, just as in the traffic light example, we can make it do what we want by picking appropriate values for the inputs to those two components

and combine the signals that choose the appropriate values into a single bundle of wires, whose value we call an instruction

so to execute (the equivalent of) the instruction x = y + z, we’d need to specify:

the registers where the two operands (y and z) reside;
the operation to perform (addition);
and which register to put the result in.

let’s assume we have sixteen registers at our disposal and that register 4 is set aside for x, register 2 is set aside for y, and register 5 is set aside for z

sixteen registers means we need 4 bits to identify a particular register (because 2^4 = 16)

let’s also assume that the magical code we feed to the ALU to get it to perform addition is 100 (binary)

thus to cause this code to execute, we’d need to supply the following instruction:

100 0100 0010 0101
 ^    ^    ^    ^
 |    |    |    +--- register allocated to z
 |    |    +-------- register allocated to y
 |    +------------- register allocated to x
 +------------------ addition

then we package up these 15 bits and we suddenly have a single instruction that will make our register-file/ALU combo dance in a manner analogous to the 12-bit "instructions" we fed to the traffic light last week
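
the packing step can be sketched in a few lines of Python; this is only an illustration of the toy 15-bit format above (not a real ISA), and the names encode_instruction and ADD_OPCODE are made up for this sketch

```python
ADD_OPCODE = 0b100  # the ALU code for addition assumed in this lecture

def encode_instruction(opcode, dest, src1, src2):
    """Pack a 3-bit opcode and three 4-bit register numbers into 15 bits."""
    if not 0 <= opcode < 8:
        raise ValueError("opcode must fit in 3 bits")
    for reg in (dest, src1, src2):
        if not 0 <= reg < 16:
            raise ValueError("register numbers must fit in 4 bits")
    # layout: [opcode:3][dest:4][src1:4][src2:4]
    return (opcode << 12) | (dest << 8) | (src1 << 4) | src2

# x = y + z, with x in register 4, y in register 2, z in register 5
word = encode_instruction(ADD_OPCODE, 4, 2, 5)
print(format(word, "015b"))  # 100010000100101
```

the printed bits are the same instruction as in the diagram, just without the spaces between fields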

thus a program will be a sequence of similar instructions that collectively implement the logic required

the 15-bit instruction above uses 4 bits each to represent three different registers (two sources and one destination)

the number of bits used to specify a register is a direct consequence of the design of the processor: we need enough bits to specify any register, so the number of registers in the processor affects the number of bits we need in the instruction
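
as a quick sanity check of that relationship, here is a small Python sketch; the function name register_field_bits is hypothetical, and it assumes registers are simply numbered 0 through n-1

```python
def register_field_bits(num_registers):
    """Smallest number of bits that can name every register 0..num_registers-1."""
    return (num_registers - 1).bit_length()

for n in (8, 16, 32):
    print(n, "registers ->", register_field_bits(n), "bits")
# 8 registers -> 3 bits
# 16 registers -> 4 bits
# 32 registers -> 5 bits
```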

recall that "number of registers" is one of the things defined by the ISA

consequently, the length and format of instructions is also defined by the ISA

as you might imagine, different people have different ideas about the details of a given ISA, ideas evolve over time, some are designed for specific purposes, etc

which results in different ISAs existing

not just that, but there are two primary "families" of ISAs and you need to know them

they are RISC (reduced instruction set computers) and CISC (complex instruction set computers)

the instruction set originated by Intel for its processors and since duplicated by other companies to produce compatible processors is (often) called "x86", and is a representative of the CISC family

the ARM company’s ISA (which is itself called "ARM", and is used in the majority of smartphones these days) is a representative of the RISC family

modern Apple computers also use chips based on the ARM ISA

on the topic of registers, one (general) hallmark of RISC ISAs is that they have many general-purpose registers

by "general-purpose" I mean that they don’t have prescribed uses: they can be used as source and destination for nearly any instruction

by contrast, CISC ISAs often have many special-purpose registers meaning that they can only be used for particular purposes

as an example, in x86, some instructions always put their result in the same place: a register called rax

for those instructions, rax acts as a special-purpose register that holds the result of the computation (indeed, some x86 instructions do not allow the programmer to specify the destination: it is implicitly hard-wired to be the rax register)

another difference is in the form of the instructions themselves

instructions for a given RISC architecture are all the same length: for example, in the 32-bit version of ARM (which is the specific ISA we will be looking at this semester) all instructions are 32 bits long

CISC architectures have variable-length instructions, meaning that you could see a 16-bit instruction followed by a 32-bit instruction, followed by a 48-bit instruction, all in the same program (even though they’re variable length, they are still a multiple of 8 bits, because these machines all use byte-addressable memory)
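
one practical consequence, sketched in Python: with fixed-length instructions you can compute where instruction number i starts, whereas with variable-length instructions you have to walk past every earlier instruction first; the function names and the example lengths here are assumptions made up for illustration

```python
def fixed_length_offsets(count, size=4):
    """Byte offset of each instruction when every instruction is `size` bytes."""
    return [i * size for i in range(count)]

def variable_length_offsets(lengths):
    """Byte offset of each instruction when lengths (in bytes) differ.

    Locating instruction i requires knowing the length of every earlier one.
    """
    offsets, position = [], 0
    for length in lengths:
        offsets.append(position)
        position += length
    return offsets

print(fixed_length_offsets(3))             # [0, 4, 8]
# a 16-bit, then a 32-bit, then a 48-bit instruction (2, 4, and 6 bytes)
print(variable_length_offsets([2, 4, 6]))  # [0, 2, 6]
```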

these are very broad-sweeping, general statements; there could very well be exceptions; you do need to know the high-level trends, though

you may have seen computer punchcards

this is how instructions (and data, for that matter) were originally communicated to a computer

each row represents a datum

a punched-out hole represents a 0, a non-hole represents a 1

(or vice versa, I don’t actually know, but you get the idea)

therefore, to run a program (ie, a series of instructions) you created a stack of punchcards containing the appropriate instructions

you took the stack down to the computer lab, stuffed it in the hopper, and a few hours later you picked up the stack of cards that encoded the result

this is kind of crazy, and not at all user-friendly—either to the person who wants to run a program or to a person who wants to write one

so, like good computer scientists are wont to do, they automated it

first, they created a human-understandable language they could use to describe the instructions

and then they wrote software to translate from the human-understandable language to the computer-understandable instruction

so this machine-code instruction

100 0100 0010 0101

will translate to/from this assembly language instruction:

ADD r4, r2, r5

vocab time

machine code: the raw series of bits that make up a machine-understandable instruction

assembly language: the human-readable language that describes machine instructions

assembler: the program that translates from assembly language into machine code

disassembly: the process of translating backwards, from machine code to assembly language
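
to make the last two terms concrete, here is a toy assembler and disassembler in Python for the 15-bit example format from earlier; this is a sketch, not how a real assembler is built, and only ADD is supported because its code (100) is the only one this lecture defines; the names assemble, disassemble, OPCODES, and MNEMONICS are hypothetical

```python
OPCODES = {"ADD": 0b100}  # only ADD's code is given in the lecture
MNEMONICS = {code: name for name, code in OPCODES.items()}

def assemble(line):
    """Translate text such as 'ADD r4, r2, r5' into a 15-bit machine-code word."""
    # commas are optional on input: treat them like spaces
    mnemonic, dest, src1, src2 = line.replace(",", " ").split()
    regs = [int(r.lstrip("r")) for r in (dest, src1, src2)]
    return (OPCODES[mnemonic] << 12) | (regs[0] << 8) | (regs[1] << 4) | regs[2]

def disassemble(word):
    """Translate a 15-bit machine-code word back into assembly text."""
    mnemonic = MNEMONICS[word >> 12]
    dest, src1, src2 = (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF
    return f"{mnemonic} r{dest}, r{src1}, r{src2}"

word = assemble("ADD r4, r2, r5")
print(format(word, "015b"))  # 100010000100101
print(disassemble(word))     # ADD r4, r2, r5
```

notice that assembling and then disassembling gets you back the instruction you started with: for this toy format the translation loses nothing in either direction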

caveat: like most things in the course, there are exceptions; for example, sometimes you’ll hear "assembler" used to refer to the human-readable language as opposed to the translation software

when you see these terms out in the real world, be aware of context and don’t be afraid to ask for clarification!

so what does assembly language look like?

it’s a translation to/from machine code instructions

recall that instructions in ARM32 are always exactly 32 bits long, so a single assembly instruction is going to translate to a single 32-bit ARM32 machine instruction (and vice versa)

here’s an example instruction written in assembly: ADD r0, r1, r2

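it starts with a short word that describes the operation being performed (eg, "ADD"); we call this the opcode or the mnemonic (strictly speaking, the mnemonic is the word and the opcode is the bit pattern it translates to, but the two terms are often used interchangeably)

then comes the destination, followed by the source operands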

so this assembly language instruction will translate to a machine code instruction that causes the contents of registers r1 and r2 to be added together and the result put in r0
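
the effect of that instruction can be modeled in Python; this is a toy model of a sixteen-entry register file, with made-up starting values and a hypothetical execute_add helper, and it assumes 32-bit registers whose results wrap around on overflow

```python
registers = [0] * 16   # toy register file: sixteen registers, all zero
registers[1] = 30      # arbitrary example values for r1 and r2
registers[2] = 12

def execute_add(regs, dest, src1, src2):
    """Model of ADD: regs[dest] = regs[src1] + regs[src2], wrapped to 32 bits."""
    regs[dest] = (regs[src1] + regs[src2]) & 0xFFFFFFFF

execute_add(registers, 0, 1, 2)   # ADD r0, r1, r2
print(registers[0])               # 42
```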

another caveat: this is one style of ordering operands; the other style puts the destination last (note that this doesn’t affect the machine code: the assembler just needs to make sure to translate the instruction correctly)

once again: when you get to the real world, be sure what you’re working with

we’ll see way more instructions in the coming weeks

these instructions operate on… operands

and the operands are, to the best of our knowledge at this point, stored in registers

if we only have, say, sixteen registers, though, this potentially limits the kinds of programs we can write

my guess is some programs you’ve written have required you to keep track of more than 16 things simultaneously

how can we accommodate that?

we can add more registers!

unfortunately, this isn’t as straightforward as we might like

the technology we’ve used to create registers (ie, flip-flops) has some downsides: despite being wicked-fast, it’s also expensive, requires a relatively large amount of energy, and takes up a relatively large amount of physical space to store a single bit

(less energy and smaller space lead to faster processors, which is why we want to avoid/minimize uses of energy and space if possible)

allow me to introduce you to another way to store a bit

a capacitor is another electrical component that stores a charge for a little while

capacitor demo

it can be manufactured much more densely and cheaply than flip-flops, though it isn’t quite as fast in giving up its data

additionally, the capacitor doesn’t hold its charge forever, so it must be refreshed from time to time
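
here is a deliberately crude Python model of why refreshing matters; the leak rate and threshold are made-up numbers chosen for illustration (real capacitors and real refresh intervals look nothing like this), and the class and function names are hypothetical

```python
class CapacitorCell:
    """Toy one-bit cell: charge leaks every tick and must be refreshed."""
    LEAK = 0.8        # fraction of charge kept per tick (made-up number)
    THRESHOLD = 0.5   # charge above this reads as a 1

    def __init__(self):
        self.charge = 0.0

    def write(self, bit):
        self.charge = 1.0 if bit else 0.0

    def tick(self):
        self.charge *= self.LEAK

    def read(self):
        return 1 if self.charge > self.THRESHOLD else 0

    def refresh(self):
        self.write(self.read())  # read the bit, then write it back at full charge

def bit_after(ticks, refresh_every=None):
    """Store a 1, let `ticks` time steps pass, and report what is read back."""
    cell = CapacitorCell()
    cell.write(1)
    for t in range(1, ticks + 1):
        cell.tick()
        if refresh_every and t % refresh_every == 0:
            cell.refresh()
    return cell.read()

print(bit_after(10))                   # 0: the stored 1 has leaked away
print(bit_after(10, refresh_every=2))  # 1: periodic refresh preserves it
```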

despite these drawbacks, memory built from capacitors is vital to modern computers

in fact, those strips of memory that plug into the motherboard use precisely this technology

(if you have no idea what a motherboard is and/or have never seen the inside of a computer, we will get there soon)

so what happens is that modern computer processors have a small register file built out of flip-flops that stores values for immediate use

and a larger collection of slower but more capacious "main memory" built out of capacitors that stores values for less-immediate use ("more capacious" meaning "has more capacity")

as you can imagine, this has tradeoffs, some of which we’ll discuss later in the semester

for now, however, one question is salient: how do we operate on data in main memory?

here we will see another factor that differentiates RISC and CISC processors

RISC processors typically don’t have arithmetic or logic instructions that can operate directly on values in memory

instead, they have explicit instructions that transfer data from memory into a register (called a "load" instruction) and others that transfer data the opposite direction: from a register into memory (a "store")

in contrast, CISC processors often have instructions that can directly use data in memory as a source or a destination
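
to see what the load/store style means in practice, here is a Python sketch of x = y + z when all three variables live in main memory; the addresses and values are made up, the helper names load, store, and add are hypothetical stand-ins for the corresponding kinds of instruction, and the register numbers reuse the earlier assignment (r4 for x, r2 for y, r5 for z)

```python
memory = {0x100: 7, 0x104: 35, 0x108: 0}  # made-up addresses holding y, z, x
registers = [0] * 16

def load(dest, address):      # memory -> register
    registers[dest] = memory[address]

def store(src, address):      # register -> memory
    memory[address] = registers[src]

def add(dest, src1, src2):    # operates on registers only
    registers[dest] = registers[src1] + registers[src2]

# x = y + z on a load/store machine takes four instructions
load(2, 0x100)    # r2 = y
load(5, 0x104)    # r5 = z
add(4, 2, 5)      # r4 = r2 + r5
store(4, 0x108)   # x = r4
print(memory[0x108])  # 42
```

a machine whose add instruction accepts memory operands could do the same work in fewer instructions, at the cost of each instruction being more complicated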

more vocab

the kind of memory we use for the register file, built out of flip-flops, we call SRAM (static random-access memory)

"static" because it maintains its value—ie, doesn’t need to be refreshed

the kind of memory we use for main memory, built out of capacitors, we call DRAM (dynamic random-access memory)

"dynamic" because it needs to be refreshed (I think; don’t quote me on this one—you don’t need to know why it’s called this, but you do need to know that it’s called this)

both are "random-access" meaning that we can arbitrarily read or write any item within the memory

one drawback these two types of memory share is that they both lose what they’re storing when they lose power

we refer to such memory as volatile

whereas nonvolatile memory remembers what it’s storing even when the power goes out

flash memory, for example, is nonvolatile

clearly, there is significant benefit to being able to remember stuff even when we lose power, but we’re going to ignore that minor issue until the last couple weeks of the semester