The Obr compiler generates code to run on a simple RISC architecture that was originally described by Niclaus Wirth in his book "Theory and Techniques of Compiler Constuction" (which may be downloaded from http://www-old.oberon.ethz.ch/WirthPubl/CBEAll.pdf). This, in turn, was designed to be close in spirit the MIPS architecture which was originally designed by John L. Hennessey of Stamford University.
These notes provide a brief overview of RISC architecture sufficient for
understanding the code generated by the Obr compiler used in
COMP332 exercises and assignments. Further detailed information about the RISC
instruction set and machine execution may be gleaned by browsing the
files RISC.scala
and RISCISA.scala
in this code
bundle.
RISC is a 32-bit architecture in which all general purpose registers, memory addresses and machine instructions are 32 bits wide. In the subset provided by our emulator code RISC instructions support the direct manipulation of 32-bit integers stored in twos-complement form. Furthermore, it supports the direct addressing of a memory space of upto 2^32 bytes. While addresses in the RISC architecture refer to individual bytes, all 32-bit values can only be directly accessed if they are aligned on 4 byte boundaries.
$0 this is generally constrained to always contain the value 0.
$28 (PC) this is the program counter, which contains the address of the instruction that is currently being executed.
$29 (FP) this is the frame pointer, which points to the first byte after the top of the frame (or activation record) associated with the current procedure activation.
$30 (SP) this is the stack pointer, which points to the first byte of the frame associated with the current procedure activation.
$31 (LNK) this is the link pointer,
in which return addresses are stored during subroutine
calls (using the bsr
instruction).
RISC instructions themselves come in four different formats, each of which packs the instruction and information about its operands into a single 32-bit word. These formats are:
inst reg1, reg2, reg3
an instruction with
three operands all of which are registers.
inst reg1, reg2, im
an instruction with
three operands two of which are registers and one of which is an
immediate (fixed constant) 16-bit value (mostly arithmetic and
bitwise operations).
inst reg1, reg2, disp
similar to type 2
instructions, but generally used for branching, load/store and terminal
IO operations (the fixed constant here is again 16 bits and is called
disp
for displacement).
inst disp
an instruction with a single
integer operand which is an immediate (fixed constant) displacement
represented as a 26-bit value (generally used for branching instructions).
Load/store instructions are the only instructions that
access memory. The ldw
and stw
instructions load and store word/single quantities from/to integer
registers.
ldw reg1, reg2, disp ! Load word from address reg2 + disp into reg1. stw reg1, reg2, disp ! Store contents of reg1 to address reg2 + disp.The RISC architecture also provides byte load/store
ldb
and stb
and push/pop instructions push
and pop
(which maintain a stack discipline by appropriately
updating the address stored in reg2
). However, as things
stand the Obr compiler does not make use of these instructions.Here are some loading and storing examples:
ldw $2, $1, 0 ! Load from the address stored in $1 into register $2 ldw $2, $1, 8 ! Load the word from an 8 byte offset from the ! address stored in $1 into register $2. stw $31, $0, 32 ! Store the value held in the link pointer (which is ! probably a return address) in memory at absolute ! address 32 (because register $0 always contains 0).Most arithmetic operations come in type 0 (operating on three registers) and type 1 (operating on two registers and a constant) versions. In each case the first operand specifies the destination register in which a calculated result will be placed and the last two operands specify the sources of the data upon which the operation will act. Notice here that there is noting to prevent an instruction from writing its result back into one of the registers it is using as a source.
Here is a complete list of the 32-bit arithmetic and bitwise operations provided by our version of the RISC architecture:
add dreg, reg1, reg2 ! add values in registers reg1 and reg2 and place result in dreg addi dreg, reg1, im ! add value in register reg1 to constant value im and place result in dreg sub dreg, reg1, reg2 ! subtract value in register reg2 from that in reg1 and place result in dreg subi dreg, reg1, im ! subtract constant value im from value in register reg1 and place result in dreg mul dreg, reg1, reg2 ! multiply values in registers reg1 and reg2 and place result in dreg muli dreg, reg1, im ! multiply value in register reg1 to constant value im and place result in dreg div dreg, reg1, reg2 ! divide value in register reg1 by that in reg2 and place result in dreg divi dreg, reg1, im ! divide value in register reg1 by constant value im and place result in dreg mod dreg, reg1, reg2 ! take remainder of division of value in register reg1 by that in reg2 and place result in dreg modi dreg, reg1, im ! take remainder of division of value in register reg1 byconstant value im and place result in dreg and dreg, reg1, reg2 ! bitwise and values in registers reg1 and reg2 and place result in dreg andi dreg, reg1, im ! bitwise and value in register reg1 with constant value im and place result in dreg or dreg, reg1, reg2 ! bitwise or values in registers reg1 and reg2 and place result in dreg ori dreg, reg1, im ! bitwise or value in register reg1 with constant value im and place result in dreg xor dreg, reg1, reg2 ! bitwise exclusive or values in registers reg1 and reg2 and place result in dreg xori dreg, reg1, im ! bitwise exclusive or value in register reg1 with constant value im and place result in dregThis architecture also combines the common operations of bit shifting, negation and the copying of values from one register to another into a single family of "move" instructions:
mov dreg reg1 reg2 ! shift the value in reg2 the number of bits specified by the value ! in reg1 and put the result in dreg movi dreg reg1 im ! shift the constant value im the number of bits specified by the value ! in reg1 and put the result in dreg mvn dreg reg1 reg2 ! shift the value in reg2 the number of bits specified by the value ! in reg1, negate that shifted number and put the result in dreg mvni dreg reg1 im ! shift the constant value im the number of bits specified by the value ! in reg2, negate that shifted number and put the result in dregNote: in each of these instructions, if register
reg1
contains the positive number n then the bit shifter shifts the value in the other source operand n bits to the left. On the other hand, if register reg1
contains the negative number -n then the bit shifter shifts the value in the other source operand n bits to the right.In the Obr compiler we mainly use the move instructions to simply copy numbers from one register to another
mov $1, $0, $2 ! copies the value from register $2 into register $1 without shifting ! (because register $0 always contains 0)or to negate the value stored in a register:
mvn $12, $0, $12 ! negate the value stored in register $12However, we also combine it with the
or
instruction in order to load a 32-bit constant values into a register. We are required to do this because the 3rd (constant) operand of type 1 and type 2 instructions are only 16-bit numbers. So if we wish to load a 32-bit constant we must first split it into two 16-bit pieces and load these separately into the high (using the mov
instruction) and low (using the or
instruction) 16-bit segments of the destination register. For example, if we have a 32-bit constant int
we divide it into two 16-bit portions int_hi
and int_lo
and load it into register $10
using the following code:
movi $10, $0, 16 ! Load the constant 16 into register $10 - this constant ! specifies the number of bits to shift int_hi as it is loaded movi $10, $10, int_hi ! Shift int_hi 16 bits to the left and load the result into register $10. ! In other words, this loads int_hi into the high 16 bits of register $10. ori $10, $10, int_lo ! Bitwise or int_lo onto the low 16-bits of the contents of register $10.Control transfer instructions alter the program counter. They are used to implement source-level control statements and procedure calls. All branch instructions are type 3 instructions, taking a single 26-bit operand which is interpreted as a relative instruction offset. So, for example. the instruction
br -10
will cause execution to jump to the instruction
10 instructions prior to the current one. In RISC assembly code
places that we wish to jump to are marked by symbolic labels and
our "assembler" automatically turns branches to these labels into
appropriate offset values.
The br
(branch) instruction is an
unconditional transfer of control.
br label1 ! Transfer control to label label1.Subroutine / procedure calling is implemented
bsr
instruction,
which saves the value of the program counter + 1 into the link pointer
(register $31
) before the actual branch action is performed.
Typically, on entry to a procedure the contents of the link pointer is
written into the frame on the call stack of this new procedure activation.
On the other hand the ret
instruction takes a single register
operand and makes a jump to the address whose offset from the current
instruction is given by the content of that register. So we may effect a
procedure return by loading the stored return address from the frame of the
current procedure activation into the link pointer and executing the instruction
ret $31
.
As a special case, if the register referred to by a ret
instruction contains the value 0 then execution of that instruction will cause
the machine to halt completely. So the following instruction is used by the
Obr compiler to halt the execution of a program:
ret $0 ! halts the machine, since register $0 is guaranteed to contain 0Conditional transfers are implemented with the aid of a set of two condition codes called
Z
for zero and N
for
negative. These are set by the following instructions:
cmp reg1, reg2 ! Subtract the value of reg2 from that of reg1 ! if the result is 0 then set Z to 1 else set it to 0 ! if the result is negative then set N to 1 else set it to 0 cmp reg1, im ! Similar to cmp, but subtracts the constant value im from that in ! register reg1Once the condition codes have been set, branch instructions can be used to transfer control if particular condition code bits are set. These conditional branch instructions are:
beq label ! branch if Z = 1 bne label ! branch if Z = 0 blt label ! branch if N = 1 bge label ! branch if N = 0 ble label ! branch if N = 1 or Z = 1 bgt label ! branch if N = 0 and Z = 0
Terminal IO instructions are a slightly unusual feature of this architecture. Usually interations with hardware external to a processor would be achieved via some kind of memory mapped IO mechanism, where registers of the IO hardware would be mapped to memory addresses which may then be written to and read from by the processor. However, for simplicity's sake the RISC architecture provides some higher level instructions for writing and reading integers to and from the terminal:
rd reg ! read a signed 32-bit integer in decimal format from ! the terminal and store it in register reg. wrd reg ! write the contents of register reg to the terminal ! as a signed decimal number. wrh reg ! write the contents of register reg to the terminal ! as an (unsigned) hexadecimal number. wrl ! write a newline character to the terminal.
label2: add $1, $1, $1 ! double value held in register $1 cmpi $1, 256 ! compare value in register $1 with constant 256 ble label2 ! branch back if comparison showed that value in $1 is <= 256As an example, when the Obr compiler compiles the Obr program
PROGRAM X (x : INTEGER) : INTEGER; VAR i : INTEGER; sum : INTEGER; BEGIN sum := 0; FOR i := 1 TO 10 DO sum := sum + i; END RETURN sum; END X.it outputs the following RISC assembly code
! Prologue movi $27, $0, 0 ! StW(Local(0),Read()) rd $1 stw $1, $27, 0 ! StW(Local(8),IntDatum(0)) movi $1, $0, 0 stw $1, $27, 8 ! StW(Local(4),IntDatum(1)) movi $1, $0, 1 stw $1, $27, 4 ! StW(Local(20),IntDatum(10)) movi $2, $0, 10 stw $2, $27, 20 ! Bne(CmpgtW(LdW(Local(4)),LdW(Local(20))),Label(3)) ldw $1, $27, 4 ldw $2, $27, 20 cmp $1, $2 movi $1, $0, 1 bgt label6 movi $1, $0, 0 label6: cmpi $1, 0 bne label3 ! Jmp(Label(2)) br label2 ! LabelDef(Label(4)) label4: ! StW(Local(4),AddW(LdW(Local(4)),IntDatum(1))) ldw $1, $27, 4 movi $2, $0, 1 add $1, $1, $2 stw $1, $27, 4 ! LabelDef(Label(2)) label2: ! StW(Local(8),AddW(LdW(Local(8)),LdW(Local(4)))) ldw $1, $27, 8 ldw $2, $27, 4 add $1, $1, $2 stw $1, $27, 8 ! Bne(CmpltW(LdW(Local(4)),LdW(Local(20))),Label(4)) ldw $1, $27, 4 ldw $2, $27, 20 cmp $1, $2 movi $1, $0, 1 blt label7 movi $1, $0, 0 label7: cmpi $1, 0 bne label4 ! LabelDef(Label(3)) label3: ! Write(LdW(Local(8))) ldw $1, $27, 8 wrd $1 wrl ! Ret() br label5 ! Epilogue label5: ret $0