Interrupt and syscall handling

Posted on Fri 19 June 2020 in MUPS16

My first attempt at implementing interrupt, syscall and exception handling in MUPS/16 currently has 4 types of traps:

  • interrupts (currently 4, with equal priority)
  • page faults
  • syscalls
  • privilege errors

Syscalls are handled slightly differently as they are completely synchronous, and so can be treated as just another instruction (R-type, opcode 0x30, subtype 1).

Interrupts can happen at any time, but are also a bit easier to handle as we can defer them until cycle 0 of the next instruction. Page faults and privilege errors have to interrupt the current instruction. In the case of privilege errors, we also have to make sure that no part of the instruction (after the first two cycles, which load the instruction from memory) has any effect.

Current implementation

The microcode address unit has logic that detects when any of the asynchronous cases has happened, and sets the microcode address for the next cycle to:

  • 0x600 for interrupts
  • 0x610 for page faults
  • 0x620 for privilege errors

Each of these addresses starts a series of up to 16 microcode steps that save some CPU state and transfer control to the OS.
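As a rough illustration of that selection logic, here is a small Python model. The three entry points come from the list above; everything else is an assumption made for the sketch: the function name, the at_instruction_boundary flag (standing in for "the next cycle is cycle 0 of a new instruction"), and in particular the order in which simultaneous conditions are checked, which the real address unit may resolve differently.

```python
# Microcode entry points listed in the post.
TRAP_INTERRUPT = 0x600
TRAP_PAGE_FAULT = 0x610
TRAP_PRIVILEGE = 0x620


def next_microcode_address(normal_next, at_instruction_boundary,
                           irq_pending=False, page_fault=False,
                           privilege_error=False):
    """Pick the microcode address for the next cycle (hypothetical model).

    normal_next is the address the sequencer would use if no trap occurred.
    """
    # Faults abort the current instruction, so they win immediately.
    # Checking privilege errors before page faults is an assumption.
    if privilege_error:
        return TRAP_PRIVILEGE
    if page_fault:
        return TRAP_PAGE_FAULT
    # Interrupts are deferred until the start of the next instruction.
    if irq_pending and at_instruction_boundary:
        return TRAP_INTERRUPT
    return normal_next
```

The entry points are 0x10 apart, which leaves room for 16 microcode steps per trap.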

Common steps

All four traps start with the following steps:

  1. save the current CPU flags to S1
  2. save PC to SPC, clear the user flag, block all interrupts
  3. map the address in TSP (the trap stack pointer) to a physical address
  4. write SP into the address we just mapped
  5. copy TSP into SP

At the end of this sequence SP now points to a separate stack from the one that was in use when the trap happened, and we've pushed the old SP to the top of the new stack.
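The five common steps can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not the microcode itself: the Cpu class, the translate helper (an identity mapping standing in for the MMU) and the separate user_mode and interrupts_blocked fields are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    pc: int = 0
    sp: int = 0
    tsp: int = 0          # trap stack pointer
    spc: int = 0          # saved PC
    s1: int = 0           # saved flags
    flags: int = 0
    user_mode: bool = True
    interrupts_blocked: bool = False
    memory: dict = field(default_factory=dict)  # physical address -> word


def translate(cpu, virtual_address):
    # Stand-in for the page table lookup; identity mapping is an assumption.
    return virtual_address


def trap_prologue(cpu):
    cpu.s1 = cpu.flags                    # 1. save flags to S1
    cpu.spc = cpu.pc                      # 2. save PC to SPC,
    cpu.user_mode = False                 #    clear the user flag,
    cpu.interrupts_blocked = True         #    block all interrupts
    physical = translate(cpu, cpu.tsp)    # 3. map TSP to a physical address
    cpu.memory[physical] = cpu.sp         # 4. write SP to that address
    cpu.sp = cpu.tsp                      # 5. copy TSP into SP


cpu = Cpu(pc=0x1234, sp=0x8000, tsp=0x0100, flags=0b101)
trap_prologue(cpu)
```

After the call, cpu.sp equals the old TSP and the word at that address holds the interrupted code's SP.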

Each of the cases has slightly different steps after this.

Syscall

  1. set PC to 0x2. All arguments are already loaded into registers by the user.

Interrupt

  1. set S2 to zero (code for interrupt)
  2. copy the hidden IRQ register (containing one bit per IRQ line, indicating which lines have signalled) to S3
  3. set PC to 0x2

Page fault

  1. set S2 to 4 (code for page fault)
  2. copy the virtual address that triggered the fault into S3
  3. set PC to 0x2

Privilege error

  1. set S2 to 2 (code for privilege error)
  2. set PC to 0x2
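The per-trap steps above boil down to a small table of register writes, sketched here in Python. The codes and the 0x2 vector are taken from the lists above; the function name, the string trap names and the dictionary representation are made up for the example.

```python
TRAP_VECTOR = 0x2

# Trap codes written to S2, as listed in the post.
CODE_INTERRUPT = 0
CODE_PRIVILEGE = 2
CODE_PAGE_FAULT = 4


def trap_specific_writes(kind, irq_lines=0, fault_address=0):
    """Return the register writes that follow the common steps (hypothetical model)."""
    if kind == "syscall":
        # Arguments are already in registers; S2 and S3 are not written.
        return {"pc": TRAP_VECTOR}
    if kind == "interrupt":
        return {"s2": CODE_INTERRUPT, "s3": irq_lines, "pc": TRAP_VECTOR}
    if kind == "page_fault":
        return {"s2": CODE_PAGE_FAULT, "s3": fault_address, "pc": TRAP_VECTOR}
    if kind == "privilege_error":
        return {"s2": CODE_PRIVILEGE, "pc": TRAP_VECTOR}
    raise ValueError(f"unknown trap kind: {kind}")
```

For example, trap_specific_writes("interrupt", irq_lines=0b0101) reports lines 0 and 2 in S3. Note that, as listed, the syscall path is the only one that leaves S2 untouched.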

Trap handlers

The intention was that this would be enough for the OS to be able to handle the trap:

; At address 0x2 we just have a jump to the trap handler
     j trap_handler
     ; ...

trap_handler:
     ; The CPU will have saved the user's SP to the trap stack, and set SP to the
     ; trap stack for us. We need to save any other registers that we might clobber
     ; (which, for now, is all of them, to be safe)
     addi sp, sp, -10
     sw   [8]sp, ra
     sw   [6]sp, r4
     sw   [4]sp, r3
     sw   [2]sp, r2
     sw   [0]sp, r1

     ; Get the code for the trap from S2
     mfs  r1, s2

This almost works. There are a few things that I'm not happy about, though, and which I suspect will be problems:

  1. the trap processing in the CPU executes with the user page table loaded. This means that the address that TSP points to must be mapped into the same location in every user process.
  2. similarly, the CPU will jump to address 0x2 in the current address space, so every process must have the trap handler mapped into its address space in page 0.
  3. it conflates the concepts of storage of process state and the stack in which handlers execute. This makes thinking about recursive calls much harder, especially since trap handlers don't necessarily behave like functions (they may well not return; for example, a page fault handler may decide to terminate the user process, in which case the end of the handler is a jump to a new part of the kernel)

This feels fragile. I have a suspicion that there are edge cases that will be tricky or impossible to efficiently support (what happens if an interrupt occurs while servicing a page fault? Or a page fault in a system call?).
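To make the nesting worry concrete, here is a toy Python walk-through of a page fault arriving inside a syscall handler. It assumes, purely for illustration, that the OS leaves TSP unchanged between the two traps and that addresses map one-to-one; the addresses and the enter_trap helper are invented for the example.

```python
def enter_trap(state, memory):
    """Model only the SPC save and steps 3-5 of the common sequence."""
    state["spc"] = state["pc"]
    memory[state["tsp"]] = state["sp"]   # push old SP at the top of the trap stack
    state["sp"] = state["tsp"]


memory = {}
state = {"pc": 0x4000, "sp": 0xF000, "tsp": 0x0200, "spc": 0}

# First trap: a syscall from user code.
enter_trap(state, memory)
saved_after_first = memory[0x0200]       # the user's SP

# Handler prologue from the listing above: addi sp, sp, -10
state["sp"] -= 10
state["pc"] = 0x0040                     # somewhere inside the handler (made up)

# Second trap: a page fault while the handler is running.
enter_trap(state, memory)
saved_after_second = memory[0x0200]      # now the handler's SP, not the user's
```

Under those assumptions the second trap overwrites both SPC and the saved user SP, and resets SP to TSP, so the nested handler's register saves would land on top of the first handler's.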

In addition, it requires us to have a lot of kernel memory mapped into each user process, and with a small 16-bit address space, every page counts. Syscalls have to execute in the same address space as the process that triggered them (arguments to the syscall may well be pointers into user memory, and while we could manually map these in software, that would be very slow and cumbersome). The async traps, though, could just as easily execute in kernel space (using page table zero), as they don't need to read from or write to user memory.

I think this needs a redesign before I commit anything to PCBs.