Rethinking control

Posted on Sat 26 September 2020 in MUPS16

Several of the instructions in the current design need a bit of a rethink. In particular, I've been thinking about how this CPU could eventually be pipelined and, as part of that, how to simplify the control unit. I've probably made too much use of the flexibility that microcoding gives, to the point where instructions take anywhere from three to seven cycles. With some care, I think I can get this down to four cycles for everything except memory accesses, which would need five. More importantly, I think I can get every instruction to use each part of the CPU only once, which is important for pipelining.

The first couple of instructions to look at are LUI and the conditional set instructions.

LUI

This instruction is a little more complex than most, as it needs to shift the immediate left, then set it into the top byte of the destination, leaving the low byte unchanged. Currently, it does this in three steps, using the Temp register and an FF constant:

    Instruction("LUI", 15, {
        step(regouta=1, regoutidxa=IRFieldAlt, constop=CFF, aluop=And, aluout=1, regrd=1, regrdidx=IRField),
        step(regouta=1, regoutidxa=Zero, immop=TypeIIUpper, aluop=Add, aluout=1, regrd=1, regrdidx=Temp),
        step(regouta=1, regoutidxa=IRFieldAlt, regoutb=1, regoutidxb=Temp, aluop=Or, aluout=1, regrd=1, regrdidx=IRField)
    }),

This works, but it uses the ALU three times, and (more importantly, right now) it's the only use of the CFF constant op.
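The data flow of those three steps can be modelled in a few lines of Python. This is only a sketch of the behaviour, not the real microcode: it assumes 16-bit registers, an 8-bit immediate, that the CFF constant is 0x00FF, and that the TypeIIUpper immediate op presents the immediate already shifted into the top byte. The function name is made up for illustration.

```python
MASK16 = 0xFFFF  # assumed register width: 16 bits


def lui_three_step(dest: int, imm8: int) -> int:
    """Hypothetical model of the current three-step LUI."""
    # Step 1: dest = dest AND CFF (assumed 0x00FF): clear the top byte, keep the low byte.
    dest = dest & 0x00FF
    # Step 2: Temp = Zero + immediate, with the immediate assumed to arrive shifted left by 8.
    temp = (0 + ((imm8 & 0xFF) << 8)) & MASK16
    # Step 3: dest = dest OR Temp.
    return (dest | temp) & MASK16


print(hex(lui_three_step(0x1234, 0xAB)))  # 0xab34
```

Each of the three steps is a separate pass through the ALU, which is the cost the rest of this section is about.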

A fairly simple extra bit of circuitry in the control unit should be able to reduce this to a single cycle that doesn't use the ALU at all. Roughly, something like this:

[Figure: lui]

This adds one control line (the lui flag) and two 8-bit buffers to control output to the data bus, which feels acceptable. The microcode would then just be:

    Instruction("LUI", 15, {
        step(regouta=1, regoutidxa=IRFieldAlt, lui=1, regrd=1, regrdidx=IRField)
    }),
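As a rough check on the idea, here is a small Python sketch of what the data bus would carry while the lui flag is high. It assumes one buffer drives the top byte from the immediate and the other drives the low byte from the register read out on the A bus; that split is my reading of the description above, and the function name is hypothetical.

```python
def lui_data_bus(a_bus: int, imm8: int) -> int:
    """Hypothetical model of the 16-bit data bus when the lui flag is set."""
    high = (imm8 & 0xFF) << 8  # buffer 1 (assumed): immediate drives bits 15..8
    low = a_bus & 0x00FF       # buffer 2 (assumed): register low byte drives bits 7..0
    return high | low


print(hex(lui_data_bus(0x1234, 0xAB)))  # 0xab34
```

Under those assumptions the result is the same "clear the top byte, then OR in the shifted immediate" value as before, just without going through the ALU.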

SEQ/SNE/SLT/SLTU

These four instructions are currently all pretty similar: they use the ALU to perform a comparison, and they latch the flags into an internal register in the control unit. These flags are then output to the B bus via the constant unit on the next cycle, and sent via the ALU to the destination register:

    Instruction("SEQ", 27, 0, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Eq, aluout=0, setaluflags=1),
        step(regouta=1, regoutidxa=Zero, constop=Carry, aluop=Add, aluout=1, regrd=1, regrdidx=IRField)
    }),
    Instruction("SNE", 27, 1, {
        step(regouta=1, regoutidxa=Zero, constop=C1, aluop=Add, regrd=1, regrdidx=Temp, aluout=1),
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Eq, aluout=0, setaluflags=1),
        step(regouta=1, regoutidxa=Temp, constop=Carry, aluop=Sub, aluout=1, regrd=1, regrdidx=IRField)
    }),
    Instruction("SLT", 27, 2, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0, setaluflags=1),
        step(regouta=1, regoutidxa=Zero, constop=ALULT, aluop=Add, aluout=1, regrd=1, regrdidx=IRField)
    }),
    Instruction("SLTU", 27, 3, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0, setaluflags=1, unsigned_=1),
        step(regouta=1, regoutidxa=Zero, constop=ALULT, aluop=Add, aluout=1, regrd=1, regrdidx=IRField)
    }),

SNE is slightly more complicated than the others, as it needs to invert the carry output from the ALU (see the ALU docs for details of the Eq operation). It does this by loading 1 into Temp and then subtracting the carry from it.

The problems here are similar to those with LUI above: these instructions use the ALU either two or three times each, and SNE is the last remaining use of the C1 constant op.
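To make the SEQ and SNE flows concrete, here is a hedged Python model of the current microcode. It assumes the Eq operation sets the carry flag to 1 when the operands are equal, which is inferred from SEQ writing Zero + Carry to the destination; the function names are invented for the sketch.

```python
MASK16 = 0xFFFF  # assumed register width: 16 bits


def alu_eq_carry(a: int, b: int) -> int:
    """Assumed behaviour of the Eq op: carry is 1 when the operands are equal."""
    return 1 if (a & MASK16) == (b & MASK16) else 0


def seq_two_step(a: int, b: int) -> int:
    carry = alu_eq_carry(a, b)   # step 1: compare and latch the flags
    return (0 + carry) & MASK16  # step 2: dest = Zero + Carry


def sne_three_step(a: int, b: int) -> int:
    temp = 0 + 1                    # step 1: Temp = Zero + C1
    carry = alu_eq_carry(a, b)      # step 2: compare and latch the flags
    return (temp - carry) & MASK16  # step 3: dest = Temp - Carry, i.e. NOT carry


print(seq_two_step(7, 7), sne_three_step(7, 7))  # 1 0
print(seq_two_step(7, 8), sne_three_step(7, 8))  # 0 1
```

The extra step in SNE exists only to compute 1 - carry, which is why inverting the flag in hardware removes it.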

A simple extra bit of circuitry should be able to eliminate this, and get all four down to a single cycle:

[Figure: comparisons]

    Instruction("SEQ", 27, 0, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Eq, aluout=0,
             aluflagop=Carry, regrd=1, regrdidx=IRField)
    }),
    Instruction("SNE", 27, 1, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Eq, aluout=0,
             aluflagop=NotCarry, regrd=1, regrdidx=IRField)
    }),
    Instruction("SLT", 27, 2, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0,
             aluflagop=LessThan, regrd=1, regrdidx=IRField)
    }),
    Instruction("SLTU", 27, 3, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0,
             aluflagop=LessThan, regrd=1, regrdidx=IRField, unsigned_=1)
    }),

This is faster, simpler, and significantly reduces the amount of circuitry: not only is the flag-specific circuitry here simpler, but the entire constant unit can be deleted, which is a big saving.
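A minimal Python sketch of the flag selection, assuming the new circuitry simply picks one flag (or its inverse) and drives it onto the data bus as 0 or 1 in the same cycle. How the real ALU derives its less-than flag is not modelled here; the sketch compares the operands directly as signed or unsigned 16-bit values, and all function names are hypothetical.

```python
MASK16 = 0xFFFF  # assumed register width: 16 bits


def to_signed16(x: int) -> int:
    """Interpret a 16-bit pattern as a two's-complement signed value."""
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x


def flag_value(aluflagop: str, carry: int, less_than: int) -> int:
    """Assumed selector: the value written to the destination register."""
    if aluflagop == "Carry":
        return carry
    if aluflagop == "NotCarry":
        return carry ^ 1
    if aluflagop == "LessThan":
        return less_than
    raise ValueError(f"unknown aluflagop: {aluflagop}")


def seq(a: int, b: int) -> int:
    return flag_value("Carry", int((a & MASK16) == (b & MASK16)), 0)


def sne(a: int, b: int) -> int:
    return flag_value("NotCarry", int((a & MASK16) == (b & MASK16)), 0)


def slt(a: int, b: int) -> int:
    return flag_value("LessThan", 0, int(to_signed16(a) < to_signed16(b)))


def sltu(a: int, b: int) -> int:
    return flag_value("LessThan", 0, int((a & MASK16) < (b & MASK16)))


print(seq(5, 5), sne(5, 5))                      # 1 0
print(slt(0xFFFF, 0x0001), sltu(0xFFFF, 0x0001))  # 1 0
```

The last line shows why SLT and SLTU need the unsigned_ flag: 0xFFFF is -1 when signed, so it is less than 1, but it is the largest value when unsigned.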