Redesigning the BZ/BNZ/SEQ/SNE instructions

Posted on Sun 22 December 2019 in MUPS16

There are four instructions that need to do some kind of equality comparisons of registers:

  • BZ (branch if input register is zero)
  • BNZ (branch if input register is not zero)`
  • SEQ (set output register to 1 if both input registers are identical)
  • SNE (set output register to 1 if both input registers are not identical)

Currently these all work by having a zero line output from the ALU that is 1 iff the result of any computation is all zeros. This flag is stored in the flags register in the control unit, and used with the regrdmask control line in the BZ/BNZ instructions to control whether the register unit's set line is enabled to overwrite PC with the newly-computed address:

    { "BZ", 20, {
        // Add zero to the test register (we just want the zero flag from the ALU)
        step(regouta=1, regoutidxa=IRFieldAlt, regoutb=1, regoutidxb=Zero, aluop=Add, aluout=0, regrd=0),
        step(regouta=1, regoutidxa=PC, immop=TypeIIExt, aluop=Add, aluout=1, regrd=1, regrdmask=SetIfZero, regrdidx=PC)
    }},
    { "BNZ", 21, {
        // OR zero with the test register. If any bits in the test register are non-zero, the result
        // will be non-zero
        step(regouta=1, regoutidxa=IRFieldAlt, regoutb=1, regoutidxb=Zero, aluop=Or, aluout=0, regrd=0),
        step(regouta=1, regoutidxa=PC, immop=TypeIIExt, aluop=Add, aluout=1, regrd=1, regrdmask=SetIfNonZero, regrdidx=PC)
    }},

The SEQ and SNE instructions are slightly more complicated, because we don't have a native equality comparison, only a check for zero. SEQ can just use constop=ALUZero, to output the saved zero flag to the B bus, but SNE has to somehow invert the zero flag. We can't just use a NOR operation, since we don't want 1111...1110 as the result, so we have to do a juggle to get 1 into the temporary register, then subtract the zero flag from that:

    { "SEQ", 27, 0, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0),
        step(regouta=1, regoutidxa=Zero, constop=ALUZero, aluop=Add, aluout=1, regrd=1, regrdidx=IRField)
    }},
    { "SNE", 27, 1, {
        // FIXME: this is not efficient :/
        step(regouta=1, regoutidxa=Zero, constop=C1, aluop=Add, regrd=1, regrdidx=Temp, aluout=1),
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0),
        step(regouta=1, regoutidxa=Temp, constop=ALUZero, aluop=Sub, aluout=1, regrd=1, regrdidx=IRField)
    }},

All of this works, but it's awkward, and (most importantly), the zero-detection circuit in the ALU uses quite a lot of ICs to test every bit for zero.

We could potentially clean this up a bit, and reduce the circuitry needed, by doing a couple of things: 1. remove the dedicated zero-test circuitry in the ALU 2. rejig the control unit to make it possible to output both a constant and a flag, so that we can replace the extra step in SNE with a simple XOR of 1 and the carry flag

2 requires quite a large refactoring, so I'll leave it for now, but 1 is quite simple. Instead of doing zero tests by having a dedicated zero line from the ALU, we could add an equality operation to the ALU. As per this comment from /u/marcelk72, we can do this with the existing circuitry (at the cost of complicating the operation decoding circuit a bit).

This relies on the following truth table:

Cin A B A ⊻ ~B A ⊻ ~B + Cin Cout
0 0 0 1 01 0
0 0 1 0 00 0
0 1 0 0 00 0
0 1 1 1 01 0
1 0 0 1 10 1
1 0 1 0 01 0
1 1 0 0 01 0
1 1 1 1 10 1

So, the carry out is 1 iff the carry in was 1 and the bits A and B were the same. Any other result gives a zero carry out. Moreover, if the carry in is zero then it will never give a carry out of 1, so if we chain a whole series of these together, passing the carry out of one bit into the carry in of another, then we will get a 1 out of the final carry out iff every single bit had a carry out, which indicates that the inputs were the same at every bit. Since chaining carries like this is just what an adder does, we can do an equality comparison by computing 0 + ~(A⊻B) and setting Cin to 1. Since we already have multiplexers in the input stages to the adder that can compute any logical combination of the input bits, this is easy.

Once we've got the carry flag, we could replace the zero flag in the control unit with a carry one (which we want to do anyway, to support the ADDC and SUBB instructions that are currently unimplemented). Then, we can change the regrdmask field from:

enum RegReadMask {
    NoMask=0,
    SetIfZero=1,
    SetIfNonZero=2,
    SetIfLT=3
};

enum ConstOp {
    HighZ=0,
    ALUZero=1,
    ALULT=2,
    CFF=3,
    ...
};

to

enum RegReadMask {
    NoMask=0,
    SetIfCarry=1,
    SetIfNoCarry=2
};

enum ConstOp {
    HighZ=0,
    Carry=1,
    ALULT=2,
    CFF=3,
    ...
};

and change the instructions definitions to:

    { "BZ", 20, {
        step(regouta=1, regoutidxa=IRFieldAlt, regoutb=1, regoutidxb=Zero, aluop=Eq, aluout=0),
        step(regouta=1, regoutidxa=PC, immop=TypeIIExt, aluop=Add, aluout=1, regrd=1, regrdmask=SetIfCarry, regrdidx=PC)
    }},
    { "BNZ", 21, {
        step(regouta=1, regoutidxa=IRFieldAlt, regoutb=1, regoutidxb=Zero, aluop=Eq, aluout=0),
        step(regouta=1, regoutidxa=PC, immop=TypeIIExt, aluop=Add, aluout=1, regrd=1, regrdmask=SetIfNoCarry, regrdidx=PC)
    }},
    { "SEQ", 27, 0, {
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Eq, aluout=0),
        step(regouta=1, regoutidxa=Zero, constop=Carry, aluop=Add, aluout=1, regrd=1, regrdidx=IRField)
    }},
    { "SNE", 27, 1, {
        step(regouta=1, regoutidxa=Zero, constop=C1, aluop=Add, regrd=1, regrdidx=Temp, aluout=1),
        step(regouta=1, regoutidxa=IRField, regoutb=1, regoutidxb=IRField, aluop=Sub, aluout=0),
        step(regouta=1, regoutidxa=Temp, constop=Carry, aluop=Sub, aluout=1, regrd=1, regrdidx=IRField)
    }},

Within the ALU, we need to add decoding logic to set the multiplexer control lines, too. One needs to output zero, the other A ⊻ ~B. The control lines for this second value are:

A B A ⊻ ~B
0 0 1
0 1 0
1 0 0
1 1 1

Since the B-side multiplexers generally already output zero unless an add or subtract is being done, it's probably easier to change the A-side one.