Disclosed is a methodology for fast multiplexing of microprocessor results bus data using early instruction decode.
Fast Results Bus Data Multiplexing Due to Early Instruction Decode
Multiplexing data onto a microprocessor results bus is typically implemented as a 2-stage operation consisting of a set of AND functions followed by an OR function. Figure 1 shows the prior art implementation of a 3-way CMOS multiplexor that selects among the adder result, the rotator result, and the logical result, with an individual select for each input. As Figure 1 shows, the delay for the result busses is 2 gates: a Nand2 followed by a Nand3. This invention replaces the 2-stage multiplexor with a single Nand3 stage, reducing the delay of the result bus multiplexing.
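The prior art structure can be modeled for one bit of the bus as a minimal sketch. The function and signal names (nand, two_stage_mux, sel_adder, and so on) are hypothetical and chosen only for illustration; the model captures logic values, not gate delay.

```python
def nand(*inputs):
    """Return the NAND of any number of single-bit inputs (each 0 or 1)."""
    return 0 if all(inputs) else 1


def two_stage_mux(adder, rotator, logical, sel_adder, sel_rotator, sel_logical):
    """Bit-level model of a 2-stage NAND-NAND multiplexor (illustrative only).

    Stage 1: one Nand2 per source gates the result bit with its select.
    Stage 2: a Nand3 combines the three gated terms.
    NAND followed by NAND is logically AND followed by OR.
    """
    gated_adder = nand(adder, sel_adder)
    gated_rotator = nand(rotator, sel_rotator)
    gated_logical = nand(logical, sel_logical)
    return nand(gated_adder, gated_rotator, gated_logical)
```

With exactly one select active, the output equals the selected result bit without inversion, and the signal has passed through two gates on its way to the bus.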
To provide the multiplexor function in only a single level of Nand3 gate, we identify early in the cycle the active instruction and which result bus is desired. Once the active instruction is identified, we send a signal to the other two execution blocks that forces their outputs to 1. With 2 of the 3 inputs of the Nand3 forced to 1, the desired third input passes through the Nand3 with an inversion. By disabling all 3 inputs (forcing them to 1), the output of the Nand3 can be driven to 0. The forcing of outputs to 1 is done in a way that does not introduce extra delay in the result path. Thus the overall delay for the result bus multiplexor is only one Nand3 stage, resulting in a performance improvement compared to the typical 2-stage...
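The single-stage behavior described above can be sketched for one bit of the bus as follows. This is only a logic-level model under stated assumptions: the names (block_output, single_stage_mux, active) are hypothetical, and the forcing of an unselected block's output to 1 is represented here as a simple conditional rather than the circuit technique used inside the execution blocks.

```python
def nand3(a, b, c):
    """Return the NAND of three single-bit inputs (each 0 or 1)."""
    return 0 if (a and b and c) else 1


def block_output(result_bit, is_active):
    """Model an execution block's output (assumed behavior).

    The active block drives its real result; every other block is
    forced to 1 by the early-decode signal.
    """
    return result_bit if is_active else 1


def single_stage_mux(adder, rotator, logical, active):
    """One Nand3 stage; `active` is 'adder', 'rotator', 'logical', or None.

    The selected result appears inverted at the output. With no block
    active, all three inputs are 1 and the output is 0.
    """
    return nand3(
        block_output(adder, active == "adder"),
        block_output(rotator, active == "rotator"),
        block_output(logical, active == "logical"),
    )
```

For example, single_stage_mux(1, 0, 0, "adder") yields 0, the inverse of the adder bit, and any call with active set to None yields 0 because all three Nand3 inputs are forced to 1.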