MIPS architecture

Not to be confused with Millions of instructions per second.
Designer: MIPS Technologies, Imagination Technologies
Bits: 64-bit (32 → 64)
Introduced: 1985
Version: MIPS32/64 Release 6 (2014)
Design: RISC
Type: Register-Register
Encoding: Fixed
Branching: Compare and branch
Endianness: Bi
Page size: 4 KB
Extensions: MDMX, MIPS-3D
Open: No
Registers, general purpose: 32
Registers, floating point: 32

MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA)[1]:A-1[2]:19 developed by MIPS Technologies (formerly MIPS Computer Systems). The early MIPS architectures were 32-bit, with 64-bit versions added later. There are multiple versions of MIPS, including MIPS I, II, III, IV, and V, as well as five releases of MIPS32/64 (for 32- and 64-bit implementations, respectively). As of April 2017, the current version is MIPS32/64 Release 6.[3][4] MIPS32/64 primarily differs from MIPS I–V by defining the privileged kernel mode System Control Coprocessor in addition to the user mode architecture.

Several optional extensions are also available, including MIPS-3D which is a simple set of floating-point SIMD instructions dedicated to common 3D tasks,[5] MDMX (MaDMaX) which is a more extensive integer SIMD instruction set using the 64-bit floating-point registers, MIPS16e which adds compression to the instruction stream to make programs take up less room,[6] and MIPS MT, which adds multithreading capability.[7]

Computer architecture courses in universities and technical schools often study the MIPS architecture.[8] The architecture greatly influenced later RISC architectures such as Alpha.

As of April 2017, MIPS processors are used in embedded systems such as residential gateways and routers. Originally, MIPS was designed for general-purpose computing, and during the 1980s and 1990s, MIPS processors for personal, workstation, and server computers were used by many companies such as Digital Equipment Corporation, MIPS Computer Systems, NEC, Pyramid Technology, SiCortex, Siemens Nixdorf, Silicon Graphics, and Tandem Computers. Historically, video game consoles such as the Nintendo 64, Sony PlayStation, PlayStation 2 and PlayStation Portable used MIPS processors. MIPS processors also used to be popular in supercomputers during the 1990s, but all such systems have dropped off the TOP500 list. These uses were complemented by embedded applications at first, but during the 1990s, MIPS became a major presence in the embedded processor market, and by the 2000s, most MIPS processors were for these applications. In the mid- to late-1990s, it was estimated that one in three RISC microprocessors produced was a MIPS processor.[9]

MIPS is a modular architecture supporting up to four coprocessors (CP0/1/2/3). In MIPS terminology, CP0 is the System Control Coprocessor (an essential part of the processor that was implementation-defined in MIPS I–V), CP1 is an optional floating-point unit (FPU) and CP2/3 are optional implementation-defined coprocessors (MIPS III removed CP3 and reused its opcodes for other purposes). For example, in the PlayStation game console, CP2 is the Geometry Transformation Engine (GTE), which is used for accelerating the geometry in 3D computer graphics.

MIPS I

The first version of the MIPS architecture was designed by MIPS Computer Systems for its R2000 microprocessor, the first MIPS implementation. Both MIPS and the R2000 were introduced together in 1985.[citation needed] When MIPS II was introduced, MIPS was renamed MIPS I to distinguish it from the new version.

MIPS is a load-store architecture (also known as a register-register architecture); except for the load/store instructions used to access memory, all instructions operate on the registers.


MIPS I has thirty-two 32-bit general-purpose registers. Register $0 is hardwired to zero and writes to it are discarded. Register $31 is the link register. For integer multiplication and division instructions, which run asynchronously from other instructions, a pair of 32-bit registers, hi and lo, is provided. There is a small set of instructions for copying data between the general-purpose registers and the hi/lo registers.
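
The behaviour of register $0 can be illustrated with a small model. The following Python sketch is only an illustration; the class name RegisterFile and its methods are hypothetical helpers, not part of any MIPS tool.

```python
class RegisterFile:
    """Toy model of the 32 MIPS I general-purpose registers (hypothetical helper)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, n):
        return self.regs[n]

    def write(self, n, value):
        # Writes to $0 are discarded; other registers keep the low 32 bits.
        if n != 0:
            self.regs[n] = value & 0xFFFFFFFF


rf = RegisterFile()
rf.write(0, 123)             # discarded, $0 stays zero
rf.write(31, 0x1_0000_0004)  # truncated to 32 bits
print(rf.read(0), hex(rf.read(31)))
```

In this model, reading $0 always yields zero, which is why it can serve as a constant source operand.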

The program counter has 32 bits. The two low-order bits always contain zero since MIPS I instructions are 32 bits long and are aligned to their natural word boundaries.

Instruction formats

Instructions are divided into three types: R, I and J. Every instruction starts with a 6-bit opcode. In addition to the opcode, R-type instructions specify three registers, a shift amount field, and a function field; I-type instructions specify two registers and a 16-bit immediate value; J-type instructions follow the opcode with a 26-bit jump target.[1]:A-174

The following are the three formats used for the core instruction set:

Type | Format (bit 31 on the left, bit 0 on the right)
R | opcode (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
I | opcode (6) | rs (5) | rt (5) | immediate (16)
J | opcode (6) | address (26)
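
The field layout can be made concrete by packing the fields into a 32-bit word. The following Python sketch assumes the field widths listed above; the function names are hypothetical and chosen only for this example.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    # R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, imm):
    # I-type: opcode(6) rs(5) rt(5) immediate(16); the immediate is masked to 16 bits
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(opcode, target):
    # J-type: opcode(6) address(26)
    return (opcode << 26) | (target & 0x03FFFFFF)


def decode_fields(word):
    # Splits a word into every field; which fields are meaningful depends on the format.
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "immediate": word & 0xFFFF,
        "address": word & 0x03FFFFFF,
    }


print(hex(encode_r(0, 1, 2, 3, 0, 0x20)))  # add $3,$1,$2
print(hex(encode_i(0x8, 1, 2, -1)))        # addi $2,$1,-1
print(hex(encode_j(0x2, 0x100)))           # j with 26-bit target 0x100
```

Decoding simply reverses the shifts and masks, so decode_fields(encode_r(...)) returns the original field values.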


  • In the following, the register letters d, t, and s are placeholders for (register) numbers or register names.
  • C denotes a constant (immediate).
  • All the following instructions are native instructions.
  • Opcodes and funct codes are in hexadecimal.
  • The word unsigned in the names of the Add and Subtract instructions is a misnomer. The difference between the signed and unsigned versions is not sign extension (or lack thereof) of the operands, but whether a trap is executed on overflow (e.g. Add) or the overflow is ignored (Add Unsigned). An immediate operand C to these instructions is always sign-extended.
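
The difference between Add and Add Unsigned described in the last note can be sketched in Python. This is a simplified model, not an emulator; OverflowTrap and the helper names are hypothetical.

```python
class OverflowTrap(Exception):
    """Stands in for the overflow exception raised by add (hypothetical name)."""


def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def addu(s, t):
    # Add Unsigned: same sum, overflow is silently ignored.
    return (s + t) & 0xFFFFFFFF


def add(s, t):
    # Add: traps if the two's complement sum does not fit in 32 bits.
    result = to_signed32(s) + to_signed32(t)
    if not -0x80000000 <= result <= 0x7FFFFFFF:
        raise OverflowTrap("signed overflow")
    return result & 0xFFFFFFFF


print(hex(addu(0x7FFFFFFF, 1)))  # wraps around without trapping
print(hex(add(0xFFFFFFFF, 1)))   # -1 + 1 does not overflow
```

Both functions produce the same bit pattern whenever no overflow occurs, which is why compilers for languages without overflow checking typically use the "unsigned" forms.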

CPU loads and stores

The encoding column shows which bits correspond to which parts of the instruction. A hyphen (-) is used to indicate don't cares.

Instruction name | Syntax | Operation | Format | Opcode | Funct | Notes/Encoding
Load Word | lw $t,C($s) | $t = Memory[$s + C] | I | 0x23 | - | Loads the word stored at MEM[$s+C] and the following 3 bytes.
Load Halfword | lh $t,C($s) | $t = Memory[$s + C] (signed) | I | 0x21 | - | Loads the halfword stored at MEM[$s+C] and the following byte. The sign is extended to the width of the register.
Load Halfword Unsigned | lhu $t,C($s) | $t = Memory[$s + C] (unsigned) | I | 0x25 | - | As above, without sign extension.
Load Byte | lb $t,C($s) | $t = Memory[$s + C] (signed) | I | 0x20 | - | Loads the byte stored at MEM[$s+C]. The sign is extended to the width of the register.
Load Byte Unsigned | lbu $t,C($s) | $t = Memory[$s + C] (unsigned) | I | 0x24 | - | As above, without sign extension.
Store Word | sw $t,C($s) | Memory[$s + C] = $t | I | 0x2B | - | Stores a word into MEM[$s+C] and the following 3 bytes. The order of the operands is a common source of confusion.
Store Halfword | sh $t,C($s) | Memory[$s + C] = $t | I | 0x29 | - | Stores the least-significant 16 bits of a register (a halfword) into MEM[$s+C].
Store Byte | sb $t,C($s) | Memory[$s + C] = $t | I | 0x28 | - | Stores the least-significant 8 bits of a register (a byte) into MEM[$s+C].
Load Word Left | lwl
Load Word Right | lwr
Store Word Left | swl
Store Word Right | swr
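
The effect of sign extension on the byte and halfword loads can be sketched as follows. The sketch assumes big-endian byte order (MIPS supports both) and a hypothetical helper named load; it is not taken from any MIPS toolchain.

```python
def load(mem, addr, size, signed):
    # Reads `size` bytes at `addr` (big-endian assumed here) and widens the
    # value to a 32-bit register, sign-extending when `signed` is true.
    value = int.from_bytes(mem[addr:addr + size], "big", signed=signed)
    return value & 0xFFFFFFFF


mem = bytes([0x80, 0x01, 0xFF, 0x7F])
print(hex(load(mem, 0, 1, True)))   # lb
print(hex(load(mem, 0, 1, False)))  # lbu
print(hex(load(mem, 0, 2, True)))   # lh
print(hex(load(mem, 0, 2, False)))  # lhu
print(hex(load(mem, 0, 4, True)))   # lw
```

For a byte or halfword whose top bit is clear, the signed and unsigned loads give identical results.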


Instruction name | Syntax | Operation | Format | Opcode | Funct | Notes | Encoding
Add | add $d,$s,$t | $d = $s + $t | R | 0 | 0x20 | Adds two registers; executes a trap on overflow. | 000000ss sssttttt ddddd--- --100000
Add Unsigned | addu $d,$s,$t | $d = $s + $t | R | 0 | 0x21 | As above, but ignores an overflow. | 000000ss sssttttt ddddd--- --100001
Subtract | sub $d,$s,$t | $d = $s - $t | R | 0 | 0x22 | Subtracts two registers; executes a trap on overflow. | 000000ss sssttttt ddddd--- --100010
Subtract Unsigned | subu $d,$s,$t | $d = $s - $t | R | 0 | 0x23 | As above, but ignores an overflow. | 000000ss sssttttt ddddd--- --100011
Add Immediate | addi $t,$s,C | $t = $s + C (signed) | I | 0x8 | - | Used to add sign-extended constants (and also to copy one register to another: addi $1, $2, 0); executes a trap on overflow. | 001000ss sssttttt CCCCCCCC CCCCCCCC
Add Immediate Unsigned | addiu $t,$s,C | $t = $s + C (signed) | I | 0x9 | - | As above, but ignores an overflow. | 001001ss sssttttt CCCCCCCC CCCCCCCC
And | and $d,$s,$t | $d = $s & $t | R | 0 | 0x24 | Bitwise and. | 000000ss sssttttt ddddd--- --100100
And Immediate | andi $t,$s,C | $t = $s & C | I | 0xC | - | The leftmost 16 bits are padded with 0s. | 001100ss sssttttt CCCCCCCC CCCCCCCC
Or | or $d,$s,$t | $d = $s | $t | R | 0 | 0x25 | Bitwise or. |
Or Immediate | ori $t,$s,C | $t = $s | C | I | 0xD | - | The leftmost 16 bits are padded with 0s. |
Exclusive Or | xor $d,$s,$t | $d = $s ^ $t | R | 0 | 0x26 | Bitwise exclusive or. |
Exclusive Or Immediate | xori $t,$s,C | $t = $s ^ C | I | 0xE | - | The leftmost 16 bits are padded with 0s. |
Nor | nor $d,$s,$t | $d = ~ ($s | $t) | R | 0 | 0x27 | Bitwise nor. |
Set on Less Than | slt $d,$s,$t | $d = ($s < $t) | R | 0 | 0x2A | Tests if one register is less than another. |
Set on Less Than Unsigned | sltu $d,$s,$t | $d = ($s < $t) | R | 0 | 0x2B | Tests if the unsigned integer in one register is less than another. |
Set on Less Than Immediate | slti $t,$s,C | $t = ($s < C) | I | 0xA | - | Tests if one register is less than a constant. |
Set on Less Than Immediate Unsigned | sltiu
Load Upper Immediate | lui
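
The same bit patterns can compare differently under slt and sltu. The following Python sketch models the two comparisons on 32-bit register values; the function names mirror the mnemonics but are otherwise hypothetical helpers.

```python
def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def slt(s, t):
    # Signed comparison: operands are interpreted as two's complement.
    return int(to_signed32(s) < to_signed32(t))


def sltu(s, t):
    # Unsigned comparison on the raw 32-bit patterns.
    return int((s & 0xFFFFFFFF) < (t & 0xFFFFFFFF))


# 0xFFFFFFFF is -1 when signed, but the largest value when unsigned.
print(slt(0xFFFFFFFF, 1), sltu(0xFFFFFFFF, 1))
```

The result is written as 1 or 0, matching the "set on" behaviour described in the table.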


Instruction name | Syntax | Operation | Format | Opcode | Funct | Notes/Encoding
Shift Left Logical Immediate | sll $d,$t,shamt | $d = $t << shamt | R | 0 | 0x0 | Shifts shamt number of bits to the left (multiplies by 2^shamt).
Shift Right Logical Immediate | srl $d,$t,shamt | $d = $t >> shamt | R | 0 | 0x2 | Shifts shamt number of bits to the right; zeros are shifted in (divides by 2^shamt). Note that this instruction only works as division of a two's complement number if the value is positive.
Shift Right Arithmetic Immediate | sra $d,$t,shamt | $d = $t >> shamt (sign bit shifted in) | R | 0 | 0x3 | Shifts shamt number of bits to the right; the sign bit is shifted in (divides a two's complement number by 2^shamt, rounding toward negative infinity).
Shift Left Logical | sllv $d,$t,$s | $d = $t << $s | R | 0 | 0x4 | Shifts $s number of bits to the left (multiplies by 2^$s).
Shift Right Logical | srlv $d,$t,$s | $d = $t >> $s | R | 0 | 0x6 | Shifts $s number of bits to the right; zeros are shifted in (divides by 2^$s). Note that this instruction only works as division of a two's complement number if the value is positive.
Shift Right Arithmetic | srav $d,$t,$s | $d = $t >> $s (sign bit shifted in) | R | 0 | 0x7 | Shifts $s number of bits to the right; the sign bit is shifted in (divides a two's complement number by 2^$s, rounding toward negative infinity).
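
The difference between the logical and arithmetic right shifts shows up only for values with the sign bit set. The following Python sketch models the immediate forms on 32-bit values; the helper names are hypothetical.

```python
def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def sll(t, shamt):
    return (t << shamt) & 0xFFFFFFFF


def srl(t, shamt):
    # Logical: zeros are shifted in from the left.
    return (t & 0xFFFFFFFF) >> shamt


def sra(t, shamt):
    # Arithmetic: Python's >> on a negative int already replicates the sign.
    return (to_signed32(t) >> shamt) & 0xFFFFFFFF


minus_eight = 0xFFFFFFF8
print(hex(srl(minus_eight, 1)))  # large positive number, not -4
print(hex(sra(minus_eight, 1)))  # bit pattern of -4
print(sll(3, 4))                 # 3 * 2**4
```

For an odd negative value such as -7, sra by one bit yields -4 rather than -3, because the shift rounds toward negative infinity.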

Multiply and divide

Instruction name | Syntax | Operation | Format | Opcode | Funct | Notes/Encoding
Multiply | mult $s,$t | LO = (($s * $t) << 32) >> 32; HI = ($s * $t) >> 32 | R | 0 | 0x18 | Multiplies two registers and puts the 64-bit result in the two special registers LO and HI. Equivalently, (HI, LO) = (64-bit) $s * $t. See mfhi and mflo for accessing the HI and LO registers.
Multiply Unsigned | multu $s,$t | LO = (($s * $t) << 32) >> 32; HI = ($s * $t) >> 32 | R | 0 | 0x19 | Multiplies two registers and puts the 64-bit result in the two special registers LO and HI. Equivalently, (HI, LO) = (64-bit) $s * $t. See mfhi and mflo for accessing the HI and LO registers.
Divide | div $s, $t | LO = $s / $t; HI = $s % $t | R | 0 | 0x1A | Divides two registers and puts the 32-bit integer result in LO and the remainder in HI.
Divide Unsigned | divu $s, $t | LO = $s / $t; HI = $s % $t | R | 0 | 0x1B | Divides two registers and puts the 32-bit integer result in LO and the remainder in HI.
Move from HI | mfhi $d | $d = HI | R | 0 | 0x10 | Moves a value from HI to a register. Do not use a multiply or a divide instruction within two instructions of mfhi (that action is undefined because of the MIPS pipeline).
Move to HI | mthi $s | HI = $s
Move from LO | mflo $d | $d = LO | R | 0 | 0x12 | Moves a value from LO to a register. Do not use a multiply or a divide instruction within two instructions of mflo (that action is undefined because of the MIPS pipeline).
Move to LO | mtlo $s | LO = $s
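
How the results are split across HI and LO can be sketched in Python. The sketch assumes that signed division truncates toward zero with the remainder taking the sign of the dividend; division by zero is left to Python's ZeroDivisionError here and says nothing about hardware behaviour. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    x &= MASK32
    return x - 0x100000000 if x & 0x80000000 else x


def mult(s, t):
    # Signed 32 x 32 -> 64-bit product, returned as (HI, LO).
    product = (to_signed32(s) * to_signed32(t)) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & MASK32


def multu(s, t):
    # Unsigned 32 x 32 -> 64-bit product, returned as (HI, LO).
    product = (s & MASK32) * (t & MASK32)
    return product >> 32, product & MASK32


def div(s, t):
    # Assumption: quotient truncates toward zero; returns (HI=remainder, LO=quotient).
    a, b = to_signed32(s), to_signed32(t)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b
    return r & MASK32, q & MASK32


print([hex(v) for v in mult(0xFFFFFFFF, 2)])   # -1 * 2
print([hex(v) for v in multu(0xFFFFFFFF, 2)])  # 4294967295 * 2
print([hex(v) for v in div(0xFFFFFFF9, 2)])    # -7 / 2
```

The signed and unsigned products share the same LO word and differ only in HI, which is why separate mult and multu instructions are needed.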

Jump and branch

Instruction name | Syntax | Operation | Format | Opcode | Funct | Notes | Encoding
Branch on Equal | beq $s,$t,C | if ($s == $t) go to PC+4+4*C | I | 0x4 | - | Goes to the instruction at the specified address if two registers are equal. | 000100ss sssttttt CCCCCCCC CCCCCCCC
Branch on Not Equal | bne $s,$t,C | if ($s != $t) go to PC+4+4*C | I | 0x5 | - | Goes to the instruction at the specified address if two registers are not equal. |
Branch on Less Than or Equal to Zero | blez
Branch on Greater Than Zero | bgtz
Branch on Less Than Zero | bltz
Branch on Greater Than or Equal to Zero | bgez
Branch on Less Than Zero and Link | bltzal
Branch on Greater Than or Equal to Zero and Link | bgezal
Jump | j C | PC = (PC+4)[31:28] . C*4 | J | 0x2 | - | Unconditionally jumps to the instruction at the specified address. The "." denotes bit concatenation. |
Jump Register | jr $s | goto address $s | R | 0 | 0x8 | Jumps to the address contained in the specified register. |
Jump and Link | jal C | $31 = PC + 8; PC = (PC+4)[31:28] . C*4 | J | 0x3 | - | Used to call a subroutine; $31 holds the return address, and returning from a subroutine is done by jr $31. The return address is PC + 8, not PC + 4, because the branch delay slot forces the instruction after the jump to be executed. |
Jump and Link Register | jalr $d,$s
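
The target calculations in the Operation column can be worked through numerically. The following Python sketch follows those formulas; the function names are hypothetical and the addresses are arbitrary example values.

```python
def sign_extend16(imm):
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, c):
    # beq/bne: PC + 4 + 4*C, with C a sign-extended 16-bit word offset.
    return (pc + 4 + 4 * sign_extend16(c)) & 0xFFFFFFFF


def jump_target(pc, c):
    # j/jal: upper 4 bits of PC+4 concatenated with the 26-bit field times 4.
    return ((pc + 4) & 0xF0000000) | ((c & 0x03FFFFFF) << 2)


def jal_return_address(pc):
    # Skips the jal itself and the instruction in its delay slot.
    return (pc + 8) & 0xFFFFFFFF


print(hex(branch_target(0x00400000, 3)))
print(hex(branch_target(0x00400010, 0xFFFF)))  # offset -1: branch to itself
print(hex(jump_target(0x00400000, 0x00100008)))
print(hex(jal_return_address(0x00400000)))
```

Because a jump keeps the upper four bits of PC+4, a J-type instruction can only reach targets within the same 256 MB region as the instruction in its delay slot.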


Instruction name | Syntax | Operation | Format | Opcode | Notes/Encoding
Move from Control Register | mfcZ $t, $d | $t = Coprocessor[Z].ControlRegister[$d] | R | 0 | Moves a 4-byte value from a Coprocessor Z control register to a general-purpose register. Sign extension.
Move to Control Register | mtcZ $t, $d | Coprocessor[Z].ControlRegister[$d] = $t | R | 0 | Moves a 4-byte value from a general-purpose register to a Coprocessor Z control register. Sign extension.

Floating point

MIPS has 32 floating-point registers. Two registers are paired for double precision numbers. Odd numbered registers cannot be used for arithmetic or branching, just as part of a double precision register pair.

Name | Instruction syntax | Meaning | Format | Notes/Encoding
Floating-Point Add | add.s $x,$y,$z | $x = $y + $z | | Floating-point add (single precision)
Floating-Point Subtract | sub.s $x,$y,$z | $x = $y - $z | | Floating-point subtract (single precision)
Floating-Point Multiply | mul.s $x,$y,$z | $x = $y * $z | | Floating-point multiply (single precision)
Floating-Point Divide | div.s $x,$y,$z | $x = $y / $z | | Floating-point divide (single precision)
Floating-Point Add | add.d $x,$y,$z | $x = $y + $z | | Floating-point add (double precision)
Floating-Point Subtract | sub.d $x,$y,$z | $x = $y - $z | | Floating-point subtract (double precision)
Floating-Point Multiply | mul.d $x,$y,$z | $x = $y * $z | | Floating-point multiply (double precision)
Floating-Point Divide | div.d $x,$y,$z | $x = $y / $z | | Floating-point divide (double precision)

Data Transfer

Name | Instruction syntax | Meaning | Format | Notes/Encoding
Load word coprocessor | lwcZ $x,CONST ($y) | Coprocessor[Z].DataRegister[$x] = Memory[$y + CONST] | I | Loads the 4-byte word stored at MEM[$y+CONST] into a coprocessor data register. Sign extension.
Store word coprocessor | swcZ $x,CONST ($y) | Memory[$y + CONST] = Coprocessor[Z].DataRegister[$x] | I | Stores the 4-byte word held by a coprocessor data register into MEM[$y+CONST]. Sign extension.
Floating-Point Compare (eq,ne,lt,le,gt,ge) | c.lt.s $f2,$f4 | cond = ($f2 < $f4) | | Floating-point compare less than (single precision)
Floating-Point Compare (eq,ne,lt,le,gt,ge) | c.lt.d $f2,$f4 | cond = ($f2 < $f4) | | Floating-point compare less than (double precision)
Branch on FP True | bc1t 100 | if (cond) goto PC+4+100; | | PC-relative branch if the FP condition is true
Branch on FP False | bc1f 100 | if (!cond) goto PC+4+100; | | PC-relative branch if the FP condition is false

MIPS II

MIPS II removed the load delay slot[2]:41 and added several sets of instructions. For shared-memory multiprocessing, the Synchronize Shared Memory, Load Linked Word, and Store Conditional Word instructions were added. A set of Trap-on-Condition instructions was added. These instructions cause an exception if the evaluated condition is true. All existing branch instructions were given branch-likely versions that execute the instruction in the branch delay slot only if the branch is taken.[2]:40 These instructions improve performance in certain cases by allowing useful instructions to fill the branch delay slot.[2]:212 Doubleword load and store instructions for COP1–3 were added. Consistent with other memory access instructions, these loads and stores require the doubleword to be naturally aligned.
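
The way Load Linked and Store Conditional combine into an atomic read-modify-write can be sketched with a toy model. The Python below is a heavily simplified, single-link illustration and not a description of any real implementation; the class ToyMemory and all method names are hypothetical.

```python
class ToyMemory:
    """Simplified model: one link address, broken by any store to it."""

    def __init__(self):
        self.words = {}
        self.link = None  # address watched since the last LL, or None

    def ll(self, addr):
        # Load Linked: read the word and remember the address.
        self.link = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):
        # An ordinary store (e.g. by another processor) breaks a matching link.
        if self.link == addr:
            self.link = None
        self.words[addr] = value & 0xFFFFFFFF

    def sc(self, addr, value):
        # Store Conditional: succeeds (returns 1) only if the link is intact.
        if self.link != addr:
            return 0
        self.words[addr] = value & 0xFFFFFFFF
        self.link = None
        return 1


def atomic_increment(mem, addr):
    # Retry loop: repeat LL/SC until the conditional store succeeds.
    while True:
        old = mem.ll(addr)
        if mem.sc(addr, old + 1):
            return old


mem = ToyMemory()
mem.store(0x100, 41)
print(atomic_increment(mem, 0x100), mem.words[0x100])
```

If another store hits the linked address between the LL and the SC, the SC fails and the loop simply retries with the fresh value.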

The instruction set for the floating point coprocessor also had several instructions added to it. An IEEE 754-compliant floating-point square root instruction was added. It supported both single- and double-precision operands. A set of instructions that converted single- and double-precision floating-point numbers to 32-bit words was added. These complemented the existing conversion instructions by allowing the IEEE rounding mode to be specified by the instruction instead of the Floating Point Control and Status Register.

MIPS Computer Systems' R6000 microprocessor (1989) was the first MIPS II implementation.[2]:8 Designed for servers, the R6000 was fabricated and sold by Bipolar Integrated Technology, but was a commercial failure. During the mid-1990s, many new 32-bit MIPS processors for embedded systems were MIPS II implementations because the introduction of the 64-bit MIPS III architecture in 1991 left MIPS II as the newest 32-bit MIPS architecture until MIPS32 was introduced in 1999.[2]:19

Instructions added to MIPS II[1]
Name Mnemonic
Synchronize Shared Memory SYNC
Trap if Greater Than or Equal TGE
Trap if Greater Than or Equal Unsigned TGEU
Trap if Less Than TLT
Trap if Less Than Unsigned TLTU
Trap if Equal TEQ
Trap if Not Equal TNE
Branch on Less Than or Equal to Zero Likely BLEZL
Branch on Greater Than or Equal to Zero Likely BGEZL
Trap if Greater Than or Equal Immediate TGEI
Trap if Greater Than or Equal Unsigned Immediate TGEIU
Trap if Less Than Immediate TLTI
Trap if Less Than Unsigned Immediate TLTIU
Trap if Equal Immediate TEQI
Trap if Not Equal Immediate TNEI
Branch on Less Than Zero and Link Likely BLTZALL
Branch on Greater Than or Equal to Zero and Link Likely BGEZALL
Floating-Point Square Root SQRT.S
Floating-Point Square Root SQRT.D
Floating-Point Round to Word Fixed-Point ROUND.W.S
Floating-Point Round to Word Fixed-Point ROUND.W.D
Floating-Point Truncate to Word Fixed-Point TRUNC.W.S
Floating-Point Truncate to Word Fixed-Point TRUNC.W.D
Floating-Point Ceiling to Word Fixed-Point CEIL.W.S
Floating-Point Ceiling to Word Fixed-Point CEIL.W.D
Floating-Point Floor to Word Fixed-Point FLOOR.W.S
Floating-Point Floor to Word Fixed-Point FLOOR.W.D
Branch on FP False Likely BC1FL
Branch on FP True Likely BC1TL
Branch on Equal Likely BEQL
Branch on Not Equal Likely BNEL
Branch on Less Than Zero Likely BLTZL
Branch on Greater Than Zero Likely BGTZL
Load Linked LL
Load Doubleword to Coprocessor 1 LDC1
Load Doubleword to Coprocessor 2 LDC2
Load Doubleword to Coprocessor 3 LDC3
Store Conditional SC
Store Doubleword to Coprocessor 1 SDC1
Store Doubleword to Coprocessor 2 SDC2
Store Doubleword to Coprocessor 3 SDC3

MIPS III

MIPS III is a backwards compatible extension of MIPS II that added support for 64-bit memory addressing and integer operations. The 64-bit data type is called a doubleword, and MIPS III extended the general-purpose registers, HI/LO registers, and program counter to 64 bits to support it. New instructions were added to load and store doublewords, to perform integer addition, subtraction, multiplication, division, and shift operations on them, and to move doublewords between the GPRs and HI/LO registers. Existing instructions originally defined to operate on 32-bit words were redefined, where necessary, to sign-extend the 32-bit results to permit words and doublewords to be treated identically by most instructions. Among those instructions redefined was Load Word. In MIPS III it sign-extends words to 64 bits. To complement Load Word, a version that zero-extends was added.
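
The sign-extension convention for 32-bit results in 64-bit registers can be illustrated in Python. The helper names below are hypothetical; the sketch models only the widening step, not complete instructions.

```python
def sign_extend_32_to_64(word):
    # Copies bit 31 into bits 63..32, as the redefined word instructions do.
    word &= 0xFFFFFFFF
    if word & 0x80000000:
        return word | 0xFFFFFFFF00000000
    return word


def zero_extend_32_to_64(word):
    # Behaviour of the zero-extending variant of Load Word.
    return word & 0xFFFFFFFF


def addu_64bit_regs(s, t):
    # 32-bit add on 64-bit registers: the 32-bit result is sign-extended.
    return sign_extend_32_to_64(s + t)


print(hex(sign_extend_32_to_64(0x80000000)))
print(hex(zero_extend_32_to_64(0x80000000)))
print(hex(addu_64bit_regs(0x7FFFFFFF, 1)))
```

Keeping every 32-bit value in sign-extended form lets comparisons and branches work on the full 64-bit register without separate word-sized variants.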

The R instruction format's inability to specify the full shift distance for 64-bit shifts (its 5-bit shift amount field is too narrow to specify the shift distance for doublewords) required MIPS III to provide three 64-bit versions of each MIPS I shift instruction. The first version is a 64-bit version of the original shift instructions, used to specify constant shift distances of 0–31 bits. The second version is similar to the first, but adds 32 (decimal) to the shift amount field's value so that constant shift distances of 32–63 bits can be specified. The third version obtains the shift distance from the six low-order bits of a GPR.
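
The three variants can be sketched for the left shift. The Python function names below follow the mnemonics DSLL, DSLL32 and DSLLV, which are not named in the text above; the code is an illustrative model only.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def check_shamt(shamt):
    # The shift amount field is 5 bits wide.
    if not 0 <= shamt <= 31:
        raise ValueError("shamt must fit in 5 bits")


def dsll(t, shamt):
    # First version: constant shift distance of 0..31.
    check_shamt(shamt)
    return (t << shamt) & MASK64


def dsll32(t, shamt):
    # Second version: 32 is added to the field, giving distances of 32..63.
    check_shamt(shamt)
    return (t << (shamt + 32)) & MASK64


def dsllv(t, s):
    # Third version: distance taken from the six low-order bits of a register.
    return (t << (s & 0x3F)) & MASK64


print(hex(dsll(1, 31)))
print(hex(dsll32(1, 31)))  # shift by 63
print(hex(dsllv(1, 64 + 5)))  # only the low six bits (5) are used
```

Together, the first two versions cover every constant distance from 0 to 63 while leaving the 5-bit field unchanged.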

MIPS III removed the Coprocessor 3 (CP3) support instructions, and reused its opcodes for the new doubleword instructions. The remaining coprocessors gained instructions to move doublewords between coprocessor registers and the GPRs. The floating general registers (FGRs) were extended to 64 bits and the requirement for instructions to use even-numbered registers only was removed. This is incompatible with earlier versions of the architecture; a bit is used to operate the MIPS III floating-point unit in a MIPS I- and II-compatible mode. The floating-point control registers were not extended for compatibility. The only new floating-point instructions added were those to convert single- and double-precision floating-point numbers into doubleword integers and vice versa. MIPS III added a supervisor privilege level in between the existing kernel and user privilege levels.

MIPS Computer Systems' R4000 microprocessor (1991) was the first MIPS III implementation. It was designed for use in personal, workstation, and server computers. MIPS Computer Systems aggressively promoted the MIPS architecture and R4000, establishing the Advanced Computing Environment (ACE) consortium to advance its Advanced RISC Computing (ARC) standard, which aimed to establish MIPS as the dominant personal computing platform. ARC found little success in personal computers, but the R4000 (and the R4400 derivative) were widely used in workstation and server computers, especially by its largest user, Silicon Graphics. Other uses of the R4000 included high-end embedded systems and supercomputers.

MIPS III was eventually implemented by a number of embedded microprocessors. Quantum Effect Design's R4600 (1993) and its derivatives were widely used in high-end embedded systems and low-end workstations and servers. MIPS Technologies' R4200 (1994) was designed for embedded systems, laptops, and personal computers. A derivative, the R4300i, fabricated by NEC Electronics, was used in the Nintendo 64 game console. The Nintendo 64, along with the PlayStation, was among the highest-volume users of MIPS architecture processors in the mid-1990s.

MIPS IV

MIPS IV is the fourth version of the architecture. It is a superset of MIPS III and is compatible with all existing versions of MIPS. The first implementation of MIPS IV was the R8000, which was introduced in 1994.[citation needed] MIPS IV added:

  • Register + register addressing for floating point loads and stores
  • Single- and double-precision floating point fused-multiply add and subtract instructions
  • Conditional move instructions for both general-purpose and floating point registers
  • Seven extra condition bits in the floating point control and status register, bringing the total to eight
  • Single- and double-precision floating point reciprocal instructions
  • Single- and double-precision floating point reciprocal square-root instructions
  • Optional imprecise exceptions for IEEE 754 traps
  • New floating point branch instructions that can access the eight floating point condition bits
  • Prefetch instruction for performing memory prefetching and specifying cache hints

MIPS V

Announced on October 21, 1996 at the Microprocessor Forum 1996 alongside the MIPS Digital Media Extensions (MDMX) extension, MIPS V was designed to improve the performance of 3D graphics transformations.[10] In the mid-1990s, a major use of non-embedded MIPS microprocessors was graphics workstations from SGI. MIPS V was complemented by the integer-only MDMX extension to provide a complete system for improving the performance of 3D graphics applications.[11]

MIPS V implementations were never introduced. In 1997, SGI announced the "H1" or "Beast" and the "H2" or "Capitan" microprocessors. The former was to have been the first MIPS V implementation, and was due to be introduced in 1999. The "H1" and "H2" projects were later combined and were eventually canceled in 1998.

MIPS V added a new data type, the pair-single (PS), which consisted of two single-precision (32-bit) floating-point numbers stored in the existing 64-bit floating-point registers. Variants of existing floating-point instructions for arithmetic, compare and conditional move were added to operate on this data type in a SIMD fashion. New instructions were added for loading, rearranging and converting PS data. It was the first instruction set to exploit floating-point SIMD with existing resources.[11]
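
The pair-single idea of two 32-bit floats sharing one 64-bit register can be modelled in Python with the struct module. Which half holds which element is an assumption made for this sketch, and the function names are hypothetical.

```python
import struct


def pack_ps(upper, lower):
    # Two single-precision values packed into one 64-bit register image
    # (the upper/lower placement is an assumption of this sketch).
    return struct.unpack(">Q", struct.pack(">ff", upper, lower))[0]


def unpack_ps(reg):
    return struct.unpack(">ff", struct.pack(">Q", reg))


def add_ps(a, b):
    # SIMD-style add: both halves are added independently; packing the
    # sums back rounds each one to single precision.
    au, al = unpack_ps(a)
    bu, bl = unpack_ps(b)
    return pack_ps(au + bu, al + bl)


x = pack_ps(1.5, 2.0)
y = pack_ps(0.25, 4.0)
print(hex(x), unpack_ps(add_ps(x, y)))
```

One add_ps call performs two single-precision additions, which is the sense in which the existing 64-bit registers are reused for SIMD.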


MIPS32/MIPS64

When MIPS Technologies was spun out of Silicon Graphics in 1998, it refocused on the embedded market. Up to MIPS V, each successive version was a strict superset of the previous version, but this property was found to be a problem,[citation needed] and the architecture definition was changed to define a 32-bit and a 64-bit architecture: MIPS32 and MIPS64. Both were introduced in 1999.[12] MIPS32 is based on MIPS II with some additional features from MIPS III, MIPS IV, and MIPS V; MIPS64 is based on MIPS V.[12] NEC, Toshiba and SiByte (later acquired by Broadcom) each obtained licenses for MIPS64 as soon as it was announced. Philips, LSI Logic, IDT, Raza Microelectronics, Inc., Cavium, Loongson Technology and Ingenic Semiconductor have since joined them.

MIPS32/MIPS64 Release 1

The first release of MIPS32, based on MIPS II, added conditional moves, prefetch instructions, and other features from the R4000 and R5000 families of 64-bit processors.[12] The first release of MIPS64 added a MIPS32 mode to run 32-bit code.[12] The MUL and MADD (multiply-add) instructions, previously available in some implementations, were added to the MIPS32 and MIPS64 specifications, as were cache control instructions.[12]

MIPS32/MIPS64 Release 2

MIPS32/MIPS64 Release 3

MIPS32/MIPS64 Release 5

Release 5 was announced on December 6, 2012.[13] Release 4 was skipped because the number four is perceived as unlucky in many Asian cultures.[14]

MIPS32/MIPS64 Release 6

MIPS32/MIPS64 Release 6, introduced in 2014, added the following:[15]

  • a new family of branches with no delay slot:
    • unconditional branches (BC) & branch-and-link (BALC) with a 26-bit offset,
    • conditional branch on zero/non-zero with a 21-bit offset,
    • full set of signed & unsigned conditional branches that compare two registers (e.g. BGTUC) or a register against zero (e.g. BGTZC),
    • full set of branch-and-link which compare a register against zero (e.g. BGTZALC).
  • index jump instructions with no delay slot designed to support large absolute addresses.
  • instructions to load 16-bit immediates at bit position 16, 32 or 48, making it easy to generate large constants.
  • PC-relative load instructions, as well as address generation with large (PC-relative) offsets.
  • bit-reversal & byte-alignment instructions (previously only available with the DSP extension).
  • multiply & divide instructions redefined so that they use a single register for their result.
  • instructions generating truth values now generate all zeroes or all ones instead of just clearing/setting the 0-bit.
  • instructions using a truth value now only interpret all-zeroes as false instead of just looking at the 0-bit.
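
The reach of the new branches follows from their offset widths. The Python sketch below assumes that each offset is a signed count of 4-byte instructions, which the list above does not state explicitly; the function name is hypothetical.

```python
def branch_reach_bytes(offset_bits, scale=4):
    # Assumption: the offset is a signed field counted in units of `scale` bytes.
    lowest = -(1 << (offset_bits - 1)) * scale
    highest = ((1 << (offset_bits - 1)) - 1) * scale
    return lowest, highest


for bits in (16, 21, 26):
    lo, hi = branch_reach_bytes(bits)
    print(bits, lo, hi)
```

Under that assumption, a 26-bit offset spans roughly ±128 MiB and a 21-bit offset roughly ±4 MiB, compared with about ±128 KiB for a 16-bit offset.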

Removed infrequently used instructions:

  • some conditional moves
  • branch likely instructions (deprecated in previous releases).
  • integer overflow trapping instructions with 16-bit immediate
  • integer accumulator instructions (together with the HI/LO registers; moved to the DSP Application-Specific Extension)
  • unaligned load instructions (LWL & LWR); this requires that most ordinary loads & stores support misaligned access, possibly via trapping and with the addition of a new instruction (BALIGN)

Reorganized the instruction encoding, freeing space for future expansions.

Application-Specific Extensions

The base MIPS32 and MIPS64 architectures can be supplemented with a number of optional architectural extensions, which are collectively referred to as Application-Specific Extensions (ASEs). These ASEs provide features that improve the efficiency and performance of certain workloads, such as digital signal processing.


MIPS MCU

Enhancements for microcontroller applications. The MCU ASE (Application Specific Extension) has been developed to extend the interrupt controller support, reduce the interrupt latency and enhance the I/O peripheral control function typically required in microcontroller system designs.

  • Separate priority and vector generation
  • Supports up to 256 interrupts in EIC (External Interrupt Controller) mode and eight hardware interrupt pins
  • Provides 16-bit vector offset address
  • Pre-fetching of the interrupt exception vector
  • Automated Interrupt Prologue – adds hardware to save and update system status before the interrupt handling routine
  • Automated Interrupt Epilogue – restores the system state previously stored in the stack for returning from the interrupt.
  • Interrupt Chaining – supports the service of pending interrupts without the need to exit the initial interrupt routine, saving the cycles required to store and restore multiple active interrupts
  • Supports speculative pre-fetching of the interrupt vector address. Reduces the number of interrupt service cycles by overlapping memory accesses with pipeline flushes and exception prioritization
  • Includes atomic bit set/clear instructions, which enable bits within an I/O register that are normally used to monitor or control external peripheral functions to be modified without interruption, ensuring the action is performed securely.

MIPS16 and MIPS16e

MIPS16 is an optional extension designed by LSI Logic during the mid-1990s. MIPS16e is an improved version of MIPS16 introduced in the early 2000s, and is supported as an optional extension by MIPS32 and MIPS64 (up to Release 5). Release 6 replaced it with microMIPS. Both MIPS16 and MIPS16e decrease the size of applications by up to 40%[citation needed] by using 16-bit instructions instead of 32-bit instructions. MIPS16e also improves power efficiency and the instruction cache hit rate, and is equivalent in performance to its base architecture.[citation needed] It is supported by hardware and software development tools from MIPS Technologies and other providers.

MIPS DSP


The DSP ASE is an optional extension to the MIPS32/MIPS64 release 2 and newer instruction sets which can be used to accelerate a large range of "media" computations - particularly audio, since TV-resolution video. The DSP module comprises a set of instructions and state in the integer pipeline and requires minimal additional logic to implement in MIPS processor cores. Revision 2 of the ASE was introduced in the second half of 2006. This revision adds extra instructions to the original ASE, but is otherwise backwards-compatible with it.[16]

Unlike the bulk of the MIPS architecture, the DSP ASE is a fairly irregular set of operations, many chosen for their particular relevance to some key algorithm.

Its main novel features (vs original MIPS32)[17]:

  • Saturating arithmetic (when a calculation overflows, deliver the representable number closest to the non-overflowed answer).
  • Fixed-point arithmetic on signed 32- and 16-bit fixed-point fractions with a range of -1 to +1 (these are widely called "Q31" and "Q15").
  • Additional accumulators. The existing integer multiplication and multiply-accumulate instructions deliver results into a double-size accumulator (called "hi/lo" and 64 bits wide on MIPS32 CPUs). The DSP ASE adds three more accumulators, and some different flavours of multiply-accumulate.
  • SIMD instructions operating on 4 x unsigned bytes or 2 x 16-bit values packed into a 32-bit register (the 64-bit variant of the DSP ASE supports larger vectors, too).
  • SIMD operations are basic arithmetic, shifts and some multiply-accumulate type operations.
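
Saturating arithmetic and the Q15 fraction format listed above can be illustrated in Python. This sketch models only the arithmetic, not any specific DSP ASE instruction; all function names are hypothetical.

```python
Q15_MIN, Q15_MAX = -0x8000, 0x7FFF


def saturate_q15(x):
    # Clamp to the nearest representable 16-bit value instead of wrapping.
    return max(Q15_MIN, min(Q15_MAX, x))


def float_to_q15(x):
    # Q15 stores a fraction in [-1, +1) as a 16-bit integer scaled by 2**15.
    return saturate_q15(round(x * 32768))


def q15_to_float(q):
    return q / 32768


def add_q15_sat(a, b):
    return saturate_q15(a + b)


def mul_q15_sat(a, b):
    # The 30-bit fractional product is shifted back to 15 fraction bits.
    return saturate_q15((a * b) >> 15)


half = float_to_q15(0.5)
three_quarters = float_to_q15(0.75)
print(q15_to_float(mul_q15_sat(half, half)))
print(add_q15_sat(half, three_quarters))  # 1.25 is not representable: clamps
```

Without saturation, the sum 0.5 + 0.75 would wrap around to a negative value, which is usually far more audible or visible in media data than a clamped result.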

To make use of MIPS DSP ASE, you may:

  • Hand-code in assembly language, which is the most time-consuming method of utilizing the MIPS DSP ASE, but can produce code with the highest performance.
  • Use asm macros supported by GCC that produce DSP instructions directly from C code.
  • Use intrinsics supported by GCC for the MIPS DSP ASE.
  • Use fixed-point data types and operators in C supported by GCC. The MIPS DSP ASE is the only processor architecture that supports fixed-point data types in a general-purpose processor.
  • Use auto-vectorization supported by GCC for loops via the optimization option -ftree-vectorize. The advantage of auto-vectorization is that the compiler can recognize scalar variables (which can be integer, fixed-point, or floating-point types) in order to utilize SIMD instructions automatically. In the ideal case, when auto-vectorization is used, there is no need to use SIMD variables explicitly.[18]
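
As an illustration of the last method, the loop below is written with scalar variables only and leaves any use of SIMD to the compiler. It is a sketch of the kind of loop an auto-vectorizer can handle; the function name is hypothetical, and whether a particular GCC version actually emits DSP ASE instructions for it is an assumption that should be checked by inspecting the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Element-wise byte addition written as a plain scalar loop.
   `restrict` promises the compiler that the arrays do not overlap,
   which is one of the things an auto-vectorizer needs to know. */
void add_bytes(uint8_t *restrict dst, const uint8_t *restrict a,
               const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (uint8_t)(a[i] + b[i]);  /* wraps modulo 256 */
    }
}
```

With a MIPS cross-compiler, such a file would typically be built with -O2 -ftree-vectorize together with GCC's -mdsp or -mdspr2 target option.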

Linux added support for the DSP ASE in version 2.6.12-rc5 (2005-05-31). Making use of the DSP ASE also requires a toolchain that supports it; GCC supports both DSP and DSPr2.


The MIPS SIMD Architecture (MSA) is a set of instruction set extensions designed to accelerate multimedia. Its features and intended uses include:

  • 32 vector registers of 16 x 8-bit, 8 x 16-bit, 4 x 32-bit, and 2 x 64-bit vector elements
  • Efficient vector parallel arithmetic operations on integer, fixed-point and floating-point data
  • Operations on absolute value operands
  • Rounding and saturation options available
  • Full precision multiply and multiply-add
  • Conversions between integer, floating-point, and fixed-point data
  • Complete set of vector-level compare and branch instructions with no condition flag
  • Vector (1D) and array (2D) shuffle operations
  • Typed load and store instructions for endian-independent operation
  • IEEE Standard for Floating-Point Arithmetic 754-2008 compliant
  • Element precise floating-point exception signaling
  • Pre-defined scalable extensions for chips with more gates/transistors
  • Accelerates compute-intensive applications in conjunction with leveraging generic compiler support
  • Software-programmable solution for consumer electronics applications or functions not covered by dedicated hardware
  • Emerging data mining, feature extraction, image and video processing, and human-computer interaction applications
  • High-performance scientific computing
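
The element layouts in the first item all describe the same 128 bits (16 × 8 = 8 × 16 = 4 × 32 = 2 × 64). The sketch below models such a register as a C union and shows one way a compare can work without a condition flag: the result is written into a vector as per-element masks, and a branch condition is derived from that vector. The type and function names are hypothetical and are not compiler intrinsics.

```c
#include <stdint.h>

/* One 128-bit vector register viewed with each element width. */
typedef union {
    int8_t  b[16];
    int16_t h[8];
    int32_t w[4];
    int64_t d[2];
} vec128;

/* Word-wise equality: each 32-bit element of the result becomes all
   ones (-1) where the inputs match and all zeros where they differ. */
vec128 vec_ceq_w(vec128 x, vec128 y)
{
    vec128 r;
    for (int i = 0; i < 4; i++) {
        r.w[i] = (x.w[i] == y.w[i]) ? -1 : 0;
    }
    return r;
}

/* Branch condition taken from the register contents themselves:
   true if any bit of the vector is set. */
int vec_any_nonzero(vec128 v)
{
    return (v.d[0] | v.d[1]) != 0;
}
```

A caller would branch on vec_any_nonzero(vec_ceq_w(x, y)) rather than on a separate flags register.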

MIPS Virtualization

An optional extension that provides hardware-supported virtualization.

MIPS Multi-Threading

Each multi-threaded MIPS core can support up to two VPEs (Virtual Processing Elements), which share a single pipeline as well as other hardware resources. However, since each VPE includes a complete copy of the processor state as seen by the software system, each VPE appears as a complete standalone processor to an SMP Linux operating system. For more fine-grained thread processing applications, a core can support up to nine thread contexts (TCs) allocated across the two VPEs. The TCs share a common execution unit, but each has its own program counter and core register files so that each can handle a thread from the software.

The MIPS MT architecture also allows the allocation of processor cycles to threads, and sets the relative thread priorities with an optional Quality of Service (QoS) manager block. This enables two prioritization mechanisms that determine the flow of information across the bus. The first mechanism allows the user to prioritize one thread over another. The second allocates a specified ratio of the cycles to specific threads over time. Used together, the two mechanisms allow effective allocation of bandwidth to the set of threads and better control of latencies. In real-time systems, system-level determinism is critical, and the QoS block helps make a system more predictable. Hardware designers of advanced systems may replace the standard QoS block provided by MIPS Technologies with one that is specifically tuned for their application.
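
The second mechanism, allocating a specified ratio of cycles to threads, can be illustrated with a small software model. The sketch below uses smooth weighted round-robin, a generic scheduling technique chosen here for illustration; it is not a description of the MIPS QoS block, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Choose the thread that issues in the next cycle. Every thread earns
   credit equal to its weight; the thread with the most credit issues
   and then pays back the sum of all weights. Ties go to the lowest
   index. */
size_t pick_thread(const unsigned weight[], long credit[], size_t n)
{
    long total = 0;
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        credit[i] += (long)weight[i];
        total += (long)weight[i];
        if (credit[i] > credit[best]) {
            best = i;
        }
    }
    credit[best] -= total;
    return best;
}

/* Simulate `cycles` issue slots and count how many each thread gets.
   `credit` is caller-provided scratch space with n entries. */
void allocate_cycles(const unsigned weight[], size_t n, unsigned cycles,
                     long credit[], unsigned issued[])
{
    for (size_t i = 0; i < n; i++) {
        credit[i] = 0;
        issued[i] = 0;
    }
    for (unsigned c = 0; c < cycles; c++) {
        issued[pick_thread(weight, credit, n)]++;
    }
}
```

With weights 3 and 1, eight simulated cycles are split 6 to 2, and the favoured thread's slots are interleaved with the other thread's rather than issued in one burst.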

Single-threaded microprocessors today waste many cycles while waiting to access memory, considerably limiting system performance.[dubious] Multi-threading masks the effect of memory latency by increasing processor utilization: when one thread stalls, other threads are immediately fed into the pipeline and executed, resulting in a significant gain in application throughput. Users can allocate dedicated processing bandwidth to real-time tasks, resulting in a guaranteed Quality of Service (QoS). MIPS' MT technology constantly monitors the progress of threads and dynamically takes corrective action to meet or exceed the real-time requirements. A processor pipeline can achieve 80-90% utilization by switching threads during data-dependent stalls or cache misses. This improves responsiveness, and with it the user experience on mobile devices.
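
The effect of thread switching on utilization can be estimated with an idealised model, given here as a rough sketch rather than a measurement. It assumes that every thread alternates between a fixed number of busy cycles and a fixed number of stall cycles and that switching threads costs nothing; the function name and parameters are hypothetical.

```c
#include <assert.h>

/* Idealised pipeline utilization in percent (rounded down).
   Each thread is assumed to run for `run` cycles and then stall for
   `stall` cycles, so N threads can fill at most N * run out of every
   run + stall cycles. */
unsigned utilization_percent(unsigned threads, unsigned run, unsigned stall)
{
    assert(run + stall > 0);  /* the model needs a non-empty period */
    unsigned busy = 100u * threads * run / (run + stall);
    return busy > 100u ? 100u : busy;
}
```

Under these assumptions a single thread that runs for 2 cycles and stalls for 8 keeps the pipeline 20% busy, while four such threads reach 80%.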


SmartMIPS is an Application-Specific Extension (ASE) designed to improve performance and reduce memory consumption for smart card software. It is supported by MIPS32 only, since smart cards do not require the capabilities of MIPS64 processors. Few smart cards used SmartMIPS, and MIPS32 Release 6 removed it from the architecture.


Main article: MDMX


Main article: MIPS-3D


microMIPS32 and microMIPS64 are high-performance code compression technologies that combine optimized 16- and 32-bit instructions in a single instruction set. As a complete ISA, microMIPS can operate standalone or in co-existence with the legacy-compatible MIPS32 instruction decoder, allowing programs to intermix 16- and 32-bit code without having to switch modes. microMIPS32 has 32 32-bit registers, 32-bit virtual addresses and up to 36-bit physical addresses (the same as MIPS32). microMIPS64 has 32 64-bit registers, 64-bit virtual addresses and up to 59-bit physical addresses, and adds 64-bit variables (the same as MIPS64).


Open Virtual Platforms (OVP)[19] includes OVPsim, a simulator that is freely available for non-commercial use, together with a library of models of processors, peripherals and platforms, and APIs which enable users to develop their own models. The models in the library are open source, written in C, and include the MIPS 4K, 24K, 34K, 74K, 1004K, 1074K, M14K, microAptiv, interAptiv and proAptiv 32-bit cores and the MIPS 5K range of 64-bit cores. These models are created and maintained by Imperas[20] and, in partnership with MIPS Technologies, have been tested and assigned the MIPS-Verified™ mark. Sample MIPS-based platforms include both bare-metal environments and platforms for booting unmodified Linux binary images. These platform emulators are available as source or binaries and are free for non-commercial usage. OVPsim, which is developed and maintained by Imperas, runs at hundreds of millions of instructions per second and is built to handle multicore homogeneous and heterogeneous architectures and systems.

SPIM is a freely available MIPS32 simulator for use in education (earlier versions simulated only the R2000/R3000). EduMIPS64[21] is a GPL graphical cross-platform MIPS64 CPU simulator, written in Java/Swing. It supports a wide subset of the MIPS64 ISA and allows the user to see graphically what happens in the pipeline when an assembly program is run by the CPU. It is intended for education and is used in some[who?] computer architecture courses in universities around the world.

MARS[22] is another GUI-based MIPS emulator designed for use in education, specifically for use with Patterson and Hennessy's Computer Organization and Design.

WebMIPS[23] is a browser-based MIPS simulator with a visual representation of a generic, pipelined processor. This simulator is quite useful for register tracking during step-by-step execution.

More advanced free emulators are available from the GXemul (formerly known as mips64emul) and QEMU projects. These emulate the various MIPS III and IV microprocessors in addition to entire computer systems which use them.

Commercial simulators are available especially for the embedded use of MIPS processors, for example Wind River Simics (MIPS 4Kc and 5Kc, PMC RM9000, QED RM7000, Broadcom/Netlogic ec4400, Cavium Octeon I), Imperas (all MIPS32 and MIPS64 cores), VaST Systems (R3000, R4000), and CoWare (the MIPS4KE, MIPS24K, MIPS25Kf and MIPS34K).

See also

  • Delay slot
  • DLX, a very similar architecture designed by John L. Hennessy (creator of MIPS) for teaching purposes
  • MIPS-X, a follow-on project to Stanford University's MIPS project


  1. Price, Charles (September 1995). MIPS IV Instruction Set (Revision 3.2). MIPS Technologies, Inc.
  2. Sweetman, Dominic (1999). See MIPS Run. Morgan Kaufmann Publishers, Inc. ISBN 1-55860-410-3.
  3. "MIPS32 Architecture". Imagination Technologies. Retrieved 4 January 2014.
  4. "MIPS64 Architecture". Imagination Technologies. Retrieved 4 January 2014.
  5. "MIPS-3D ASE". Imagination Technologies. Retrieved 4 January 2014.
  6. "MIPS16e". Imagination Technologies. Retrieved 4 January 2014.
  7. "MIPS Multithreading". Imagination Technologies. Retrieved 4 January 2014.
  8. University of California, Davis. "ECS 142 (Compilers) References & Tools page". Retrieved 28 May 2009.
  9. Rubio, Victor P. "A FPGA Implementation of a MIPS RISC Processor for Computer Architecture Education" (PDF). New Mexico State University. Retrieved 22 December 2011.
  10. "Silicon Graphics Introduces Enhanced MIPS Architecture to Lead the Interactive Digital Revolution". Silicon Graphics, Inc. 21 October 1996.
  11. Gwennap, Linley (18 November 1996). "Digital, MIPS Add Multimedia Extensions". Microprocessor Report. pp. 24–28.
  12. "MIPS Technologies, Inc. Enhances Architecture to Support Growing Need for IP Re-Use and Integration" (Press release). Business Wire. 3 May 1999.
  13. "Latest Release of MIPS Architecture Includes Virtualization and SIMD Key Functionality for Enabling Next Generation of MIPS-Based Products" (Press release). MIPS Technologies. 6 December 2012. Archived from the original on 13 December 2012.
  14. "MIPS skips Release 4 amid bidding war". EE Times. 10 December 2012.
  15. https://imgtec.com/mips/architectures/mips32/
  16. Using the GNU Compiler Collection (GCC): MIPS DSP Built-in Functions
  17. Instruction Set Architecture - LinuxMIPS
  18. Five Methods of Utilizing the MIPS® DSP ASE
  19. "OVP: Fast Simulation, Free Open Source Models. Virtual Platforms for software development". Ovpworld.org. Retrieved 30 May 2012.
  20. "Imperas". Imperas. 3 March 2008. Retrieved 30 May 2012.
  21. "EduMIPS64". Edumips.org. Retrieved 30 May 2012.
  22. "MARS MIPS simulator - Missouri State University". Courses.missouristate.edu. Retrieved 30 May 2012.
  23. http://www.maiconsoft.com.br/webmips/index.asp (online demonstration); http://www.dii.unisi.it/~giorgi/WEBMIPS/ (source)
