Jump to content

RISC-V

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Jeremybennett (talk | contribs) at 16:25, 27 August 2020 (Add reference to updated Unprivileged ISA). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

RISC-V
DesignerUniversity of California, Berkeley
Bits
  • 32
  • 64
  • 128
Introduced2010
Version
  • unprivileged ISA 20191213,[1]
  • privileged ISA 20190608[2]
DesignRISC
TypeLoad-store
EncodingVariable
BranchingCompare-and-branch
EndiannessLittle[1][3]
Page size4 KiB
Extensions
  • M: Multiplication
  • A: Atomic
  • F: Floating point (32-bit)
  • D: FP Double (64-bit)
  • Q: FP Quad (128-bit)
OpenYes, and royalty free
Registers
General-purpose
  • 16
  • 32
(including one always-zero register)
Floating point32 (optional)

RISC-V (pronounced "risk-five"[1]: 1 ) is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. Unlike most other ISA designs, the RISC-V ISA is provided under open source licenses that do not require fees to use. A number of companies are offering or have announced RISC-V hardware, open source operating systems with RISC-V support are available and the instruction set is supported in several popular software toolchains.

Notable features of the RISC-V ISA include a load–store architecture, bit patterns to simplify the multiplexers in a CPU, IEEE 754 floating-point, a design that is architecturally neutral, and placing most-significant bits at a fixed location to speed sign extension.[1] The instruction set is designed for a wide range of uses. It is variable-width and extensible so that more encoding bits can always be added. It supports three word-widths, 32, 64, and 128 bits, and a variety of subsets. The definitions of each subset vary slightly for the three word-widths. The subsets support small embedded systems, personal computers, supercomputers with vector processors, and warehouse-scale 19 inch rack-mounted parallel computers.

The instruction set space for the 128-bit stretched version of the ISA was reserved because 60 years of industry experience has shown that the most unrecoverable error in instruction set design is a lack of memory address space. As of 2016, the 128-bit ISA remains undefined intentionally, because there is yet so little practical experience with such large memory systems.[1] There are proposals to implement variable-width instructions up to 864 bits long, 27 times the usual length.[1][4]

The project began in 2010 at the University of California, Berkeley along with many volunteer contributors not affiliated with the university.[5] Unlike other academic designs which are typically optimized only for simplicity of exposition, the designers intended that the RISC-V instruction set be useable for practical computers.

As of June 2019, version 2.2 of the user-space ISA[1] and version 1.11 of the privileged ISA[2] are frozen, permitting software and hardware development to proceed. The user-space ISA, now renamed the Unprivileged ISA, was updated, ratified and frozen as version 20191213 [6]. A debug specification is available as a draft, version 0.13.2.[2]

Rationale

RISC-V processor prototype, January 2013

CPU design requires design expertise in several specialties: electronic digital logic, compilers, and operating systems. To cover the costs of such a team, commercial vendors of computer designs, such as ARM Holdings and MIPS Technologies charge royalties for the use of their designs, patents and copyrights.[7][8][9] They also often require non-disclosure agreements before releasing documents that describe their designs' detailed advantages. In many cases, they never describe the reasons for their design choices.

RISC-V was started with a goal to make a practical ISA that was open-sourced, usable academically and in any hardware or software design without royalties.[1][10] Also, the rationales for every part of the project are explained, at least broadly. The RISC-V authors are academic but have substantial experience in computer design. The RISC-V ISA is a direct development from a series of academic computer-design projects. It was originated in part to aid such projects.[1][10]

In order to build a large, continuing community of users and therefore accumulate designs and software, the RISC-V ISA designers planned to support a wide variety of practical uses: Small, fast, and low-power real-world implementations,[1][11] without over-architecting for a particular microarchitecture.[1][12][13][14] A need for a large base of contributors is part of the reason why RISC-V was engineered to fit so many uses.

The designers say that the instruction set is the main interface in a computer because it lies between the hardware and the software. If a good instruction set were open, available for use by all, then it should dramatically reduce the cost of software by permitting far more reuse. It should also increase competition among hardware providers, who can use more resources for design and less for software support.[10]

The designers assert that new principles are becoming rare in instruction set design, as the most successful designs of the last forty years have become increasingly similar. Of those that failed, most did so because their sponsoring companies failed commercially, not because the instruction sets were poor technically. So, a well-designed open instruction set designed using well-established principles should attract long-term support by many vendors.[10]

RISC-V also supports the designers' academic uses. The simplicity of the integer subset permits basic student exercises. The integer subset is a simple ISA enabling software to control research machines. The variable-length ISA enables extensions for both student exercises and research.[1] The separated privileged instruction set permits research in operating system support, without redesigning compilers.[15] RISC-V's open intellectual property allows its designs to be published, reused, and modified.[1]

History

The term RISC dates from about 1980.[16] Before this, there was some knowledge that simpler computers could be effective, but the design principles were not widely described. Simple, effective computers have always been of academic interest. Academics created the RISC instruction set DLX for the first edition of Computer Architecture: A Quantitative Approach in 1990. David Patterson was an author, and later assisted RISC-V. DLX was intended for educational use; academics and hobbyists implemented it using field-programmable gate arrays, but it was not a commercial success. ARM CPUs, versions 2 and earlier, had a public-domain instruction set, and it is still supported by the GNU Compiler Collection (GCC), a popular free-software compiler. Three open-source cores exist for this ISA, but they have not been manufactured.[17][18] OpenRISC is an open-source ISA based on DLX, with associated RISC designs. It is fully supported with GCC and Linux implementations. However, it has few commercial implementations.

Krste Asanović at the University of California, Berkeley, found many uses for an open-source computer system. In 2010, he decided to develop and publish one in a "short, three-month project over the summer". The plan was to help both academic and industrial users.[10] David Patterson at Berkeley also aided the effort. He originally identified the properties of Berkeley RISC,[16] and RISC-V is one of his long series of cooperative RISC research projects. At this stage, students inexpensively provided initial software, simulations, and CPU designs.[5]

The RISC-V authors and their institution originally provided the ISA documents and several CPU designs under BSD licenses, which allow derivative works—such as RISC-V chip designs—to be either open and free, or closed and proprietary. The ISA specification itself (i.e., the encoding of the instruction set) was effectively put into the public domain when the ISA tech reports were published, though the actual tech report text (an expression of the specification) was later put under a Creative Commons license to allow it to be improved by external contributors including RISC-V International.

A full history of RISC-V has been published on the RISC-V International website.[19]

RISC-V Foundation and RISC-V International

Commercial users require an ISA to be stable before they can use it in a product that may last many years. To address this issue, the RISC-V Foundation was formed to own, maintain, and publish intellectual property related to RISC-V's definition.[20] The original authors and owners have surrendered their rights to the foundation.[21]

In November 2019, the RISC-V Foundation announced that it would relocate to Switzerland, citing concerns over U.S. trade regulations.[22] As of March 2020, the organization was named RISC-V International, a Swiss nonprofit business association.[23]

As of 2019, RISC-V International freely publishes the documents defining RISC-V and permits unrestricted use of the ISA for design of software and hardware. However, only paying members of RISC-V International can vote to approve changes, or use the trademarked compatibility logo.[21]

Awards

  • 2017: The Linley Group's Analyst's Choice Award for Best Technology (for the instruction set)[24]

Design

ISA base and extensions

RISC-V has a modular design, consisting of alternative base parts, with added optional extensions. The ISA base and its extensions are developed in a collective effort between industry, the research community and educational institutions. The base specifies instructions (and their encoding), control flow, registers (and their sizes), memory and addressing, logic (i.e., integer) manipulation, and ancillaries. The base alone can implement a simplified general-purpose computer, with full software support, including a general-purpose compiler.

The standard extensions are specified to work with all of the standard bases, and with each other without conflict.

Many RISC-V computers might implement the compact extension to reduce power consumption, code size, and memory use.[1] There are also future plans to support hypervisors and virtualization.[15]

Together with a supervisor instruction set extension, S, an RVGC defines all instructions needed to conveniently support a general purpose operating system.

ISA base and extensions
Name Description Version Status[a]
Base
RV32I Base Integer Instruction Set, 32-bit 2.1 Frozen
RV32E Base Integer Instruction Set (embedded), 32-bit, 16 registers 1.9 Open
RV64I Base Integer Instruction Set, 64-bit 2.0 Frozen
RV128I Base Integer Instruction Set, 128-bit 1.7 Open
Extension
M Standard Extension for Integer Multiplication and Division 2.0 Frozen
A Standard Extension for Atomic Instructions 2.0 Frozen
F Standard Extension for Single-Precision Floating-Point 2.0 Frozen
D Standard Extension for Double-Precision Floating-Point 2.0 Frozen
G Shorthand for the base integer set (I) and above extensions
Q Standard Extension for Quad-Precision Floating-Point 2.0 Frozen
L Standard Extension for Decimal Floating-Point 0.0 Open
C Standard Extension for Compressed Instructions 2.0 Frozen
B Standard Extension for Bit Manipulation 0.92 Open
J Standard Extension for Dynamically Translated Languages 0.0 Open
T Standard Extension for Transactional Memory 0.0 Open
P Standard Extension for Packed-SIMD Instructions 0.1 Open
V Standard Extension for Vector Operations 0.8 Open
N Standard Extension for User-Level Interrupts 1.1 Open
H Standard Extension for Hypervisor 0.4 Open
  1. ^ Frozen parts are expected to have their final feature set and to receive only clarifications before being ratified.
32-bit RISC-V instruction formats
Format Bit
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Register/register funct7 rs2 rs1 funct3 rd opcode
Immediate imm[11:0] rs1 funct3 rd opcode
Upper immediate imm[31:12] rd opcode
Store imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
Branch [12] imm[10:5] rs2 rs1 funct3 imm[4:1] [11] opcode
Jump [20] imm[10:1] [11] imm[19:12] rd opcode

  • opcode (7 bits): Partially specifies which of the 6 types of instruction formats.
  • funct7, and funct3 (10 bits): These two fields, further than the opcode field, specify the operation to be performed.
  • rs1 (5 bits): Specifies, by index, the register containing first operand (i.e., source register).
  • rs2 (5 bits): Specifies the second operand register.
  • rd (5 bits): Specifies the destination register to which the computation result will be directed.

To tame the combinations of functionality that may be implemented, a nomenclature is defined to specify them.[1] The instruction set base is specified first, coding for RISC-V, the register bit-width, and the variant; e.g., RV64I or RV32E. Then follows letters specifying implemented extensions in canonical order (as above). The base, extended integer and floating point calculations, and synchronisation primitives for multi-core computing, the base and extensions MAFD, are considered to be necessary for general-purpose computation, and thus have the shorthand, G.

A small 32-bit computer for an embedded system might be RV32EC. A large 64-bit computer might be RV64GC; i.e., shorthand for RV64IMAFDC.

A naming scheme with Zxxx for standard extensions and Yxxx for non-standard (vendor-specific) extensions has been proposed. For example, the Ztso extension for total store ordering, an alternative memory consistency model to weak memory ordering, is under discussion.[25]

Register sets

Register
name
Symbolic
name
Description Owner
32 integer registers
x0 Zero Always zero
x1 ra Return address Caller
x2 sp Stack pointer Callee
x3 gp Global pointer
x4 tp Thread pointer
x5 t0 Temporary / alternate return address Caller
x6–7 t1–2 Temporary Caller
x8 s0/fp Saved register / frame pointer Callee
x9 s1 Saved register Callee
x10–11 a0–1 Function argument / return value Caller
x12–17 a2–7 Function argument Caller
x18–27 s2–11 Saved register Callee
x28–31 t3–6 Temporary Caller
32 floating-point extension registers
f0–7 ft0–7 Floating-point temporaries Caller
f8–9 fs0–1 Floating-point saved registers Callee
f10–11 fa0–1 Floating-point arguments/return values Caller
f12–17 fa2–7 Floating-point arguments Caller
f18–27 fs2–11 Floating-point saved registers Callee
f28–31 ft8–11 Floating-point temporaries Caller

RISC-V has 32 (or 16 in the embedded variant) integer registers, and, when the floating-point extension is implemented, separate 32 floating-point registers. Except for memory access instructions, instructions address only registers.

The first integer register is a zero register, and the remainder are general-purpose registers. A store to the zero register has no effect, and a read always provides 0. Using the zero register as a placeholder makes for a simpler instruction set.

move rx to ry becomes add r0 to rx and store in ry.[1]

Control and status registers exist, but user-mode programs can access only those used for performance measurement and floating-point management.

No instructions exist to save and restore multiple registers. Those were thought to be needless, too complex, and perhaps too slow.[1]

Memory access

Like many RISC designs, RISC-V is a load–store architecture: instructions address only registers, with load and store instructions conveying to and from memory.

Most load and store instructions include a 12-bit offset and two register identifiers. One register is the base register. The other register is the source (for a store) or destination (for a load.)

The offset is added to a base register to get the address. Forming the address as a base register plus offset allows single instructions to access data structures. For example, if the base register points to the top of a stack, single instructions can access a subroutine's local variables in the stack. Likewise the load and store instructions can access a record-style structure or a memory-mapped I/O device. Using the constant zero register as a base address allows single instructions to access memory near address zero.[1]

Memory is addressed as 8-bit bytes, with words being in little-endian order.[1] Words, up to the register size, can be accessed with the load and store instructions.

Accessed memory addresses need not be aligned to their word-width, but accesses to aligned addresses may be faster; for example, simple CPUs may implement unaligned accesses with slow software emulation driven from an alignment failure interrupt.[1]

Like many RISC instruction sets (and some complex instruction set computer (CISC) instruction sets, such as x86 and IBM System/360 families), RISC-V lacks address-modes that write back to the registers. For example, it does not auto-increment.[1]

RISC-V manages memory systems that are shared between CPUs or threads by ensuring a thread of execution always sees its memory operations in the programmed order. But between threads and I/O devices, RISC-V is simplified: It doesn't guarantee the order of memory operations, except by specific instructions, such as fence.

A fence instruction guarantees that the results of predecessor operations are visible to successor operations of other threads or I/O devices. fence can guarantee the order of combinations of both memory and memory-mapped I/O operations. E.g. it can separate memory read and write operations, without affecting I/O operations. Or, if a system can operate I/O devices in parallel with memory, fence doesn't force them to wait for each other. One CPU with one thread may decode fence as nop.

RISC-V is little-endian to resemble other familiar, successful computers, for example, x86. This also reduces a CPU's complexity and costs slightly because it reads all sizes of words in the same order. For example, the RISC-V instruction set decodes starting at the lowest-addressed byte of the instruction. The specification leaves open the possibility of non-standard big-endian or bi-endian systems.[1]

Some RISC CPUs (such as MIPS, PowerPC, DLX, and Berkeley's RISC-I) place 16 bits of offset in the loads and stores. They set the upper 16 bits by a load upper word instruction. This permits upper-halfword values to be set easily, without shifting bits. However, most use of the upper half-word instruction makes 32-bit constants, like addresses. RISC-V uses a SPARC-like combination of 12-bit offsets and 20-bit set upper instructions. The smaller 12-bit offset helps compact, 32-bit load and store instructions select two of 32 registers yet still have enough bits to support RISC-V's variable-length instruction coding.[1]

Immediates

RISC-V handles 32-bit constants and addresses with instructions that set the upper 20 bits of a 32-bit register. Load upper immediate lui loads 20 bits into bits 31 through 12. Then a second instruction such as addi can set the bottom 12 bits.

This method is extended to permit position-independent code by adding an instruction, auipc that generates 20 upper address bits by adding an offset to the program counter and storing the result into a base register. This permits a program to generate 32-bit addresses that are relative to the program counter.

The base register can often be used as-is with the 12-bit offsets of the loads and stores. If needed, addi can set the lower 12 bits of a register. In 64-bit and 128-bit ISAs,lui and auipc sign-extend the result to get the larger address.[1]

Some fast CPUs may interpret combinations of instructions as single fused instructions. lui or auipc may be good candidates to fuse with addi, loads or stores.

Subroutine calls, jumps, and branches

RISC-V's subroutine call jal (jump and link) places its return address in a register. This is faster in many computer designs, because it saves a memory access compared to systems that push a return address directly on a stack in memory. jal has a 20-bit signed (2's complement) offset. The offset is multiplied by 2, then added to the PC to generate a relative address to a 32-bit instruction. If the result is not at a 32-bit address (i.e., evenly divisible by 4), the CPU may force an exception.[1]

RISC-V CPUs jump to calculated addresses using a jump and link-register, jalr instruction. jalr is similar to jal, but gets its destination address by adding a 12-bit offset to a base register. (In contrast,jal adds a larger 20-bit offset to the PC.)

jalr's bit format is like the register-relative loads and stores. Like them, jalr can be used with the instructions that set the upper 20 bits of a base register to make 32-bit branches, either to an absolute address (using lui) or a PC-relative one (using auipc for position-independent code). (Using a constant zero base address allows single-instruction calls to a small (the offset), fixed positive or negative address.)

RISC-V recycles jal and jalr to get unconditional 20-bit PC-relative jumps and unconditional register-based 12-bit jumps. Jumps just make the linkage register 0 so that no return address is saved.[1]

RISC-V also recycles jalr to return from a subroutine: To do this, jalr's base register is set to be the linkage register saved by jal or jalr. jalr's offset is zero and the linkage register is zero, so that there is no offset, and no return address is saved.

Like many RISC designs, in a subroutine call, a RISC-V compiler must use individual instructions to save registers to the stack at the start, and then restore these from the stack on exit. RISC-V has no save multiple or restore multiple register instructions. These were thought to make the CPU too complex, and possibly slow.[26] This can take more code space. Designers planned to reduce code size with library routines to save and restore registers.[27]

RISC-V has no condition code register or carry bit. The designers believed that condition codes make fast CPUs more complex by forcing interactions between instructions in different stages of execution. This choice makes multiple-precision arithmetic more complex. Also, a few numerical tasks need more energy. As a result, predication (the conditional execution of instructions) is not supported. The designers claim that very fast, out-of-order CPU designs do predication anyway, by doing the comparison branch and conditional code in parallel, then discarding the unused path's effects. They also claim that even in simpler CPUs, predication is less valuable than branch prediction, which can prevent most stalls associated with conditional branches. Code without predication is larger, with more branches, but they also claim that a compressed instruction set (such as RISC-V's set C) solves that problem in most cases.[1]

Instead, RISC-V has short branches that perform comparisons: equal, not-equal, less-than, unsigned less-than, greater-than or equal and unsigned greater-than or equal. Ten comparison-branch operations are implemented with only six instructions, by reversing the order of operands in the assembler. For example, branch if greater than can be done by less-than with a reversed order of operands.[1]

The comparing branches have a twelve-bit signed range, and jump relative to the PC.[1]

Unlike some RISC architectures, RISC-V does not include a branch delay slot, a position after a branch instruction that can be filled with an instruction that is executed whether or not the branch is taken. RISC-V omits a branch delay slot because it complicates multicycle CPUs, superscalar CPUs, and long pipelines. Dynamic branch predictors have succeeded well enough to reduce the need for delayed branches.[1]

On the first encounter with a branch, RISC-V CPUs should assume that a negative relative branch (i.e. the sign bit of the offset is "1") will be taken.[1] This assumes that a backward branch is a loop, and provides a default direction so that simple pipelined CPUs can fill their pipeline of instructions. Other than this, RISC-V does not require branch prediction, but core implementations are allowed to add it. RV32I reserves a "HINT" instruction space that presently does not contain any hints on branches.[1]

Arithmetic and logic sets

RISC-V segregates math into a minimal set of integer instructions (set I) with add, subtract, shift, bit-wise logic and comparing-branches. These can simulate most of the other RISC-V instruction sets with software. (The atomic instructions are a notable exception.) RISC-V currently lacks the count leading zero and bit-field operations normally used to speed software floating-point in a pure-integer processor.

The integer multiplication instructions (set M) includes signed and unsigned multiply and divide. Double-precision integer multiplies and divides are included, as multiplies and divides that produce the high word of the result. The ISA document recommends that implementors of CPUs and compilers fuse a standardized sequence of high and low multiply and divide instructions to one operation if possible.[1]

The floating-point instructions (set F) includes single-precision arithmetic and also comparison-branches similar to the integer arithmetic. It requires an additional set of 32 floating-point registers. These are separate from the integer registers. The double-precision floating point instructions (set D) generally assume that the floating-point registers are 64-bit (i.e., double-width), and the F subset is coordinated with the D set. A quad-precision 128-bit floating-point ISA (Q) is also defined. RISC-V computers without floating-point can use a floating-point software library.[1]

RISC-V does not cause exceptions on arithmetic errors, including overflow, underflow, subnormal, and divide by zero. Instead, both integer and floating-point arithmetic produce reasonable default values and set status bits. Divide-by-zero can be discovered by one branch after the division. The status bits can be tested by an operating system or periodic interrupt.[1]

Atomic memory operations

RISC-V supports computers that share memory between multiple CPUs and threads. RISC-V's standard memory consistency model is release consistency. That is, loads and stores may generally be reordered, but some loads may be designated as acquire operations which must precede later memory accesses, and some stores may be designated as release operations which must follow earlier memory accesses.[1]

The base instruction set includes minimal support in the form of a fence instruction to enforce memory ordering. Although this is sufficient (fence r, rw provides acquire and fence rw, w provides release), combined operations can be more efficient.[1]

The atomic memory operation extension supports two types of atomic memory operations for release consistency. First, it provides general purpose load-reserved lr and store-conditional sc instructions. lr performs a load, and tries to reserve that address for its thread. A later store-conditional sc to the reserved address will be performed only if the reservation is not broken by an intervening store from another source. If the store succeeds, a zero is placed in a register. If it failed, a non-zero value indicates that software needs to retry the operation. In either case, the reservation is released.[1]

The second group of atomic instructions perform read-modify-write sequences: a load (which is optionally a load-acquire) to a destination register, then an operation between the loaded value and a source register, then a store of the result (which may optionally be a store-release). Making the memory barriers optional permits combining the operations. The optional operations are enabled by acquire and release bits which are present in every atomic instruction. RISC-V defines nine possible operations: swap (use source register value directly); add; bitwise and, or, and exclusive-or; and signed and unsigned minimum and maximum.[1]

A system design may optimize these combined operations more than lr and sc. For example, if the destination register for a swap is the constant zero, the load may be skipped. If the value stored is unmodified since the load, the store may be skipped.[1]

The IBM System/370 and its successors including z/Architecture, and x86, both implement a compare-and-swap (cas) instruction, which tests and conditionally updates a location in memory: if the location contains an expected old value, cas replaces it with a given new value; it then returns an indication of whether it made the change. However, a simple load-type instruction is usually performed before the cas to fetch the old value. The classic problem is that if a thread reads (loads) a value A, calculates a new value C, and then uses (cas) to replace A with C, it has no way to know whether concurrent activity in another thread has replaced A with some other value B and then restored the A in between. In some algorithms (e.g., ones in which the values in memory are pointers to dynamically allocated blocks), this ABA problem can lead to incorrect results. The most common solution employs a double-wide cas instruction to update both the pointer and an adjacent counter; unfortunately, such an instruction requires a special instruction format to specify multiple registers, performs several reads and writes, and can have complex bus operation.[1]

The lr/sc alternative is more efficient. It usually requires only one memory load, and minimizing slow memory operations is desirable. It's also exact: it controls all accesses to the memory cell, rather than just assuring a bit pattern. However, unlike cas, it can permit livelock, in which two or more threads repeatedly cause each other's instructions to fail. RISC-V guarantees forward progress (no livelock) if the code follows rules on the timing and sequence of instructions: 1) It must use only the I subset. 2) To prevent repetitive cache misses, the code (including the retry loop) must occupy no more than 16 consecutive instructions. 3) It must include no system or fence instructions, or taken backward branches between the lr and sc. 4) The backward branch to the retry loop must be to the original sequence.[1]

The specification gives examples of how to use this subset to lock a data structure.[1]

Compressed subset

The standard RISC-V ISA specifies that all instructions are 32 bits. This makes for a particularly simple implementation, but like other RISC processors with such an instruction encoding, results in larger code size than in other instruction sets.[1][26] To compensate, RISC-V's 32-bit instructions are actually 30 bits; 34 of the opcode space is reserved for an optional (but recommended) variable-length compressed instruction set, RVC, that includes 16-bit instructions. Like ARM's Thumb and the MIPS16, the compressed instructions are simply aliases for a subset of the larger instructions. Unlike ARM's Thumb or the MIPS compressed set, space was reserved from the beginning so there is no separate operating mode. Standard and compressed instructions may be intermixed freely.[1][26] (letter C)[27]

Because (like Thumb-1 and MIPS16) the compressed instructions are simply alternate encodings (aliases) for a selected subset of larger instructions, the compression can be implemented in the assembler, and it is not essential for the compiler to even know about it.

A prototype of RVC was tested in 2011.[26] The prototype code was 20% smaller than an x86 PC and MIPS compressed code, and 2% larger than ARM Thumb-2 code.[26] It also substantially reduced both the needed cache memory and the estimated power use of the memory system.[26]

The researcher intended to reduce the code's binary size for small computers, especially embedded computer systems. The prototype included 33 of the most frequently used instructions, recoded as compact 16-bit formats using operation codes previously reserved for the compressed set.[26] The compression was done in the assembler, with no changes to the compiler. Compressed instructions omitted fields that are often zero, used small immediate values or accessed subsets (16 or 8) of the registers. addi is very common and often compressible.[26]

Much of the difference in size compared to ARM's Thumb set occurred because RISC-V, and the prototype, have no instructions to save and restore multiple registers. Instead, the compiler generated conventional instructions that access the stack. The prototype RVC assembler then often converted these to compressed forms that were half the size. However, this still took more code space than the ARM instructions that save and restore multiple registers. The researcher proposed to modify the compiler to call library routines to save and restore registers. These routines would tend to remain in a code cache and thus run fast, though probably not as fast as a save-multiple instruction.[26]

Standard RVC requires occasional use of 32-bit instructions. Several nonstandard RVC proposals are complete, requiring no 32-bit instructions, and are said to have higher densities than standard RVC.[28][29] Another proposal builds on these, and claims to use less coding range as well.[30]

Embedded subset

An instruction set for the smallest embedded CPUs (set E) is reduced in other ways: Only 16 of the 32 integer registers are supported. Floating-point instructions should not be supported (the specification forbids it as uneconomical), so a floating-point software library must be used.[1] The compressed set C is recommended. The privileged instruction set supports only machine mode, user mode and memory schemes that use base-and-bound address relocation.[15]

Discussion has occurred for a microcontroller profile for RISC-V, to ease development of deeply embedded systems. It centers on faster, simple C-language support for interrupts, simplified security modes and a simplified POSIX application binary interface.[31]

Correspondents have also proposed smaller, non-standard, 16-bit RV16E ISAs: Several serious proposals would use the 16-bit C instructions with 8 × 16-bit registers.[29][28] An April fools' joke proposed a very practical arrangement: Utilize 16 × 16-bit integer registers, with the standard EIMC ISAs (including 32-bit instructions.) The joke was to propose bank switching, when a 32-bit CPU would be clearly superior with the larger address space.[32]

Privileged instruction set

RISC-V's ISA includes a separate privileged instruction set specification. As of August 2019, version 1.11 is ratified by RISC-V International.[2][15]

Version 1.11 of the specification supports several types of computer systems:

  1. Systems that have only machine mode, perhaps for embedded systems,
  2. Systems with both machine mode (for the supervisor) and user-mode to implement operating systems that run the kernel in a privileged mode.
  3. Systems with machine-mode, hypervisors, multiple supervisors, and user-modes under each supervisor.

These correspond roughly to systems with up to four rings of privilege and security, at most: machine, hypervisor, supervisor and user. Each layer also is expected to have a thin layer of standardized supporting software that communicates to a more-privileged layer, or hardware.[15]

The overall plan for this ISA is to make the hypervisor mode orthogonal to the user and supervisor modes.[33] The basic feature is a configuration bit that either permits supervisor-level code to access hypervisor registers, or causes an interrupt on accesses. This bit lets supervisor mode directly handle the hardware needed by a hypervisor. This simplifies a type 2 hypervisor, hosted by an operating system. This is a popular mode to run warehouse-scale computers. To support type 1, unhosted hypervisors, the bit can cause these accesses to interrupt to a hypervisor. The bit simplifies nesting of hypervisors, in which a hypervisor runs under a hypervisor. It's also said to simplify supervisor code by letting the kernel use its own hypervisor features with its own kernel code. As a result, the hypervisor form of the ISA supports five modes: machine, supervisor, user, supervisor-under-hypervisor and user-under-hypervisor.

The privileged instruction set specification explicitly defines hardware threads, or harts. Multiple hardware threads are a common practice in more-capable computers. When one thread is stalled, waiting for memory, others can often proceed. Hardware threads can help make better use of the large number of registers and execution units in fast out-of-order CPUs. Finally, hardware threads can be a simple, powerful way to handle interrupts: No saving or restoring of registers is required, simply executing a different hardware thread. However, the only hardware thread required in a RISC-V computer is thread zero.[15]

The existing control and status register definitions support RISC-V's error and memory exceptions, and a small number of interrupts. For systems with more interrupts, the specification also defines an interrupt controller. Interrupts always start at the highest-privileged machine level, and the control registers of each level have explicit forwarding bits to route interrupts to less-privileged code. For example, the hypervisor need not include software that executes on each interrupt to forward an interrupt to an operating system. Instead, on set-up, it can set bits to forward the interrupt.[15]

Several memory systems are supported in the specification. Physical-only is suited to the simplest embedded systems. There are also three UNIX-style virtual memory systems for memory cached in mass-storage systems. The virtual memory systems have three sizes, with addresses sized 32, 39 and 48 bits. All virtual memory systems support 4 KiB pages, multilevel page-table trees and use very similar algorithms to walk the page table trees. All are designed for either hardware or software page-table walking. To optionally reduce the cost of page table walks, super-sized pages may be leaf pages in higher levels of a system's page table tree. SV32 has a two-layer page table tree and supports 4 MiB superpages. SV39 has a three level page table, and supports 2 MiB superpages and 1 GiB gigapages. SV48 is required to support SV39. It also has a 4-level page table and supports 2 MiB superpages, 1 GiB gigapages, and 512 GiB terapages. Superpages are aligned on the page boundaries for the next-lowest size of page.[15]

Bit manipulation

An unapproved bit-manipulation (B) ISA for RISC-V was under review in January 2020.[clarification needed] Done well, a bit-manipulation subset can aid cryptographic, graphic, and mathematical operations. The criteria for inclusion documented in the draft were compliance with RV5 philosophies and ISA formats, substantial improvements in code density or speed (i.e., at least a 3-for-1 reduction in instructions), and substantial real-world applications, including preexisting compiler support. Version 0.92 includes[34] instructions to count leading zeros, count one bits, perform logic operations with complement, pack two words in one register, take the min or max, sign-extend, single-bit operations, shift ones, rotates, a generalized bit-reverse and shuffle, or-combines, bit-field place and extract, carry-less multiply, CRC instructions, bit-matrix operations (RV64 only), conditional mix, conditional move, funnel shifts, and unsigned address calculations.

Packed SIMD

Packed-SIMD instructions are widely used by commercial CPUs to inexpensively accelerate multimedia and other digital signal processing.[1] For simple, cost-reduced RISC-V systems, the base ISA's specification proposed to use the floating-point registers' bits to perform parallel single instruction, multiple data (SIMD) sub-word arithmetic.

In 2017 a vendor published a more detailed proposal to the mailing list, and this can be cited as version 0.1.[35] As of 2019, the efficiency of this proposed ISA varies from 2x to 5x a base CPU for a variety of DSP codecs.[36] The proposal lacked instruction formats and a license assignment to RISC-V International, but it was reviewed by the mailing list.[35] Some unpopular parts of this proposal were that it added a condition code, the first in a RISC-V design, linked adjacent registers (also a first), and has a loop counter that could be difficult to implement in some microarchitectures.

A previous, well-regarded implementation for a 64-bit CPU was PA-RISC's multimedia instructions: Multimedia Acceleration eXtensions. It increased the CPU's performance on digital signal processing tasks by 48-fold or more, enabling practical real-time video codecs in 1995.[37][38] Besides its native 64-bit math, the PA-RISC MAX2 CPU could do arithmetic on four 16-bit subwords at once, with several overflow methods. It also could move subwords to different positions. PA-RISC's MAX2 was intentionally simplified. It lacked support for 8-bit or 32-bit subwords. The 16-bit subword size was chosen to support most digital signal processing tasks. These instructions were inexpensive to design and build.

Vector set

The proposed vector-processing instruction set may make the packed SIMD set obsolete. The designers hope to have enough flexibility that a CPU can implement vector instructions in a standard processor's registers. This would enable minimal implementations with similar performance to a multimedia ISA, as above. However, a true vector coprocessor could execute the same code with higher performance.[39]

As of 29 June 2015, the vector-processing proposal is a conservative, flexible design of a general-purpose mixed-precision vector processor, suitable to execute compute kernels. Code would port easily to CPUs with differing vector lengths, ideally without recompiling.[39]

In contrast, short-vector SIMD extensions are less convenient. These are used in x86, ARM and PA-RISC. In these, a change in word-width forces a change to the instruction set to expand the vector registers (in the case of x86, from 64-bit MMX registers to 128-bit Streaming SIMD Extensions (SSE), to 256-bit Advanced Vector Extensions (AVX), and AVX-512). The result is a growing instruction set, and a need to port working code to the new instructions.

In the RISC-V vector ISA, rather than fix the vector length in the architecture, an instruction (setvl) is available which takes a requested size and sets the vector length to the minimum of the hardware limit and the requested size. So, the RISC-V proposal is more like a Cray's long-vector design or ARM's Scalable Vector Extension. That is, each vector in up to 32 vectors is the same length.[39]

The application specifies the total vector width it requires, and the processor determines the vector length it can provide with available on-chip resources. This takes the form of an instruction (vsetcfg) with four immediate operands, specifying the number of vector registers of each available width needed. The total must be no more than the addressable limit of 32, but may be less if the application does not require them all. The vector length is limited by the available on-chip storage divided by the number of bytes of storage needed for each entry. (Added hardware limits may also exist, which in turn may permit SIMD-style implementations.)[39]

Outside of vector loops, the application can request zero-vector registers, saving the operating system the work of preserving them on context switches.[39]

The vector length is not only architecturally variable, but designed to vary at run time also. To achieve this flexibility, the instruction set is likely to use variable-width data paths and variable-type operations using polymorphic overloading.[39] The plan is that these can reduce the size and complexity of the ISA and compiler.[39]

Recent experimental vector processors with variable-width data paths also show profitable increases in operations per: second (speed), area (lower cost), and watt (longer battery life).[40]

Unlike a typical modern graphics processing unit, there are no plans to provide special hardware to support branch predication. Instead, lower cost compiler-based predication will be used.[39][41]

External debug system

There is a preliminary specification for RISC-V's hardware-assisted debugger. The debugger will use a transport system such as Joint Test Action Group (JTAG) or Universal Serial Bus (USB) to access debug registers. A standard hardware debug interface may support either a standardized abstract interface or instruction feeding.[42][43]

As of January 2017, the exact form of the abstract interface remains undefined, but proposals include a memory mapped system with standardized addresses for the registers of debug devices or a command register and a data register accessible to the communication system.[42] Correspondents claim that similar systems are used by Freescale's background debug mode interface (BDM) for some CPUs, ARM, OpenRISC, and Aeroflex's LEON.[42]

In instruction feeding, the CPU will process a debug exception to execute individual instructions written to a register. This may be supplemented with a data-passing register and a module to directly access the memory. Instruction feeding lets the debugger access the computer exactly as software would. It also minimizes changes in the CPU, and adapts to many types of CPU. This was said to be especially apt for RISC-V because it is designed explicitly for many types of computers. The data-passing register allows a debugger to write a data-movement loop to RAM, and then execute the loop to move data into or out of the computer at a speed near the maximum speed of the debug system's data channel.[42] Correspondents say that similar systems are used by MIPS Technologies MIPS, Intel Quark, Tensilica's Xtensa, and for Freescale Power ISA CPUs' background debug mode interface (BDM).[42]

A vendor proposed a hardware trace subsystem for standardization, donated a conforming design, and initiated a review.[44][45] The proposal is for a hardware module that can trace code execution on most RV5 CPUs. To reduce the data rate, and permit simpler or less-expensive paths for the trace data, the proposal does not generate trace data that could be calculated from a binary image of the code. It sends only data that indicates "uninferrable" paths through the program, such as which conditional branches are taken. To reduce the data rates, branches that can be calculated, such as unconditional branches, are not traced. The proposed interface between the module and the control unit is a logic signal for each uninferrable type of instruction. Addresses and other data are to be provided in a specialized bus attached to appropriate data sources in a CPU. The data structure sent to an external trace unit is a series of short messages with the needed data. The details of the data channel are intentionally not described in the proposal, because several are likely to make sense.

Implementations

The RISC-V organization maintains a list of RISC-V CPU and SoC implementations.[46]

Existing

Existing commercial implementations include:

  • Alibaba Group, in July 2019 announced the 2.5 GHz 16-core 64-bit (RV64GCV) XuanTie 910 out-of-order processor, the fastest RISC-V processor to date[47]
  • Andes Technology Corporation, a founding member of RISC-V International[48] which joined the consortium in 2016, released its first two RISC-V cores in 2017. The cores, the N25 and NX25, come with a complete design ecosystems and a number of RISC-V partners. Andes is actively driving the development of RISC-V ecosystem and expects to release several new RISC-V products in 2018.
  • CloudBEAR is a processor IP company that develops its own RISC-V cores for a range of applications.[49]
  • Codasip and UltraSoC have developed fully supported intellectual property for RISC-V embedded SOCs that combine Codasip's RISC-V cores and other IP with UltraSoC's debug, optimization and analytics.[50]
  • Cortus, a founding platinum member of the RISC-V foundation, has a number of RISC-V implementations and a complete IDE/toolchain/debug eco-system which it offers for free as part of its SoC design business.
  • GigaDevice has a series of MCUs based on RISC-V (RV32IMAC, GD32V series),[51] with one of them used on the Longan Nano board produced by a Chinese electronic company Sipeed.[52]
  • GreenWaves Technologies announced the availability of GAP8, a 32-bit 1 controller plus 8 compute cores, 32-bit SoC (RV32IMC) and developer board in February 2018. Their GAPuino GAP8 development board started shipping in May 2018.[53][54][55]
  • IAR Systems released the first version of IAR Embedded Workbench for RISC-V, which supports RV32 32-bit RISC-V cores and extensions in the first version. Future releases will include 64-bit support and support for the smaller RV32E base instruction set, as well as functional safety certification and security solutions.
  • Instant SoC RISC-V cores from FPGA Cores. System On Chip, including RISC-V cores, defined by C++.
  • SEGGER added support for RISC-V cores to their debug probe J-Link,[56] their integrated development environment Embedded Studio,[57] and their RTOS embOS and embedded software.[58]
  • SiFive, a company established specifically for developing RISC-V hardware, has processor models released in 2017.[59][60] These include a quad-core, 64-bit (RV64GC) system on a chip (SoC) capable of running general-purpose operating systems such as Linux.[61]
  • Syntacore,[62] a founding member of RISC-V International and one of the first commercial RISC-V IP vendors, develops and licenses family of RISC-V IP since 2015. As of 2018, product line includes eight 32- and 64-bit cores, including open-source SCR1 MCU core (RV32I/E[MC]).[63] First commercial SoCs, based on the Syntacore IP were demonstrated in 2016.[64]
  • UltraSOC proposed a standard trace system and donated an implementation.
  • Western Digital, in December 2018 announced an RV32IMC core called SweRV. The SweRV features an in-order 2-way superscalar and nine-stage pipeline design. WD plans to use SweRV based processors in their flash controllers and SSDs, and released it as open-source to third parties in January 2019.[65][66][67]
  • Espressif[68] added a RISC-V ULP coprocessor to their ESP32-S2 microcontroller.[69]

In development

Open source

There are many open-sourced RISC-V CPU designs, including:

  • The Berkeley CPUs. These are implemented in a unique hardware design language, Chisel, and some are named for famous train engines:
    • 64-bit Rocket.[83] Rocket may suit compact, low-power intermediate computers such as personal devices. Named for Stephenson's Rocket.
    • The 64-bit Berkeley Out of Order Machine (BOOM).[84] The Berkeley Out-of-Order Machine (BOOM) is a synthesizable and parameterizable open source RV64GC RISC-V core written in the Chisel hardware construction language. BOOM uses much of the infrastructure created for Rocket, and may be usable for personal, supercomputer, and warehouse-scale computers.
    • Five 32-bit Sodor CPU designs from Berkeley,[85] designed for student projects.[86] Sodor is the fictional island of trains in childrens' stories about Thomas the Tank Engine.
  • picorv32 by Claire Wolf,[87] a 32-bit microcontroller unit (MCU) class RV32IMC implementation in Verilog.
  • scr1 from Syntacore,[88]a 32-bit microcontroller unit (MCU) class RV32IMC implementation in Verilog.
  • PULPino (Riscy and Zero-Riscy) from ETH Zürich / University of Bologna.[89] The cores in PULPino implement a simple RV32IMC ISA for microcontrollers (Zero-Riscy) or a more powerful RV32IMFC ISA with custom DSP extensions for embedded signal processing.

Software

A normal problem for a new instruction set is a lack of CPU designs and software. Both issues limit its usability and reduce adoption.[10]

The design software includes a design compiler, Chisel,[90] which can reduce the designs to Verilog for use in devices. The website includes verification data for testing core implementations.

Available RISC-V software tools include a GNU Compiler Collection (GCC) toolchain (with GDB, the debugger), an LLVM toolchain, the OVPsim simulator (and library of RISC-V Fast Processor Models), the Spike simulator, and a simulator in QEMU (RV32GC/RV64GC).

Operating system support exists for the Linux kernel, FreeBSD, and NetBSD, but the supervisor-mode instructions were unstandardized prior to June 2019,[15] so this support is provisional. The preliminary FreeBSD port to the RISC-V architecture was upstreamed in February 2016, and shipped in FreeBSD 11.0.[91][73] Ports of Debian[92] and Fedora[93] are stabilizing (both only support 64-bit RISC-V, with no plans to support 32-bit version). A port of Das U-Boot exists.[94] UEFI Spec v2.7 has defined the RISC-V binding and a TianoCore port has been done by HPE engineers[95] and is expected to be upstreamed. There is a preliminary port of the seL4 microkernel.[96][97] Hex Five released the first Secure IoT Stack for RISC-V with FreeRTOS support.[98] Also xv6, a modern reimplementation of Sixth Edition Unix in ANSI C used for pedagogical purposes in MIT, was ported. Pharos RTOS has been ported to 64-bit RISC-V[99] (including time and memory protection). Also see Comparison of real-time operating systems.

A simulator exists to run a RISC-V Linux system on a web browser using JavaScript.[100]

The educational simulator WepSIM[101][102] implements (microprogrammed) a subset of RISC-V instructions (RV32I + M) and allows the execution of subroutines in assembly. Moreover, it is possible to add more RISC-V instructions (by microprogramming these instructions) and test the impact of its implementation. The WepSIM simulator can be used from a Web browser and facilitates learning various aspects of how a CPU works (microprogramming, interruptions, system calls, etc.) using RISC-V assembly.

QEMU supports running (using binary translation) 32- and 64-bit RISC-V systems (i.e. Linux) with a number of emulated or virtualized devices (serial, parallel, USB, network, storage, real time clock, watchdog, audio), as well running RISC-V Linux binaries (translating syscalls to the host kernel). It does support multi-core emulation (SMP).[103]

See also

References

  1. ^ a b c d e f g h i j k l m n o p q r s t u v w x y z aa ab ac ad ae af ag ah ai aj ak al am an ao ap aq ar as at au Waterman, Andrew; Asanović, Krste. "The RISC-V Instruction Set Manual, Volume I: Base User-Level ISA version 2.2". University of California, Berkeley. EECS-2016-118. Retrieved 25 May 2017.
  2. ^ a b c d "Privileged ISA Specification". RISC-V International.
  3. ^ Big and bi-endianness supported as extensions
  4. ^ Wolf, Clifford. "Alternative proposal for instruction length encoding". Cliffords Subversion Servier. Clifford Wolf. Retrieved 20 October 2019.
  5. ^ a b "Contributors". riscv.org. Regents of the University of California. Retrieved 25 August 2014.
  6. ^ Waterman, Andrew; Asanović, Krste. "The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA version 20191213" (PDF). RISC-V International. Retrieved 27 August 2020.
  7. ^ Demerjian, Chuck (7 August 2013). "A long look at how ARM licenses chips: Part 1". SemiAccurate.
  8. ^ Demerjian, Chuck (8 August 2013). "How ARM licenses its IP for production: Part 2". SemiAccurate.
  9. ^ "Wave Computing Closes Its MIPS Open Initiative with Immediate Effect, Zero Warning". 15 November 2019.
  10. ^ a b c d e f Asanović, Krste. "Instruction Sets Should be Free" (PDF). U.C. Berkeley Technical Reports. Regents of the University of California. Retrieved 15 November 2016.
  11. ^ "Rocket Core Generator". RISC-V. Regents of the University of California. Retrieved 1 October 2014.
  12. ^ Celio, Christopher; Love, Eric. "ucb-bar/riscv-sodor". GitHub Inc. Regents of the University of California. Retrieved 12 February 2015.
  13. ^ a b "SHAKTI Processor Program". Indian Institute of Technology Madras. Retrieved 3 September 2019.
  14. ^ Celio, Christopher. "CS 152 Laboratory Exercise 3" (PDF). UC Berkeley. Regents of the University of California. Retrieved 12 February 2015.
  15. ^ a b c d e f g h i Waterman, Andrew; Lee, Yunsup; Avizienas, Rimas; Patterson, David; Asanović, Krste. "Draft Privileged ISA Specification 1.9". RISC-V. RISC-V International. Retrieved 30 August 2016.
  16. ^ a b Patterson, David A.; Ditzel, David R. (October 1980). "The Case for the Reduced Instruction Set Computer". ACM SIGARCH Computer Architecture News. 8 (6): 25. doi:10.1145/641914.641917.
  17. ^ "Amber ARM-compatible core". OpenCores. Retrieved 26 August 2014.
  18. ^ "ARM4U". OpenCores. OpenCores. Retrieved 26 August 2014.
  19. ^ "RISC-V History". Retrieved 19 November 2019.
  20. ^ "A new blueprint for microprocessors challenges the industry's giants". The Economist. 3 October 2019. ISSN 0013-0613. Retrieved 10 November 2019.
  21. ^ a b "RISC-V Foundation". RISC-V Foundation. Retrieved 15 March 2019.
  22. ^ "U.S.-based chip-tech group moving to Switzerland over trade curb fears". Reuters. 26 November 2019. Retrieved 26 November 2019.
  23. ^ "RISC-V History - RISC-V International". RISC-V International. Retrieved 14 May 2020.
  24. ^ "The Linley Group Announces Winners of Annual Analysts' Choice Awards" (Press release). The Linley Group. 12 January 2017. Retrieved 21 January 2018.
  25. ^ Lustig, Dan. Memory Consistency Model Status Update. Youtube. RISC-V International. Retrieved 4 January 2018.
  26. ^ a b c d e f g h i Waterman, Andrew (13 May 2011). Improving Energy Efficiency and Reducing Code Size with RISC-V Compressed. U.C. Berkeley: Regents of the University of California. p. 32. Retrieved 25 August 2014.
  27. ^ a b Waterman, Andrew; et al. "The RISC-V Compressed Instruction Set Manual Version 1.9 (draft)" (PDF). RISC-V. Retrieved 18 July 2016.
  28. ^ a b Brussee, Rogier. "A Complete 16-bit RVC". Google Groups. RISC-V Foundation. Retrieved 18 July 2019.
  29. ^ a b Brussee, Rogier. "Proposal: Xcondensed, [a] ... Compact ... 16 bit standalone G-ISA". RISC-V ISA Mail Server. Google Groups. Retrieved 10 November 2016.
  30. ^ Phung, Xan. "Improved Xcondensed". Google Groups. RISC-V Foundation. Retrieved 18 July 2019.
  31. ^ Ionescu, Liviu. "The RISC-V Microcontroller Profile". Github. Retrieved 5 April 2018.
  32. ^ Barros, Cesar. "Proposal: RV16E". RISC-V ISA Developers (Mailing list). Retrieved 2 April 2018.
  33. ^ Bonzini, Paolo; Waterman, Andrew. "Proposal for Virtualization without H mode". RISC-V ISA Developers (Mailing list). Retrieved 24 February 2017.
  34. ^ Wolf, Clifford. "riscv-bitmanip" (PDF). GitHub. RISC-V Foundation. Retrieved 13 January 2020.
  35. ^ a b "Instruction Summary for a "P" ISA Proposal". Google Groups. ANDES Technologies. Retrieved 13 January 2020.
  36. ^ Andes Technology. "Comprehensive RISC-V Solutions" (PDF). RISC-V Content. RISC-V Foundation. Retrieved 13 January 2020.
  37. ^ Lee, Ruby; Huck, Jerry (25 February 1996). "64-bit and Multimedia Extensions in the PA-RISC 2.0 Architecture". Proceedings of Compcon 96: 152–160. doi:10.1109/CMPCON.1996.501762. ISBN 0-8186-7414-8.
  38. ^ Lee, Ruby B. (April 1995). "Accelerating Multimedia with Enhanced Microprocessors" (PDF). IEEE Micro. 15 (2): 22–32. CiteSeerX 10.1.1.74.1688. doi:10.1109/40.372347. Retrieved 21 September 2014.
  39. ^ a b c d e f g h Schmidt, Colin; Ou, Albert; Lee, Yunsup; Asanović, Krste. "RISC-V Vector Extension Proposal" (PDF). RISC-V. Regents of the University of California. Retrieved 14 March 2016.
  40. ^ Ou, Albert; Nguyen, Quan; Lee, Yunsup; Asanović, Krste. "A Case for MVPs: Mixed-Precision Vector Processors" (PDF). UC Berkeley EECS. Regents of the University of California. Retrieved 14 March 2016.
  41. ^ Lee, Yunsup; Grover, Vinod; Krashinsky, Ronny; Stephenson, Mark; Keckler, Stephen W.; Asanović, Krste. "Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures" (PDF). Berkeley's EECS Site. Regents of the University of California. Retrieved 14 March 2016.
  42. ^ a b c d e Bradbury, Alex; Wallentowitz, Stefan. "RISC-V Run Control Debug". Google Docs. RISC-V Foundation. Retrieved 20 January 2017.
  43. ^ Newsome, Tim. "RISC-V Debug Group > poll results". Google Groups, RISC-V Debug Group. RISC-V Foundation. Retrieved 20 January 2017.
  44. ^ McGooganus. "riscv-trace-spec". GitHub. Retrieved 13 January 2020.
  45. ^ Dahad, Nitin. "UltraSoC Tackles RISC-V Support Challenge by Donating Trace Encoder". EE Times. Aspencore. Retrieved 13 January 2020.
  46. ^ "RISC-V Cores and SoC Overview". RISC-V. 25 September 2019. Retrieved 5 October 2019.
  47. ^ "China's Alibaba is making a 16-core, 2.5 GHz RISC-V processor". www.techspot.com. Retrieved 30 July 2019.
  48. ^ "Andes Technology". RISC-V International. Retrieved 10 July 2018.
  49. ^ "CloudBEAR". Retrieved 16 October 2018.
  50. ^ Manners, David (23 November 2016). "Codasip and UltraSoC Combine on RISC-V". Electronics Weekly. Metropolis International Group, Ltd. Retrieved 23 November 2016.
  51. ^ "GigaDevice Unveils The GD32V Series With RISC-V Core in a Brand New 32-bit General Purpose Microcontroller". www.gigadevice.com. Retrieved 29 August 2019.
  52. ^ "Sipeed Longan Nano - RISC-V GD32VF103CBT6 Development Board". www.seeedstudio.com. Retrieved 29 August 2019.
  53. ^ "GreenWaves GAP8 is a Low Power RISC-V IoT Processor Optimized for Artificial Intelligence Applications". CNXSoft: Embedded Systems News. 27 February 2018. Retrieved 4 March 2018.
  54. ^ Yoshida, Junko (26 February 2018). "AI Comes to Sensing Devices". EE Times. Retrieved 10 July 2018.
  55. ^ "GreenWaves Technologies Announces Availability of GAP8 Software Development Kit and GAPuino Development Board" (Press release). 22 May 2018.
  56. ^ "SEGGER Adds Support for SiFive's Coreplex IP to Its Industry Leading J-Link Debug Probe". Retrieved 19 September 2017.
  57. ^ "PR: SEGGER Embedded Studio supports RISC-V architecture". Retrieved 23 November 2017.
  58. ^ "PR: SEGGER presents RTOS, stacks, middleware for RISC-V". Retrieved 8 December 2017.
  59. ^ "HiFive1". SiFive. Retrieved 10 July 2018.
  60. ^ SiFive. "Hi-Five1: Open-source Arduino-Compatible Development Kit". Crowd Supply. Retrieved 2 December 2016.
  61. ^ "FU540 SoC CPU". SiFive. Retrieved 24 October 2018.
  62. ^ "Syntacore". Retrieved 11 December 2018.
  63. ^ "SCR1, open-source RISC-V core". Retrieved 11 December 2018.
  64. ^ "RISC-V workshop proceedings". 11 December 2016. Retrieved 11 December 2018.
  65. ^ Shilov, Anton. "Western Digital Reveals SweRV RISC-V Core, Cache Coherency over Ethernet Initiative". www.anandtech.com. Retrieved 23 May 2019.
  66. ^ "Western Digital Releases SweRV RISC-V Core Source Code". AB Open. 28 January 2019. Archived from the original on 21 May 2019.
  67. ^ Cores-SweRV on GitHub
  68. ^ "Espressif". Retrieved 9 June 2020.
  69. ^ "ESP32-S2" (PDF). Retrieved 9 June 2020.
  70. ^ Ashenden, Peter (9 November 2016). "Re: [isa-dev] RISC V ISA for embedded systems". RISC-V ISA Developers (Mailing list). Retrieved 10 November 2016. At ASTC (www.astc-design.com), we have an implementation of RV32EC as a synthesizable IP core intended for small embedded applications, such as smart sensors and IoT.
  71. ^ "C-DAC announces Tech Conclave 2019". The Times of India. Retrieved 12 April 2019.
  72. ^ "NOEL-V Processor". Cobham Gaisler. Retrieved 14 January 2020.
  73. ^ a b "FreeBSD Foundation: Initial FreeBSD RISC-V Architecture Port Committed". 4 February 2016.
  74. ^ "Esperanto exits stealth mode, aims at AI with a 4,096 core 7nm RISC-V monster". wikichip.org. January 2018. Retrieved 2 January 2018.
  75. ^ "PULPino GitHub project". GitHub. Retrieved 2 February 2018.
  76. ^ "PULP Platform". PULP Platform. Retrieved 2 February 2018.
  77. ^ "Accelerator Stream". European Processor Initiative (EPI). Retrieved 22 February 2020.
  78. ^ Redmond, Calista (20 August 2019). "How the European Processor Initiative is Leveraging RISC-V for the Future of Supercomputing". RISC-V International News. RISC-V International.
  79. ^ "IIT Madras Open Source Processor Project". Rapid IO. IIT Madras. Retrieved 13 September 2014.
  80. ^ "lowRISC website". Retrieved 10 May 2015.
  81. ^ Xie, Joe (July 2016). NVIDIA RISC V Evaluation Story. 4th RISC-V Workshop. Youtube.
  82. ^ Andrei Frumusanu (30 October 2019). "SiFive Announces First RISC-V OoO CPU Core: The U8-Series Processor IP". Anandtech.
  83. ^ Asanović, Krste; et al. "rocket-chip". GitHub. RISC-V International. Retrieved 11 November 2016.
  84. ^ Celio, Christopher. "riscv-boom". GitHub. Regents of the University of California. Retrieved 29 March 2020.
  85. ^ Celio, Christopher. "riscv-sodor". GitHub. Regents of the University of California. Retrieved 11 November 2016.
  86. ^ Celio, Chris. "ucb-bar/riscv-sodor". github. Regents of the University of California. Retrieved 25 October 2019.
  87. ^ Wolf, Claire. "picorv32". GitHub. Retrieved 27 February 2020.
  88. ^ "scr1". GitHub. Syntacore. Retrieved 13 January 2020.
  89. ^ Traber, Andreas; et al. "PULP: Parallel Ultra Low Power". ETH Zurich, University of Bologna. Retrieved 5 August 2016.
  90. ^ "Chisel: Constructing Hardware in a Scala Embedded Language". UC Berkeley. Regents of the University of California. Retrieved 12 February 2015.
  91. ^ "riscv - FreeBSD Wiki". wiki.freebsd.org.
  92. ^ Montezelo, Manuel. "Debian GNU/Linux port for RISC-V 64". Google Groups. Retrieved 19 July 2018.
  93. ^ "Architectures/RISC-V". Fedora Wiki. Red Hat. Retrieved 26 September 2016.
  94. ^ Begari, Padmarao. "U-Boot port on RISC-V 32-bit is available". Google Groups. Microsemi. Retrieved 15 February 2017.
  95. ^ RiscVEdk2 on GitHub
  96. ^ Almatary, Hesham. "RISC-V, seL4". seL4 Documentation. Commonwealth Scientific and Industrial Research Organisation (CSIRO). Retrieved 13 July 2018.
  97. ^ Almatary, Hesham. "heshamelmatary". GitHub. Retrieved 13 July 2018.
  98. ^ "MultiZone Secure IoT Stack, the First Secure IoT Stack for RISC-V". Hex Five Security. Hex Five Security, Inc. 22 February 2019. Retrieved 3 March 2019.
  99. ^ "Pharos". SourceForge. Retrieved 1 April 2020.
  100. ^ "ANGEL is a Javascript RISC-V ISA (RV64) Simulator that runs riscv-linux with BusyBox". RISCV.org. Archived from the original on 11 November 2018. Retrieved 17 January 2019.
  101. ^ WepSIM with RISC-V_im example: https://acaldero.github.io/wepsim/ws_dist/wepsim-classic.html?mode=ep&example=36&simulator=assembly:registers&notify=false
  102. ^ WepSIM source code in GitHub: https://github.com/wepsim/wepsim
  103. ^ "Documentation/Platforms/RISCV - QEMU". wiki.qemu.org. Retrieved 7 May 2020.

Further reading