Comparison of instruction set architectures: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Line 501: Line 501:
| ARM Cortex-A57
| ARM Cortex-A57
|
|
| Deeply out-of-order, wide multi-issue, 3-way superscalar
| Deeply out-of-order, wide multi-issue, 3-way superscalar,[[RISC]]
|-
|-
| [[AVR32#AP7 Core|AVR32 AP7]]
| [[AVR32#AP7 Core|AVR32 AP7]]
Line 522: Line 522:
|
|
|Shared multithreaded L2 cache, multithreading, multicore, around 20 stage long pipeline,integrated memory controller,out-of-order,superscalar,up to 16mb lv2 cache,up to 16mb lv3 cache,Virtualization,FlexFPU which use [[simultaneous multithreading]]<ref>http://cdn3.wccftech.com/wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer.jpg</ref>
|Shared multithreaded L2 cache, multithreading, multicore, around 20 stage long pipeline,integrated memory controller,out-of-order,superscalar,up to 16mb lv2 cache,up to 16mb lv3 cache,Virtualization,FlexFPU which use [[simultaneous multithreading]]<ref>http://cdn3.wccftech.com/wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer.jpg</ref>
,up to 16 cores per chip,up to 5Ghz clock speed,up to 220w TDP,Turbo Core.
,up to 16 cores per chip,up to 5Ghz clock speed,up to 220w TDP,Turbo Core,[[CISC]]
|-
|-
| [[Transmeta Crusoe|Crusoe]]
| [[Transmeta Crusoe|Crusoe]]
Line 586: Line 586:
|[[Intel]] [[NetBurst (microarchitecture)|NetBurst]] ([[Pentium 4#Willamette|Willamette]])
|[[Intel]] [[NetBurst (microarchitecture)|NetBurst]] ([[Pentium 4#Willamette|Willamette]])
| 20
| 20
| 2-way [[Simultaneous multithreading]][[(Hyperthreading)]],Rapid Execution Engine,Execution Trace Cache,quad-pumped Front-Side Bus,Hyper-pipelined Technology,superscalar,out-of order
| 2-way [[Simultaneous multithreading]][[(Hyperthreading)]],Rapid Execution Engine,Execution Trace Cache,quad-pumped Front-Side Bus,Hyper-pipelined Technology,superscalar,out-of order,[[CISC]]
|-
|-
| NetBurst ([[Pentium 4#Northwood|Northwood]])
| NetBurst ([[Pentium 4#Northwood|Northwood]])
Line 606: Line 606:
| [[Intel Atom]]
| [[Intel Atom]]
| 16
| 16
|2-way Simultaneous multithreading, in-order. No instruction reordering, speculative execution, or register renaming.
|2-way Simultaneous multithreading, in-order. No instruction reordering, speculative execution, or register renaming,CISC
|-
|-
|[[Intel Atom]] Oak Trail
|[[Intel Atom]] Oak Trail
Line 626: Line 626:
|[[Intel]] [[Haswell (microarchitecture)|Haswell]]
|[[Intel]] [[Haswell (microarchitecture)|Haswell]]
| 14
| 14
| Multicore,multithreading,2-way simultaneous multithreading,[[Transactional memory]](in selected models),LV4 Cache(in GT3 models),Turbo Boost,out-of-order,superscalar,up to 8mb lv3 cache(mainstream),up to 20mb lv3 cache(Extreme),Virtualization
| Multicore,multithreading,2-way simultaneous multithreading,[[Transactional memory]](in selected models),LV4 Cache(in GT3 models),Turbo Boost,out-of-order,superscalar,up to 8mb lv3 cache(mainstream),up to 20mb lv3 cache(Extreme),Virtualization,[CISC]]
|-
|-
|[[Intel]] [[Xeon Phi]] 7120x
|[[Intel]] [[Xeon Phi]] 7120x
|
|
| Multicore,[[multithreading]],4 hardware based simultaneous threads per core which cant be disabled unlike regular [[Hyperthreading]],61 cores per chip,244 threads per chip,30.5mb lv2 cache,300w TDP ,Turbo Boost,in-order
| Multicore,[[multithreading]],4 hardware based simultaneous threads per core which cant be disabled unlike regular [[Hyperthreading]],61 cores per chip,244 threads per chip,30.5mb lv2 cache,300w TDP ,Turbo Boost,in-order,[[CISC]]
|-
|-
| [[LatticeMico32]]
| [[LatticeMico32]]
Line 654: Line 654:
|[[IBM]] [[POWER6]]
|[[IBM]] [[POWER6]]
|
|
| 2-way simultaneous multithreading, in-order execution,up to 5ghz
| 2-way simultaneous multithreading, in-order execution,up to 5ghz,[[RISC]]
|-
|-
|[[IBM]] [[POWER7+]]
|[[IBM]] [[POWER7+]]
|
|
| multicore,multithreading,out-of-order,superscalar,4 intelligent simultaneous threads per core, 12 execution units per core,8 cores per chip,80mb lv3 cache,Virtualization,true hardware entropy generator,hardware-assisted cryptographic acceleration,Fixed-point unit,Decimal-Fixed Unit,Decimal Floating-Point Unit
| multicore,multithreading,out-of-order,superscalar,4 intelligent simultaneous threads per core, 12 execution units per core,8 cores per chip,80mb lv3 cache,Virtualization,true hardware entropy generator,hardware-assisted cryptographic acceleration,Fixed-point unit,Decimal-Fixed Unit,Decimal Floating-Point Unit,[[RISC]]
|-
|-
| [[PowerPC 400#PowerPC 401|PowerPC 401]]
| [[PowerPC 400#PowerPC 401|PowerPC 401]]
Line 774: Line 774:
|[[Oracle Corporation]] [[SPARC T5]]
|[[Oracle Corporation]] [[SPARC T5]]
| 16
| 16
| Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously,2-way [[simultaneous multithreading]],16 cores per chip,out-of-order,16-way associative shared 8mb lv3 cache,hardware-assisted cryptographic acceleration,Stream-Processing unit,out-of order execution,Virtualization
| Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously,2-way [[simultaneous multithreading]],16 cores per chip,out-of-order,16-way associative shared 8mb lv3 cache,hardware-assisted cryptographic acceleration,Stream-Processing unit,out-of order execution,Virtualization,[[RISC]]
|-
|-
|Oracle [[Sparc M5]]
|Oracle [[Sparc M5]]
Line 782: Line 782:
|[[Fujitsu]] Sparc64 X
|[[Fujitsu]] Sparc64 X
|
|
|Multithreading,multicore,2-way SMT,16 cores per chip,out-of order,24mb lv2 cache,out-of order,Virtualization
|Multithreading,multicore,2-way SMT,16 cores per chip,out-of order,24mb lv2 cache,out-of order,Virtualization,[[RISC]]
|-
|-
|[[Imagination Technologies]] Mips Warrior
|[[Imagination Technologies]] Mips Warrior

Revision as of 12:58, 23 July 2013

Factors

Bits

Computer architectures are often described as n-bit architectures. Today n is often 8, 16, 32, or 64, but other sizes have been used. This is a considerable simplification. An architecture often has a few more or less "natural" data sizes in its instruction set, but the hardware implementations of these may differ widely. Many architectures have instructions that operate on half and/or twice the width of the processor's major internal datapaths; examples include the 8080, Z80, and MC68000, among many others. On implementations of this type, an operation that is twice as wide typically also takes around twice as many clock cycles (which is not the case on high-performance implementations). On the 68000, for instance, this means 8 instead of 4 clock ticks, and this particular chip may be described as a 32-bit architecture with a 16-bit implementation. The external data bus width is often not a useful guide to the width of the architecture; the NS32008, NS32016 and NS32032 were basically the same 32-bit chip with different external data buses. The NS32764 had a 64-bit bus but used 32-bit registers.
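
As a rough illustration of why a double-width operation costs about twice as much on a narrow datapath, the following Python sketch performs a 32-bit addition as two passes through a 16-bit adder with carry. The function names and the pass count are illustrative assumptions; this is not a cycle-accurate model of the 68000 or any other chip.

```python
MASK16 = 0xFFFF

def add16(a, b, carry_in=0):
    """One pass through a hypothetical 16-bit adder: returns (sum, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16

def add32_on_16bit_datapath(a, b):
    """Add two 32-bit values using two 16-bit adder passes.

    Returns (32-bit result, carry out, number of adder passes).
    """
    low, carry = add16(a, b)                      # low halves first
    high, carry = add16(a >> 16, b >> 16, carry)  # high halves plus carry
    return (high << 16) | low, carry, 2

result, carry, passes = add32_on_16bit_datapath(0x0001FFFF, 0x00000001)
print(hex(result), carry, passes)  # prints: 0x20000 0 2
```

The carry from the low half has to be known before the high half can be added, which is why the two passes are sequential on such an implementation.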

The width of addresses may or may not differ from the width of data. Early 32-bit microprocessors often had a 24-bit address, as did the System/360 processors.
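
To make the consequence of a narrower address concrete, this small Python sketch computes the address space for a given address width and shows what happens to a 32-bit register value on a design that simply ignores the upper bits. Both helper names are hypothetical, and the truncation behaviour is an assumption for illustration rather than a description of any specific processor.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits

def effective_address(register_value, address_bits=24):
    """Keep only the low address_bits bits (assumes upper bits are ignored)."""
    return register_value & ((1 << address_bits) - 1)

print(addressable_bytes(24) // (1024 * 1024), "MiB")  # prints: 16 MiB
print(hex(effective_address(0xFF001234)))             # prints: 0x1234
```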

Operands

The number of operands is one of the factors that may give an indication of the performance of the instruction set. A three-operand architecture will allow

A := B + C

to be computed in one instruction.

A two-operand architecture will allow

A := A + B

to be computed in one instruction, so two instructions must be executed to simulate a single three-operand instruction:

A := B
A := A + C
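
The rewriting above can be sketched as a small Python function that lowers a three-operand addition to two-operand form. The function name and the textual instruction format are assumptions made for this example; the shortcut for a destination that matches the second source relies on addition being commutative and would not apply to subtraction.

```python
def lower_add(dest, src1, src2):
    """Lower 'dest := src1 + src2' to a list of two-operand instructions."""
    if dest == src1:
        # Already in two-operand form.
        return [f"{dest} := {dest} + {src2}"]
    if dest == src2:
        # Addition is commutative, so the operands can be swapped.
        return [f"{dest} := {dest} + {src1}"]
    # General case: copy the first source, then add the second.
    return [f"{dest} := {src1}", f"{dest} := {dest} + {src2}"]

for instruction in lower_add("A", "B", "C"):
    print(instruction)
# prints:
# A := B
# A := A + C
```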

Endianness

An architecture may use "big" or "little" endianness, or both, or be configurable to use either. Little-endian processors store the least significant byte of a multi-byte value at the lowest-numbered memory address. Big-endian architectures instead store the most significant byte at the lowest-numbered address. The x86 architecture and several 8-bit architectures are little endian, and ARM is most commonly run in little-endian mode. Most RISC architectures (SPARC, Power, PowerPC, MIPS) were originally big endian, but many of them, like ARM, are now configurable.
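
The two byte orders can be seen directly with Python's built-in `int.to_bytes`. The value `0x12345678` is an arbitrary example; the leftmost byte in each printed string is the one that would sit at the lowest memory address.

```python
value = 0x12345678

little = value.to_bytes(4, byteorder="little")  # least significant byte first
big = value.to_bytes(4, byteorder="big")        # most significant byte first

print(little.hex())  # prints: 78563412
print(big.hex())     # prints: 12345678

# Reading the bytes back with the wrong byte order gives a different value.
print(hex(int.from_bytes(little, byteorder="big")))  # prints: 0x78563412
```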

Architectures

The table below compares basic information about CPU architectures.

Architecture Bits Version Introduced Max # Operands Type Design Registers Instruction encoding Branch Evaluation Endianness Extensions Open Royalty-free
Alpha 64 1992 3 Register-Register RISC 32 Fixed Condition register Bi Motion Video Instructions, Byte-Word Extensions, Floating-point Extensions, Count Extensions No Unknown
ARM 32 ARMv7 and earlier 1983 3 Register-Register RISC 16 Fixed (32-bit), Thumb: Fixed (16-bit), Thumb-2: Variable (16 and 32-bit) Condition code Bi NEON, Jazelle, Vector Floating Point, TrustZone, LPAE Unknown No
ARM 64 ARMv8[1] 2011[2] 3 Register-Register RISC 31 Fixed (32-bit), Thumb: Fixed (16-bit), Thumb-2: Variable (16 and 32-bit), A64 Condition code Bi NEON, Jazelle, Vector Floating Point, TrustZone Unknown No
AVR32 32 Rev 2 2006 2-3 RISC 15 Variable[3] Big Java Virtual Machine Unknown Unknown
Blackfin 32 2000 RISC[4] 8 Little[5] Unknown Unknown
DLX 32 1990 3 RISC 32 Fixed (32-bit) Big Unknown Unknown
eSi-RISC 16/32 2009 3 Register-Register RISC 8-72 Variable (16- or 32-bit) Compare and branch and condition register Bi User-defined instructions No No
Itanium (IA-64) 64 2001 Register-Register EPIC 128 Condition register Bi (selectable) Intel Virtualization Technology Yes Yes
M32R 32 1997 RISC 16 Fixed (16- or 32-bit) Bi Unknown Unknown
MC68K 32 1979 2 Register-Memory CISC 16 Variable Condition register Big Unknown Unknown
Mico32 32 2006 3 Register-Register RISC 32[6] Fixed (32-bit) Compare and branch Big User-defined instructions Yes[7] Yes
MIPS 64 (32→64) 5 1981 3 Register-Register RISC 32 Fixed (32-bit) Condition register Bi MDMX, MIPS-3D Unknown No
MMIX 64 1999 3 Register-Register RISC 256 Fixed (32-bit) Big Yes Yes
PA-RISC (HP/PA) 64 (32→64) 2.0 1986 3 RISC 32 Fixed Compare and branch Big Multimedia Acceleration eXtensions (MAX), MAX-2 No Unknown
PowerPC 32/64 (32→64) 2.06[8] 1991 3 Register-Register RISC 32 Fixed, Variable Condition code Big/Bi AltiVec, APU, VSX, Cell Yes[9] No
S+core 16/32 2005 RISC Little Unknown Unknown
Series 32000 32 1982 5 Memory-Memory CISC 8 Variable Huffman coded, up to 23 bytes long Condition code Little BitBlt instructions Unknown Unknown
SPARC 64 (32→64) V9 1985 3 Register-Register RISC 31 (of at least 55) Fixed Condition code Big → Bi VIS 1.0, 2.0, 3.0 Yes Yes[10]
SuperH (SH) 32 1990s 2 Register-Register/Register-Memory RISC 16 Fixed Condition code (single bit) Bi Unknown Unknown
System/360 / System/370 / z/Architecture 64 (32→64) 3 1964 Register-Memory/Memory-Memory CISC 16 Fixed Condition code Big Unknown Unknown
VAX 32 1977 6 Memory-Memory CISC 16 Variable Compare and branch Little VAX Vector Architecture Unknown Unknown
x86 32 (16→32) 1978 2 Register-Memory CISC 8 Variable Condition code Little MMX, 3DNow!, SSE, PAE No No
x86-64 64 2003 2 Register-Memory CISC 16 Variable Condition code Little MMX, 3DNow!, PAE, AVX No No

Microarchitectures

The following table compares specific microarchitectures.

Microarchitecture Pipeline stages Misc
AMD K5 Out-of-order execution, register renaming, speculative execution
AMD K6 Superscalar, branch prediction
AMD K6-III Branch prediction, speculative execution, out-of-order execution[11]
AMD K7 Out-of-order execution, branch prediction, Harvard architecture
AMD K8 64-bit, integrated memory controller, 16 byte instruction prefetching
AMD K10 Superscalar, out-of-order execution, 32-way set associative L3 victim cache, 32-byte instruction prefetching
ARM7TDMI(-S) 3
ARM7EJ-S 5
ARM810 5
ARM9TDMI 5
ARM1020E 6
XScale PXA210/PXA250 7
ARM1136J(F)-S 8
ARM1156T2(F)-S 9
ARM Cortex-A5 8 Single issue, in-order
ARM Cortex-A7 MPCore 8 Partial dual-issue, in-order
ARM Cortex-A8 13 Dual-issue
ARM Cortex-A9 MPCore 8-11 Out-of-order, speculative issue, superscalar
ARM Cortex-A15 MPCore 15 Multicore (up to 16), out-of-order, speculative issue, 3-way superscalar
ARM Cortex-A53 Partial dual-issue, in-order
ARM Cortex-A57 Deeply out-of-order, wide multi-issue, 3-way superscalar, RISC
AVR32 AP7 7
AVR32 UC3 3 Harvard architecture
Bobcat Out-of-order execution
Bulldozer Shared multithreaded L2 cache, multithreading, multicore, around 20-stage pipeline, integrated memory controller, out-of-order, superscalar, up to 16 cores per chip, up to 16 MB L3 cache, virtualization, Turbo Core, FlexFPU which uses simultaneous multithreading[12]
Piledriver Shared multithreaded L2 cache, multithreading, multicore, around 20-stage pipeline, integrated memory controller, out-of-order, superscalar, up to 16 MB L2 cache, up to 16 MB L3 cache, virtualization, FlexFPU which uses simultaneous multithreading,[13] up to 16 cores per chip, up to 5 GHz clock speed, up to 220 W TDP, Turbo Core, CISC
Crusoe In-order execution, 128-bit VLIW, integrated memory controller
Efficeon In-order execution, 256-bit VLIW, fully integrated memory controller
Cyrix Cx5x86 6[14] Branch prediction
Cyrix 6x86 Superscalar, superpipelined, register renaming, speculative execution, out-of-order execution
DLX 5
eSi-3200 5 In-order, speculative issue
eSi-3250 5 In-order, speculative issue
EV4 (Alpha 21064) Superscalar
EV7 (Alpha 21364) Superscalar design with out-of-order execution, branch prediction, 4-way SMT, integrated memory controller
EV8 (Alpha 21464) Superscalar design with out-of-order execution
P5 (Pentium) 5 Superscalar
P6 (Pentium Pro) 14 Speculative execution, Register renaming, superscalar design with out-of-order execution
P6 (Pentium II) Branch prediction
P6 (Pentium III) 10
Intel Itanium 8[15] Speculative execution, branch prediction, register renaming, 30 execution units, multithreading, multicore, coarse-grained multithreading, 2-way simultaneous multithreading, Turbo Boost, virtualization, VLIW
Intel NetBurst (Willamette) 20 2-way simultaneous multithreading (Hyperthreading), Rapid Execution Engine, Execution Trace Cache, quad-pumped front-side bus, Hyper-pipelined Technology, superscalar, out-of-order, CISC
NetBurst (Northwood) 20 2-way Simultaneous multithreading
NetBurst (Prescott) 31 2-way Simultaneous multithreading
NetBurst (Cedar Mill) 31 2-way Simultaneous multithreading
Core 14 Multicore, out-of-order, superscalar
Intel Atom 16 2-way simultaneous multithreading, in-order. No instruction reordering, speculative execution, or register renaming. CISC
Intel Atom Oak Trail 2-way simultaneous multithreading, in-order, Burst mode, 512 KB L2 cache
Intel Atom Silvermont Out-of-order execution
Nehalem 2-way simultaneous multithreading, out-of-order, superscalar, integrated memory controller, L1/L2/L3 cache, Turbo Boost
Sandy Bridge 2-way simultaneous multithreading (2 threads per core), multicore, integrated memory controller, L1/L2/L3 cache, Turbo Boost
Intel Haswell 14 Multicore, multithreading, 2-way simultaneous multithreading, transactional memory (in selected models), L4 cache (in GT3 models), Turbo Boost, out-of-order, superscalar, up to 8 MB L3 cache (mainstream), up to 20 MB L3 cache (Extreme), virtualization, CISC
Intel Xeon Phi 7120x Multicore, multithreading, 4 hardware-based simultaneous threads per core which, unlike regular Hyperthreading, cannot be disabled, 61 cores per chip, 244 threads per chip, 30.5 MB L2 cache, 300 W TDP, Turbo Boost, in-order, CISC
LatticeMico32 6 Harvard architecture
POWER1 Superscalar, out-of-order execution
POWER3 Superscalar, out-of-order execution
POWER4 Superscalar, speculative execution, out-of-order execution
POWER5 2-way Simultaneous multithreading, out-of-order execution, integrated memory controller
IBM POWER6 2-way simultaneous multithreading, in-order execution, up to 5 GHz, RISC
IBM POWER7+ Multicore, multithreading, out-of-order, superscalar, 4 intelligent simultaneous threads per core, 12 execution units per core, 8 cores per chip, 80 MB L3 cache, virtualization, true hardware entropy generator, hardware-assisted cryptographic acceleration, fixed-point unit, decimal fixed-point unit, decimal floating-point unit, RISC
PowerPC 401 3
PowerPC 405 5
PowerPC 440 7
PowerPC 470 9 SMP
PowerPC A2 15
PowerPC e300 4 Superscalar, Branch prediction
PowerPC e500 Dual 7 stage Multicore
PowerPC e600 3-issue 7 stage Superscalar out-of-order execution, branch prediction
PowerPC e5500 4-issue 7 stage Out-of-order, multicore
PowerPC e6500 multicore
PowerPC 603 4 5 execution units, branch prediction. No SMP.
PowerPC 603q 5 In-order
PowerPC 604 6 Superscalar, out-of-order execution, 6 execution units. SMP support.
PowerPC 620 5 Out-of-order execution. SMP support.
PWRficient Superscalar, out-of-order execution, 6 execution units
R4000 8 Scalar
StrongARM SA-110 5 Scalar, in-order
SuperH SH2 5
SuperH SH2A 5 Superscalar, Harvard architecture
SPARC Superscalar
HyperSPARC Superscalar
SuperSPARC Superscalar, in-order
SPARC64 VI/VII/VII+ Superscalar, out-of-order[16]
UltraSPARC 9
UltraSPARC T1 6 Open source, multithreading, multi-core, 4 threads per core, integrated memory controller
UltraSPARC T2 8 Open source, multithreading, multi-core, 8 threads per core
SPARC T3 8 Multithreading, multi-core, 8 threads per core, SMP, 16 cores per chip, 2 MB L3 cache
Oracle SPARC T4 16 Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, SMP, 16 cores per chip, out-of-order, 4 MB L3 cache
Oracle Corporation SPARC T5 16 Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 16 cores per chip, out-of-order execution, 16-way associative shared 8 MB L3 cache, hardware-assisted cryptographic acceleration, stream-processing unit, virtualization, RISC
Oracle Sparc M5 16 Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 6 cores per chip, out-of-order execution, 48 MB L3 cache, virtualization
Fujitsu Sparc64 X Multithreading, multicore, 2-way SMT, 16 cores per chip, out-of-order, 24 MB L2 cache, virtualization, RISC
Imagination Technologies Mips Warrior
VIA C7 In-order execution
VIA Nano (Isaiah) Superscalar out-of-order execution, branch prediction, 7 execution units
WinChip 4 In-order execution

See also

References

  1. ^ ARMv8 Technology Preview
  2. ^ "ARM goes 64-bit with new ARMv8 chip architecture". Retrieved 26 May 2012.
  3. ^ "AVR32 Architecture Document" (PDF). Atmel. Retrieved 2008-06-15.
  4. ^ "Blackfin Processor Architecture Overview". Analog Devices. Retrieved 2009-05-10.
  5. ^ "Blackfin memory architecture". Analog Devices. Retrieved 2009-12-18.
  6. ^ "LatticeMico32 Architecture". Lattice Semiconductor. Retrieved 2009-12-18.
  7. ^ "Open Source Licensing". Lattice Semiconductor. Retrieved 2009-12-18.
  8. ^ "Power ISA V2.06" (PDF). IBM. Retrieved 2009-07-04. [dead link]
  9. ^ http://www.ibm.com/developerworks/power/newto/#2 New to Cell/B.E., multicore, and Power Architecture technology
  10. ^ http://www.sparc.org/specificationsDocuments.html##ArchLic SPARC Architecture License
  11. ^ http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_1260_1288%5E1295,00.html
  12. ^ http://cdn3.wccftech.com/wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer.jpg
  13. ^ http://cdn3.wccftech.com/wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer.jpg
  14. ^ http://www.pcguide.com/ref/cpu/fam/g4C5x86-c.html
  15. ^ Intel Itanium 2 Processor Hardware Developer's Manual. p. 14. <http://www.intel.com/design/itanium2/manuals/25110901.pdf> (2002) [Retrieved November 28, 2011]
  16. ^ http://www.fujitsu.com/global/services/computing/server/sparcenterprise/technology/performance/processor.html