ARM Cortex-A72
General information | |
---|---|
Designed by | ARM Holdings |
Cache | |
L1 cache | 80 KiB (48 KiB I-cache with parity, 32 KiB D-cache with ECC) per core |
L2 cache | 512 KiB to 4 MiB |
L3 cache | none |
Architecture and classification | |
Instruction set | ARMv8-A |
Physical specifications | |
Cores |
|
The ARM Cortex-A72 is a microarchitecture implementing the ARMv8-A 64-bit instruction set, designed by ARM Holdings. It has an out-of-order superscalar pipeline.[1] It is available as a SIP core to licensees, and its design makes it suitable for integration with other SIP cores (e.g. GPU, display controller, DSP, image processor) into one die constituting a system on a chip (SoC).
The baseline architecture for the Cortex-A72 was the Cortex-A57; however, the design is more than a simple revision.[2] The designers of the Cortex-A72 had three major goals: pushing performance to the next generation; reducing power significantly so that the core can sustain its maximum frequency; and reducing the area used by the design, which further reduces power and also enables low-cost designs.[3]
Overview
- Pipelined processor with a deeply out-of-order, speculative-issue, 3-way superscalar execution pipeline
- DSP and NEON SIMD extensions are mandatory per core
- VFPv4 Floating Point Unit onboard (per core)
- Hardware virtualization support
- Thumb-2 instruction set encoding reduces the size of 32-bit programs with little impact on performance
- TrustZone security extensions
- Program Trace Macrocell and CoreSight Design Kit for unobtrusive tracing of instruction execution
- 32 KiB data (2-way set-associative) + 48 KiB instruction (3-way set-associative) L1 cache per core
- Integrated low-latency level-2 (16-way set-associative) cache controller, with a configurable size of 512 KiB to 4 MiB per cluster
- 48-entry fully associative L1 instruction translation lookaside buffer (TLB) with native support for 4 KiB, 64 KiB, and 1 MiB page sizes
- 1024-entry, 4-way set-associative L2 TLB
- Sophisticated branch prediction algorithm that significantly increases performance and reduces energy lost to misprediction and speculation
- Early IC tag: 3-way L1 cache at direct-mapped power
- Regionalized TLB and μBTB tagging
- Small-offset branch-target optimizations
- Suppression of superfluous branch predictor accesses
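
The cache and TLB figures in the list above imply a particular number of sets for each structure. The following Python sketch works through that arithmetic. It assumes a 64-byte cache line, which the list does not state, and the `SetAssociative`, `cache`, and `tlb_reach` names are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass

KIB = 1024
LINE_BYTES = 64  # assumption: the line size is not given in the list above


@dataclass(frozen=True)
class SetAssociative:
    """Hypothetical description of a set-associative cache or TLB."""
    name: str
    entries: int  # total cache lines, or total TLB entries
    ways: int

    @property
    def sets(self) -> int:
        # Each set holds exactly `ways` entries.
        if self.entries % self.ways != 0:
            raise ValueError(f"{self.name}: entries not divisible by ways")
        return self.entries // self.ways


def cache(name: str, size_bytes: int, ways: int,
          line_bytes: int = LINE_BYTES) -> SetAssociative:
    """Build a cache description from its capacity and line size."""
    return SetAssociative(name, size_bytes // line_bytes, ways)


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of address space a TLB can map when every entry is in use."""
    return entries * page_bytes


l1d = cache("L1 D-cache", 32 * KIB, ways=2)
l1i = cache("L1 I-cache", 48 * KIB, ways=3)
l2_min = cache("L2 (512 KiB)", 512 * KIB, ways=16)
l2_max = cache("L2 (4 MiB)", 4096 * KIB, ways=16)
l2_tlb = SetAssociative("L2 TLB", entries=1024, ways=4)

if __name__ == "__main__":
    for s in (l1d, l1i, l2_min, l2_max, l2_tlb):
        print(f"{s.name}: {s.entries} entries, {s.ways}-way, {s.sets} sets")
    reach = tlb_reach(48, 4 * KIB)
    print(f"L1 instruction TLB reach with 4 KiB pages: {reach // KIB} KiB")
```

Under the 64-byte line assumption, the 48 KiB 3-way instruction cache and the 32 KiB 2-way data cache both come out to 256 sets, so the non-power-of-two capacity of the instruction cache still allows a power-of-two set index.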
References
- [1] "Cortex-A72 Processor". ARM Holdings. Retrieved 2014-02-02.
- [2] "A closer look at the ARM Cortex-A72". Retrieved 2015-10-26.
- [3] "ARM lead architect talks to AA about the Cortex-A72". Retrieved 2015-10-26.