Row hammer

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 2601:646:8c81:e60:b182:7206:131c:4f61 (talk) at 17:47, 6 November 2016. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Row hammer (also written as rowhammer) is an unintended side effect in dynamic random-access memory (DRAM) that causes memory cells to leak their charges and interact electrically with one another, possibly altering the contents of nearby memory rows that were not addressed in the original memory access. This circumvention of the isolation between DRAM memory cells results from the high cell density in modern DRAM, and can be triggered by specially crafted memory access patterns that rapidly activate the same memory rows numerous times.[1][2][3]

The row hammer effect has been used in some privilege escalation computer security exploits.[2][4][5] Different hardware-based techniques exist to prevent the row hammer effect from occurring, including support built into some processors and some types of DRAM memory modules.[6][7]

Background

A high-level illustration of DRAM organization, which includes memory cells (blue squares), address decoders (green rectangles), and sense amplifiers (red squares)

In dynamic RAM (DRAM), each bit of stored data occupies a separate memory cell that is electrically implemented with one capacitor and one transistor. The charge state of a capacitor (charged or discharged) is what determines whether a DRAM cell stores "1" or "0" as a binary value. Huge numbers of DRAM memory cells are packed into integrated circuits, together with some additional logic that organizes the cells for the purposes of reading, writing and refreshing the data.[8][9]

Memory cells (blue squares in the illustration) are further organized into matrices and addressed through rows and columns. A memory address applied to a matrix is broken into the row address and column address, which are processed by the row and column address decoders (in the illustration, vertical and horizontal green rectangles, respectively). After a row address selects the row for a read operation (the selection is also known as row activation), bits from all cells in the row are transferred into the sense amplifiers that form the row buffer (red squares in the illustration), from which the exact bit is selected using the column address. Consequently, read operations are of a destructive nature because the design of DRAM requires memory cells to be rewritten after their values have been read by transferring the cell charges into the row buffer. Write operations decode the addresses in a similar way, but as a result of the design entire rows must be rewritten for the value of a single bit to be changed.[1]: 2–3 [8][9][10]
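
The addressing scheme described above can be sketched as a toy model. The following Python sketch is illustrative only: the DramBank class, its method names and the tiny geometry of 8 rows by 16 columns are assumptions made for the example, and real memory controllers also fold channel, rank and bank bits into the address.

```python
class DramBank:
    """Toy model of one DRAM bank: a matrix of cells plus a row buffer."""

    def __init__(self, row_bits, column_bits):
        self.column_bits = column_bits
        self.cells = [[0] * (1 << column_bits) for _ in range(1 << row_bits)]
        self.open_row = None      # row currently held in the row buffer
        self.row_buffer = None
        self.activations = 0      # number of row activations performed

    def split(self, address):
        """Break an address into (row, column), as the decoders would."""
        row = address >> self.column_bits
        column = address & ((1 << self.column_bits) - 1)
        return row, column

    def _activate(self, row):
        """Transfer a whole row into the row buffer (row activation)."""
        if self.open_row is not None:
            # Write the buffered row back before opening another one.
            self.cells[self.open_row] = list(self.row_buffer)
        self.row_buffer = list(self.cells[row])
        self.open_row = row
        self.activations += 1

    def read_bit(self, address):
        row, column = self.split(address)
        if row != self.open_row:
            self._activate(row)
        return self.row_buffer[column]

    def write_bit(self, address, value):
        row, column = self.split(address)
        if row != self.open_row:
            self._activate(row)
        self.row_buffer[column] = value
        # The entire row is rewritten even though a single bit changed.
        self.cells[row] = list(self.row_buffer)


bank = DramBank(row_bits=3, column_bits=4)
bank.write_bit(0b101_0110, 1)           # row 5, column 6
print(bank.split(0b101_0110))           # (5, 6)
print(bank.read_bit(0b101_0110))        # 1
```

In this model, repeated reads from the open row are served from the row buffer, while alternating between two different rows forces a new activation on every access.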

As a result of storing data bits using capacitors that have a natural discharge rate, DRAM memory cells lose their state over time and require periodic rewriting of all memory cells, which is a process known as refreshing.[1]: 3 [8] As another result of the design, DRAM memory is susceptible to random changes in stored data, which are known as soft memory errors and attributed to cosmic rays and other causes. There are different techniques that counteract soft memory errors and improve the reliability of DRAM, of which error-correcting code (ECC) memory and its advanced variants (such as lockstep memory) are most commonly used.[11]

Overview

Rapid row activations (yellow rows) may change the values of bits stored in victim row (purple row).[12]: 2 

Increased densities of DRAM integrated circuits (ICs) have led to physically smaller memory cells capable of storing smaller charges, resulting in lower operational noise margins, increased rates of electromagnetic interactions between memory cells, and greater possibility of data loss. As a result, disturbance errors have been observed: cells interfere with each other's operation, which manifests as random changes in the values of bits stored in the affected memory cells. Awareness of disturbance errors dates back to the early 1970s and the Intel 1103, the first commercially available DRAM IC; since then, DRAM manufacturers have employed various mitigation techniques to counteract disturbance errors, such as improving the isolation between cells and performing production testing. However, researchers showed in a 2014 analysis that commercially available DDR3 DRAM chips manufactured in 2012 and 2013 are susceptible to disturbance errors, and used the term row hammer to name the associated side effect that led to the observed bit flips.[1][3][12]

The opportunity for the row hammer effect to occur in DDR3 memory[13] is primarily attributed to DDR3's high density of memory cells and the results of associated interactions between the cells, while rapid DRAM row activations have been determined as the primary cause. Frequent row activations cause voltage fluctuations on the associated row selection lines, which have been observed to induce higher-than-natural discharge rates in capacitors belonging to nearby (adjacent, in most cases) memory rows, which are called victim rows; if the affected memory cells are not refreshed before they lose too much charge, disturbance errors occur. Tests show that a disturbance error may be observed after performing around 139,000 successive memory row accesses (with cache flushes), and that up to one memory cell in every 1,700 cells may be susceptible. Those tests also show that the rate of disturbance errors is not substantially affected by increased ambient temperature, while it depends on the actual contents of DRAM because certain bit patterns result in significantly higher disturbance error rates.[1][2][12][14]
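
As a rough worked example, the following Python sketch estimates how quickly a row must be activated for about 139,000 activations to fit between two refreshes of a victim row. It uses only the figures quoted in this article (the 64 ms refresh interval and the roughly sevenfold shorter interval mentioned in the Notes section) and assumes, for simplicity, that all of the activations must land within a single refresh interval; the function and variable names are illustrative.

```python
ACTIVATIONS_NEEDED = 139_000     # approximate figure reported in the tests
REFRESH_INTERVAL_S = 64e-3       # usual DRAM refresh interval (64 ms)


def max_activation_period_ns(refresh_interval_s, activations=ACTIVATIONS_NEEDED):
    """Longest average time per activation that still fits in one interval."""
    return refresh_interval_s / activations * 1e9


default_ns = max_activation_period_ns(REFRESH_INTERVAL_S)
shortened_ns = max_activation_period_ns(REFRESH_INTERVAL_S / 7)

print(f"64 ms refresh:   one activation every {default_ns:.0f} ns or faster")
print(f"64/7 ms refresh: one activation every {shortened_ns:.0f} ns or faster")
```

Under these assumptions the budget shrinks from roughly 460 ns per activation to roughly 66 ns, which suggests why a much shorter refresh interval leaves less room for hammering.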

A variant called double-sided hammering involves targeted activations of two DRAM rows surrounding a victim row: in the illustration provided in this section, this variant would be activating both yellow rows with the aim of inducing bit flips in the purple row, which in this case would be the victim row. Tests show that this approach may result in a significantly higher rate of disturbance errors, compared to the variant that activates only one of the victim row's neighbouring DRAM rows.[4][15]: 19–20 [16]

Mitigation

Different methods exist for more or less successful detection, prevention, correction or mitigation of the row hammer effect. Tests show that simple ECC solutions, providing single-error correction and double-error detection (SECDED) capabilities, are not able to correct or detect all observed disturbance errors because some of them include more than two flipped bits per memory word.[1]: 8 [12]: 32  A less effective solution is to introduce more frequent memory refreshing, with the refresh intervals shorter than the usual 64 ms,[a] but this technique results in higher power consumption and increased processing overhead; some vendors provide firmware updates that implement this type of mitigation.[17] One of the more complex prevention measures performs counter-based identification of frequently accessed memory rows and proactively refreshes their neighboring rows; another method issues additional infrequent random refreshes of memory rows neighboring the accessed rows regardless of their access frequency. Research shows that these two prevention measures cause negligible performance impacts.[1]: 10–11 [18]
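
The limits of SECDED can be illustrated with a toy code. The sketch below uses an extended Hamming(8,4) code, which protects only four data bits; ECC memory uses much wider codewords, so this is an assumption-laden miniature rather than the scheme used by any particular module, and the function names are hypothetical. It shows one flipped bit being corrected, two being detected, and three being silently miscorrected.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # Hamming positions that carry data bits
PARITY_POSITIONS = (1, 2, 4)     # Hamming positions that carry parity bits


def encode(nibble):
    """Encode 4 data bits into 8 bits; index 0 holds the overall parity."""
    bits = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (nibble >> i) & 1
    for p in PARITY_POSITIONS:
        parity = 0
        for k in range(1, 8):
            if k & p and k != p:
                parity ^= bits[k]
        bits[p] = parity
    bits[0] = sum(bits[1:]) % 2
    return bits


def decode(bits):
    """Return (status, data) after attempting SECDED correction."""
    bits = list(bits)
    syndrome = 0
    for k in range(1, 8):
        if bits[k]:
            syndrome ^= k
    overall = sum(bits) % 2
    if syndrome == 0 and overall == 0:
        status = "clean"
    elif overall == 1:
        bits[syndrome] ^= 1      # syndrome 0 points at the overall parity bit
        status = "corrected"
    else:
        status = "double error detected"
    nibble = 0
    for i, pos in enumerate(DATA_POSITIONS):
        nibble |= bits[pos] << i
    return status, nibble


def flip(bits, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out


word = encode(0b1011)
print(decode(flip(word, [5])))          # ('corrected', 11)
print(decode(flip(word, [3, 6])))       # status: 'double error detected'
print(decode(flip(word, [3, 5, 7])))    # ('corrected', 0): wrong data, no warning
```

With three flipped bits the decoder reports a successful correction while returning data that differs from what was stored, which is the failure mode described above for words containing more than two flipped bits.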

Since the release of Ivy Bridge microarchitecture, Intel Xeon processors support the so-called pseudo target row refresh (pTRR) that can be used in combination with pTRR-compliant DDR3 dual in-line memory modules (DIMMs) to mitigate the row hammer effect by automatically refreshing possible victim rows, with no negative impacts on performance or power consumption. When used with DIMMs that are not pTRR-compliant, these Xeon processors by default fall back on performing DRAM refreshes at twice the usual frequency, which results in slightly higher memory access latency and may reduce the memory bandwidth by up to 2–4%.[6]

The LPDDR4 memory standard published by JEDEC[19] includes optional hardware support for the so-called target row refresh (TRR) that prevents the row hammer effect without negatively impacting performance or power consumption.[7][20][21] Additionally, some manufacturers implement TRR in their DDR4 products,[22][23] although it is not part of the DDR4 memory standard published by JEDEC.[24] Internally, TRR identifies possible victim rows by counting the number of row activations and comparing it against predefined chip-specific maximum activate count (MAC) and maximum activate window (tMAW) values, and refreshes these rows to prevent bit flips. The MAC value is the maximum total number of row activations that may be encountered on a particular DRAM row within a time interval that is equal to or shorter than the tMAW amount of time before its neighboring rows are identified as victim rows; TRR may also flag a row as a victim row if the sum of row activations for its two neighboring rows reaches the MAC limit within the tMAW time window.[19][25]
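
The counting logic described above can be sketched as follows. This Python model is a simplification under stated assumptions: the TargetRowRefreshModel class and its parameter names are hypothetical, the MAC value of 4 is far smaller than any realistic chip-specific value, and actual TRR implementations are vendor-specific and not modelled here. The sketch keeps, for each potential victim row, the timestamps of recent activations of its two neighbors, which covers both the single-row and the summed two-neighbor conditions.

```python
from collections import defaultdict, deque


class TargetRowRefreshModel:
    """Toy TRR-style tracker: refresh a row once its neighbours have been
    activated `mac` times within a window of `tmaw_ns` nanoseconds."""

    def __init__(self, mac, tmaw_ns, num_rows):
        self.mac = mac
        self.tmaw_ns = tmaw_ns
        self.num_rows = num_rows
        # victim row -> timestamps of recent activations of its neighbours
        self.neighbour_hits = defaultdict(deque)

    def activate(self, row, now_ns):
        """Record an activation; return the victim rows refreshed by it."""
        refreshed = []
        for victim in (row - 1, row + 1):
            if not 0 <= victim < self.num_rows:
                continue
            hits = self.neighbour_hits[victim]
            hits.append(now_ns)
            # Forget activations that fell out of the tMAW window.
            while now_ns - hits[0] > self.tmaw_ns:
                hits.popleft()
            if len(hits) >= self.mac:
                hits.clear()          # the refresh restores the victim's charge
                refreshed.append(victim)
        return refreshed


trr = TargetRowRefreshModel(mac=4, tmaw_ns=1_000, num_rows=8)
# Double-sided hammering of rows 2 and 4 around victim row 3.
events = [trr.activate(row, t) for t, row in enumerate([2, 4, 2, 4])]
print(events)   # [[], [], [], [3]]
```

In the usage example, row 3 is refreshed on the fourth activation because the combined count for its two neighbors reaches the assumed MAC limit inside the window.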

Because they need to perform huge numbers of DRAM row activations in rapid succession, row hammer exploits issue large numbers of uncached memory accesses that cause cache misses; these can be detected by monitoring the rate of cache misses for unusual peaks using hardware performance counters.[4][26] Version 6.0.0 of the memtest86 memory diagnostic software, released on February 13, 2015, includes a so-called hammer test that checks whether computer hardware is susceptible to disturbance errors.[27]
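
The detection idea can be illustrated with a minimal sketch that flags sampling intervals whose cache-miss count is far above the typical level. The sample values, the function name and the factor of 10 are all made up for the example; obtaining real readings from hardware performance counters is platform-specific and is not shown.

```python
from statistics import median


def flag_miss_rate_peaks(samples, factor=10.0):
    """Return indices of sampling intervals whose cache-miss count exceeds
    `factor` times the median of all samples."""
    baseline = max(median(samples), 1)
    return [i for i, misses in enumerate(samples) if misses > factor * baseline]


# Hypothetical cache-miss counts per sampling interval.
samples = [1200, 900, 1500, 1100, 250_000, 260_000, 1300]
print(flag_miss_rate_peaks(samples))    # [4, 5]
```

A median baseline is used here so that a short burst of hammering does not inflate the reference level it is compared against; other baselines would be equally plausible.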

Implications

Memory protection, as a way of preventing processes from accessing memory that has not been assigned to each of them, is one of the concepts behind most modern operating systems. By using memory protection in combination with other security-related mechanisms such as protection rings, it is possible to achieve privilege separation between processes, in which programs and computer systems in general are divided into parts limited to the specific privileges they require to perform a particular task. Using privilege separation can also reduce the extent of potential damage caused by computer security attacks by restricting their effects to specific parts of the system.[28][29]

Disturbance errors (explained in the section above) effectively defeat various layers of memory protection by "short circuiting" them at a very low hardware level, practically creating a unique attack vector type that allows processes to alter the contents of arbitrary parts of the main memory by directly manipulating the underlying memory hardware.[2][4][15][30] In comparison, "conventional" attack vectors such as buffer overflows aim at circumventing the protection mechanisms at the software level, by exploiting various programming mistakes to achieve alterations of otherwise inaccessible main memory contents.[31]

Exploits

code1a:
  mov (X), %eax  // read from address X
  mov (Y), %ebx  // read from address Y
  clflush (X)    // flush cache for address X
  clflush (Y)    // flush cache for address Y
  jmp code1a
A snippet of x86 assembly code that induces the row hammer effect (memory addresses X and Y must map to different DRAM rows in the same memory bank)[1]: 3 [4][15]: 13–15 

The initial research into the row hammer effect, publicized by a group of authors in June 2014, described the nature of disturbance errors and indicated the potential for constructing an attack, but did not provide any examples of a working security exploit.[1] Another research paper, created by a group of authors and published in October 2014, did not imply the existence of any security-related issues arising from the row hammer effect.[13]

On March 9, 2015, Google's Project Zero revealed two working privilege escalation exploits based on the row hammer effect, establishing its exploitable nature on the x86-64 architecture. One of the revealed exploits targets the Google Native Client (NaCl) mechanism for running a limited subset of x86-64 machine instructions within a sandbox,[15]: 27  exploiting the row hammer effect to escape from the sandbox and gain the ability to issue system calls directly. This NaCl vulnerability, tracked as CVE-2015-0565, has been mitigated by modifying NaCl so that it does not allow execution of the clflush (cache line flush[32]) machine instruction, which has been found to be required for constructing an effective row hammer attack.[2][4][30]

The second exploit revealed by Project Zero runs as an unprivileged Linux process on the x86-64 architecture, exploiting the row hammer effect to gain unrestricted access to all physical memory installed in a computer. By combining the disturbance errors with memory spraying, this exploit is capable of altering page table entries (PTEs)[15]: 35  used by the virtual memory system for mapping virtual addresses to physical addresses, which results in the exploit gaining unrestricted memory access.[15]: 34, 36–57  Due to its nature and the inability of the x86-64 architecture to make clflush a privileged machine instruction, this exploit can hardly be mitigated on computers that do not use hardware with built-in row hammer prevention mechanisms. While testing the viability of exploits, Project Zero found that about half of the 29 tested laptops experienced disturbance errors, with some of them occurring on vulnerable laptops in less than five minutes of running row-hammer-inducing code; the tested laptops were manufactured between 2010 and 2014 and used non-ECC DDR3 memory.[2][4][30]
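
Why a single flipped bit in a page table entry matters can be shown with a small sketch. It relies on the x86-64 layout for 4 KiB pages, in which bits 12 to 51 of an entry hold the physical frame number and the lowest bits hold flags such as present, writable and user. The frame numbers and the range of "sprayed" page-table frames are hypothetical values chosen for the example, and the sketch does not reproduce the actual exploit.

```python
PRESENT, WRITABLE, USER = 1 << 0, 1 << 1, 1 << 2
FRAME_SHIFT = 12
FRAME_MASK = ((1 << 40) - 1) << FRAME_SHIFT     # bits 12..51


def make_pte(frame, flags):
    """Build a page table entry from a physical frame number and flags."""
    return (frame << FRAME_SHIFT) | flags


def frame_of(pte):
    """Extract the physical frame number from a page table entry."""
    return (pte & FRAME_MASK) >> FRAME_SHIFT


def flip_bit(value, bit):
    """Model a disturbance error by inverting one bit."""
    return value ^ (1 << bit)


# Hypothetical layout: memory spraying has filled these frames with page tables.
sprayed_page_tables = range(0x1000, 0x2000)

pte = make_pte(frame=0x5234, flags=PRESENT | WRITABLE | USER)
corrupted = flip_bit(pte, 26)                   # bit 26 is frame-number bit 14

print(hex(frame_of(pte)), frame_of(pte) in sprayed_page_tables)              # 0x5234 False
print(hex(frame_of(corrupted)), frame_of(corrupted) in sprayed_page_tables)  # 0x1234 True
```

In this toy layout the corrupted entry is still present, writable and user-accessible, but it now maps a frame that holds a page table; the more of physical memory the spray fills with page tables, the more likely a random flip in the frame number is to land on one.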

In July 2015, a group of security researchers published a paper that describes an architecture- and instruction-set-independent way for exploiting the row hammer effect. Instead of relying on the clflush instruction to perform cache flushes, this approach achieves uncached memory accesses by causing a very high rate of cache eviction using carefully selected memory access patterns. Although the cache replacement policies differ between processors, this approach overcomes the architectural differences by employing an adaptive cache eviction strategy algorithm.[15]: 64–68  The proof of concept for this approach is provided both as a native code implementation, and as a pure JavaScript implementation that runs on Firefox 39. The JavaScript implementation, called Rowhammer.js,[33] uses large typed arrays and relies on their internal allocation using large pages; as a result, it demonstrates a very high-level exploit of a very low-level vulnerability.[34][35]
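
The eviction-based approach can be sketched with a simulated cache set. The model below assumes a strict least-recently-used replacement policy and a 4-way set, which is a simplification: as noted above, replacement policies differ between processors, and the published attack adapts its access pattern accordingly. The class and function names are hypothetical.

```python
from collections import OrderedDict


class LruCacheSet:
    """One cache set with `ways` lines and least-recently-used replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a cache hit, False on a miss (served from DRAM)."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[address] = None
        return False


def count_dram_accesses(ways, eviction_set, rounds, target="X"):
    """Count how often `target` misses the cache when each access to it is
    followed by accesses to every address in `eviction_set`."""
    cache = LruCacheSet(ways)
    misses = 0
    for _ in range(rounds):
        if not cache.access(target):
            misses += 1
        for address in eviction_set:
            cache.access(address)
    return misses


print(count_dram_accesses(ways=4, eviction_set=["A", "B", "C", "D"], rounds=1000))  # 1000
print(count_dram_accesses(ways=4, eviction_set=["A", "B", "C"], rounds=1000))       # 1
```

With as many conflicting addresses as the set has ways, every access to the target goes to DRAM without any flush instruction; with one address fewer, the target stays cached after the first access and no hammering takes place.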

Not all software states are vulnerable to rowhammer attacks. An attacker therefore needs to find suitable target states in order to make use of rowhammer-induced errors. In practice, identifying such target states is one of the main challenges, and it has typically been done by domain experts. The mainstream fault tolerance community responded to rowhammer attacks with a systematic methodology[36] that can be used to identify, validate, and evaluate rowhammer attack target states and their exploitability. That work is based on the well-established fault injection-based experimental methodology; it generalized attack target states and found a few practical target states that were previously unknown.

See also

  • Memory scrambling – memory controller feature that turns user data written to the memory into pseudo-random patterns
  • Radiation hardening – the act of making electronic components resistant to damage or malfunctions caused by ionizing radiation
  • Single event upset (SEU) – a change of state caused by ions or electromagnetic radiation striking a sensitive node in an electronic device
  • Soft error – a type of error involving erroneous changes to signals or data but no changes to the underlying device or circuit

Notes

  1. ^ Research shows that the rate of disturbance errors in a selection of DDR3 memory modules approaches zero when the memory refresh interval becomes roughly seven times shorter than the default of 64 ms.[12]: 17, 26 

References

  1. ^ a b c d e f g h i Yoongu Kim; Ross Daly; Jeremie Kim; Chris Fallin; Ji Hye Lee; Donghyuk Lee; Chris Wilkerson; Konrad Lai; Onur Mutlu (June 24, 2014). "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors" (PDF). ece.cmu.edu. IEEE. Retrieved March 10, 2015.
  2. ^ a b c d e f Dan Goodin (March 10, 2015). "Cutting-edge hack gives super user status by exploiting DRAM weakness". Ars Technica. Retrieved March 10, 2015.
  3. ^ a b Paul Ducklin (March 12, 2015). "'Row hammering' – how to exploit a computer by overworking its memory". Sophos. Retrieved March 14, 2015.
  4. ^ a b c d e f g Mark Seaborn; Thomas Dullien (March 9, 2015). "Exploiting the DRAM rowhammer bug to gain kernel privileges". googleprojectzero.blogspot.com. Google. Retrieved March 10, 2015.
  5. ^ "Using Rowhammer bitflips to root Android phones is now a thing". Ars Technica. Retrieved October 25, 2016.
  6. ^ a b Marcin Kaczmarski (August 2014). "Thoughts on Intel Xeon E5-2600 v2 Product Family Performance Optimisation – Component selection guidelines" (PDF). Intel. p. 13. Retrieved March 11, 2015.
  7. ^ a b Marc Greenberg (October 15, 2014). "Reliability, Availability, and Serviceability (RAS) for DDR DRAM interfaces" (PDF). memcon.com. pp. 2, 7, 10, 20, 27. Retrieved March 11, 2015.
  8. ^ a b c "Lecture 12: DRAM Basics" (PDF). utah.edu. February 17, 2011. pp. 2–7. Retrieved March 10, 2015.
  9. ^ a b "Understanding DRAM Operation" (PDF). IBM. December 1996. Retrieved March 10, 2015.
  10. ^ David August (November 23, 2004). "Lecture 20: Memory Technology" (PDF). cs.princeton.edu. pp. 3–5. Archived from the original (PDF) on May 19, 2005. Retrieved March 10, 2015.
  11. ^ Bianca Schroeder; Eduardo Pinheiro; Wolf-Dietrich Weber (June 25, 2009). "DRAM Errors in the Wild: A Large-Scale Field Study" (PDF). cs.toronto.edu. ACM. Retrieved March 10, 2015.
  12. ^ a b c d e Yoongu Kim; Ross Daly; Jeremie Kim; Chris Fallin; Ji Hye Lee; Donghyuk Lee; Chris Wilkerson; Konrad Lai; Onur Mutlu (June 24, 2014). "Flipping Bits in Memory Without Accessing Them: DRAM Disturbance Errors" (PDF). ece.cmu.edu. Retrieved March 10, 2015.
  13. ^ a b Kyungbae Park; Sanghyeon Baeg; ShiJie Wen; Richard Wong (October 2014). "Active-Precharge Hammering on a Row Induced Failure in DDR3 SDRAMs under 3x nm Technology". IEEE. doi:10.1109/IIRW.2014.7049516. Retrieved March 16, 2015.
  14. ^ Yoongu Kim; Ross Daly; Jeremie Kim; Chris Fallin; Ji Hye Lee; Donghyuk Lee; Chris Wilkerson; Konrad Lai; Onur Mutlu (July 30, 2015). "RowHammer: Reliability Analysis and Security Implications" (PDF). ece.cmu.edu. Retrieved August 7, 2015.
  15. ^ a b c d e f g Mark Seaborn; Thomas Dullien (August 6, 2015). "Exploiting the DRAM rowhammer bug to gain kernel privileges: How to cause and exploit single bit errors" (PDF). Black Hat. Retrieved August 7, 2015.
  16. ^ Andy Greenberg (March 10, 2015). "Googlers' Epic Hack Exploits How Memory Leaks Electricity". Wired. Retrieved March 17, 2015.
  17. ^ "Row Hammer Privilege Escalation (Lenovo Security Advisory LEN-2015-009)". Lenovo. August 5, 2015. Retrieved August 6, 2015.
  18. ^ Dae-Hyun Kim; Prashant J. Nair; Moinuddin K. Qureshi (October 9, 2014). "Architectural Support for Mitigating Row Hammering in DRAM Memories" (PDF). ece.gatech.edu. IEEE. Retrieved March 11, 2015.
  19. ^ a b "JEDEC standard JESD209-4A: Low Power Double Data Rate (LPDDR4)" (PDF). JEDEC. November 2015. pp. 222–223. Retrieved January 10, 2016.
  20. ^ Kishore Kasamsetty (October 22, 2014). "DRAM scaling challenges and solutions in LPDDR4 context" (PDF). memcon.com. p. 11. Retrieved January 10, 2016.
  21. ^ Omar Santos (March 9, 2015). "Mitigations Available for the DRAM Row Hammer Vulnerability". cisco.com. Retrieved March 11, 2015.
  22. ^ Marc Greenberg (March 9, 2015). "Row Hammering: What it is, and how hackers could use it to gain access to your system". synopsys.com. Retrieved January 10, 2016.
  23. ^ Jung-Bae Lee (November 7, 2014). "Green Memory Solution (Samsung Investors Forum 2014)" (PDF). teletogether.com. Samsung Electronics. p. 15. Retrieved January 10, 2016.
  24. ^ "JEDEC standard JESD79-4A: DDR4 SDRAM" (PDF). JEDEC. November 2013. Retrieved January 10, 2016.
  25. ^ "Data Sheet: 4 Gb ×4, ×8 and ×16 DDR4 SDRAM Features" (PDF). Micron Technology. November 20, 2015. pp. 48, 131. Retrieved January 10, 2016.
  26. ^ Nishad Herath; Anders Fogh (August 6, 2015). "These are Not Your Grand Daddy's CPU Performance Counters: CPU Hardware Performance Counters for Security" (PDF). Black Hat. pp. 29, 38–68. Retrieved January 9, 2016.
  27. ^ "PassMark MemTest86 – Version History". memtest86.com. February 13, 2015. Retrieved March 11, 2015.
  28. ^ Pehr Söderman (2011). "Memory Protection" (PDF). csc.kth.se. Retrieved March 11, 2015.
  29. ^ Niels Provos; Markus Friedl; Peter Honeyman (August 10, 2003). "Preventing Privilege Escalation" (PDF). niels.xtdnet.nl. Retrieved March 11, 2015.
  30. ^ a b c Liam Tung (March 10, 2015). ""Rowhammer" DRAM flaw could be widespread, says Google". ZDNet. Retrieved March 11, 2015.
  31. ^ Murat Balaban (June 6, 2009). "Buffer Overflows Demystified" (TXT). enderunix.org. Retrieved March 11, 2015.
  32. ^ "CLFLUSH: Flush Cache Line (x86 Instruction Set Reference)". renejeschke.de. March 3, 2013. Retrieved August 6, 2015.
  33. ^ Daniel Gruss; Clémentine Maurice (July 27, 2015). "IAIK/rowhammerjs: rowhammerjs/rowhammer.js at master". github.com. Retrieved July 29, 2015.
  34. ^ Gruss, Daniel; Maurice, Clémentine; Mangard, Stefan (July 27, 2015). "Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript". arXiv:1507.06955 [cs.CR].
  35. ^ David Auerbach (July 28, 2015). "Rowhammer security exploit: Why a new security attack is truly terrifying". slate.com. Retrieved July 29, 2015.
  36. ^ Keun Soo Yim (2016). "The rowhammer attack injection methodology".