Racetrack memory or domain-wall memory (DWM) is an experimental non-volatile memory device under development at IBM's Almaden Research Center by a team led by physicist Stuart Parkin. In early 2008, a 3-bit version was successfully demonstrated. If it were to be developed successfully, racetrack would offer storage density higher than comparable solid-state memory devices like flash memory and similar to conventional disk drives, with higher read/write performance.
Racetrack memory uses a spin-coherent electric current to move magnetic domains along a nanoscopic permalloy wire about 200 nm across and 100 nm thick. As current is passed through the wire, the domains pass by magnetic read/write heads positioned near the wire, which alter the domains to record patterns of bits. A racetrack memory device is made up of many such wires and read/write elements. In general operational concept, racetrack memory is similar to the earlier bubble memory of the 1960s and 1970s. Delay line memory, such as the mercury delay lines of the 1940s and 1950s used in the UNIVAC and EDSAC computers, is a still-earlier form of similar technology. Like bubble memory, racetrack memory uses electrical currents to "push" a sequence of magnetic domains through a substrate and past read/write elements. Improvements in magnetic detection capabilities, based on the development of spintronic magnetoresistive sensors, allow the use of much smaller magnetic domains to provide far higher bit densities.
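The shift-past-a-fixed-head behaviour described above can be illustrated with a toy software model. This is only a sketch under simplifying assumptions: the class name `RacetrackWire`, its methods, and the single read/write head at position 0 are hypothetical, and the model ignores the physics entirely. It only counts how many single-position shifts are needed before a given bit sits under the head.

```python
class RacetrackWire:
    """Toy model of one wire: a train of bits shifted past one fixed head.

    Hypothetical illustration only; not a description of any real device.
    """

    def __init__(self, n_bits, head_pos=0):
        self.bits = [0] * n_bits   # logical contents of the domain train
        self.head_pos = head_pos   # fixed physical position of the head
        self.offset = 0            # how far the train is currently shifted
        self.shifts = 0            # total single-position shifts performed

    def _align(self, index):
        """Shift the whole train until logical bit `index` is under the head."""
        if not 0 <= index < len(self.bits):
            raise IndexError("bit index out of range")
        needed = self.head_pos - (index + self.offset)
        self.offset += needed
        self.shifts += abs(needed)

    def write(self, index, value):
        self._align(index)
        self.bits[index] = value

    def read(self, index):
        self._align(index)
        return self.bits[index]


wire = RacetrackWire(16)
wire.write(5, 1)      # 5 shifts to bring bit 5 under the head
value = wire.read(5)  # already aligned, no further shifts
wire.read(0)          # 5 shifts back
print(value, wire.shifts)
```

In this model the cost of an access depends on how far the requested bit is from the head, which is the property that distinguishes shift-based memories from true random-access devices.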
In production, it was expected that the wires could be scaled down to around 50 nm. There were two arrangements considered for racetrack memory. The simplest was a series of flat wires arranged in a grid with read and write heads arranged nearby. A more widely studied arrangement used U-shaped wires arranged vertically over a grid of read/write heads on an underlying substrate. This would allow the wires to be much longer without increasing their 2D area, although the need to move individual domains further along the wires before they reach the read/write heads results in slower random access times. Both arrangements offered about the same throughput performance. The primary concern in terms of construction was practical: whether or not the three-dimensional vertical arrangement would be feasible to mass-produce.
Comparison to other memory devices
Projections in 2008 suggested that racetrack memory would offer performance on the order of 20-32 ns to read or write a random bit. This compared to about 10,000,000 ns for a hard drive, or 20-30 ns for conventional DRAM. The primary authors discussed ways to improve the access time to about 9.5 ns with the use of a "reservoir". Aggregate throughput, with or without the reservoir, would be on the order of 250-670 Mbit/s for racetrack memory, compared to 12800 Mbit/s for a single DDR3 DRAM, 1000 Mbit/s for high-performance hard drives, and 30 to 100 Mbit/s for flash memory devices. The only current technology that offered a clear latency benefit over racetrack memory was SRAM, on the order of 0.2 ns, but at a higher cost and with a larger feature size "F" of about 45 nm (as of 2011) and a cell area of about 140 F².
Flash memory is asymmetrical. Read performance is faster than writing. Flash memory works by "trapping" electrons in the chip surface. It requires a burst of high voltage to remove this charge and reset the cell. In order to do this, charge is accumulated in a charge pump, which takes time. In the case of NOR flash memory, which allows bit-wise random access like racetrack memory, read times were on the order of 70 ns, while write times were much slower, about 2,500 ns. To address this concern, NAND flash memory allows reading and writing only in blocks, but this means that the time to access any random bit is increased to about 1,000 ns. In addition, the use of the burst of high voltage physically degrades the cell, so most flash devices allow on the order of 100,000 writes to any particular bit before their operation becomes unpredictable. Wear leveling and other techniques can spread this out if the underlying data can be re-arranged.
The key determinant of the cost of any memory device is the physical size of the storage medium. This is due to the way memory devices are fabricated. In the case of solid-state devices like flash memory or DRAM, a large "wafer" of silicon is processed into many individual devices, which are then cut apart and packaged. The cost of packaging is about $1 per device, so, as the density increases and the number of bits per device increases with it, the cost per bit falls by an equal amount. In hard drives, data is stored on rotating platters, and the cost of the drive is strongly related to the number of platters. Increasing the density allows the number of platters to be reduced for any given amount of storage.
In most cases, memory devices store one bit in any given location, so they are typically compared in terms of "cell size", a cell storing one bit. Cell size itself is given in units of F², where "F" is the feature size design rule, usually representing the metal line width. Flash and racetrack both store multiple bits per cell, but the comparison can still be made. For instance, hard drives appeared to be reaching theoretical limits around 650 nm²/bit, defined primarily by the capability to read and write to specific areas of the magnetic surface. DRAM has a cell size of about 6 F², while SRAM is much less dense at 120 F². NAND flash memory is currently the densest form of non-volatile memory in widespread use, with a cell size of about 4.5 F², but storing three bits per cell for an effective size of 1.5 F². NOR flash memory is slightly less dense, at an effective 4.75 F², accounting for 2-bit operation on a 9.5 F² cell size. In the vertical orientation (U-shaped) racetrack, about 10-20 bits are stored per cell, which itself would have a physical size of at least about 20 F². In addition, bits at different positions on the "track" would take different times (from ~10 to ~1000 ns, or 10 ns/bit) to be accessed by the read/write sensor, because the "track" would move the domains at a fixed rate of ~100 m/s past the read/write sensor.
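The effective cell sizes quoted above follow from dividing the physical cell area by the number of bits stored in it. The short sketch below reproduces that arithmetic using only the figures given in the text; the function name `effective_cell_size` is a hypothetical helper. It also checks the hard-drive figure against an areal density of 1 Tbit/in², the density this article's notes equate with roughly 650 nm²/bit.

```python
def effective_cell_size(cell_area_f2, bits_per_cell):
    """Area per stored bit, in units of F^2 (hypothetical helper)."""
    return cell_area_f2 / bits_per_cell


# (physical cell area in F^2, bits per cell), as quoted in the text
cells = {
    "NAND flash": (4.5, 3),
    "NOR flash": (9.5, 2),
    "Racetrack, 10 bits per cell": (20, 10),
    "Racetrack, 20 bits per cell": (20, 20),
}

for name, (area, bits) in cells.items():
    print(f"{name}: {effective_cell_size(area, bits):.2f} F^2 per bit")

# Hard drives: convert an areal density of 1 Tbit/in^2 to nm^2 per bit.
NM_PER_INCH = 25.4e6                      # 1 inch = 25.4 mm = 25.4e6 nm
area_per_bit_nm2 = NM_PER_INCH ** 2 / 1e12
print(f"1 Tbit/in^2 is about {area_per_bit_nm2:.0f} nm^2 per bit")
```

With 10-20 bits on a 20 F² cell, the racetrack figures work out to between 2 and 1 F² per bit, which is the basis for the density comparison with NAND flash.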
Racetrack memory is one of several emerging technologies that aim to replace conventional memories such as DRAM and flash, and potentially offer a "universal" memory device applicable to a wide variety of roles. Other contenders included magnetoresistive random-access memory (MRAM), phase-change memory (PCRAM) and ferroelectric RAM (FeRAM). Most of these technologies offer densities similar to flash memory, in most cases worse, and their primary advantage is the lack of write-endurance limits like those in flash memory. Field-MRAM offers excellent performance, with access times as low as 3 ns, but requires a large 25-40 F² cell size. It might see use as an SRAM replacement, but not as a mass storage device. The highest density among these devices is offered by PCRAM, with a cell size of about 5.8 F², similar to flash memory, as well as fairly good performance around 50 ns. Nevertheless, none of these can come close to competing with racetrack memory in overall terms, especially density. For example, 50 ns allows about five bits to be operated in a racetrack memory device, resulting in an effective cell size of 20/5 = 4 F², easily exceeding the performance-density product of PCRAM. On the other hand, without sacrificing bit density, the same 20 F² area could fit 2.5 2-bit 8 F² alternative memory cells (such as resistive RAM (RRAM) or spin-torque transfer MRAM), each of which individually operates much faster (~10 ns).
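The density comparison in this paragraph can be worked through step by step. The sketch below uses only numbers quoted in the text (a 50 ns window, about 10 ns per racetrack bit, a 20 F² racetrack cell, and 2-bit 8 F² alternative cells); the variable names are illustrative.

```python
# Racetrack: how many bits can be shifted past the sensor in a 50 ns window?
window_ns = 50            # PCRAM-class access time used as the time budget
shift_time_ns = 10        # ~10 ns per bit, as quoted for racetrack
racetrack_cell_f2 = 20    # physical racetrack cell size in F^2

racetrack_bits = window_ns / shift_time_ns
racetrack_f2_per_bit = racetrack_cell_f2 / racetrack_bits

# Alternative: fill the same 20 F^2 with 2-bit cells of 8 F^2 each.
alt_cell_f2 = 8
alt_bits_per_cell = 2
alt_cells = racetrack_cell_f2 / alt_cell_f2
alt_bits = alt_cells * alt_bits_per_cell
alt_f2_per_bit = racetrack_cell_f2 / alt_bits

print(f"racetrack:   {racetrack_bits:.0f} bits, {racetrack_f2_per_bit:.1f} F^2/bit")
print(f"alternative: {alt_cells} cells, {alt_bits:.0f} bits, {alt_f2_per_bit:.1f} F^2/bit")
```

Under these figures both options reach about 4 F² per bit in the same area, which is why the text notes that the alternative cells match the bit density while each cell operates faster.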
A difficulty for racetrack technology arises from the need for high current density (>10⁸ A/cm²); a 30 nm × 100 nm cross-section would require >3 mA. The resulting power draw would be higher than, for example, spin-transfer torque memory (STT-RAM) or flash memory.
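The drive current quoted above is the current density multiplied by the wire cross-section. The sketch below redoes that calculation, reading the current-density threshold as 10^8 A/cm², which is the value consistent with the stated 3 mA; the variable names are illustrative.

```python
NM_TO_CM = 1e-7                      # 1 nm = 1e-7 cm

current_density_a_per_cm2 = 1e8      # threshold current density
width_nm, thickness_nm = 30, 100     # wire cross-section from the text

area_cm2 = (width_nm * NM_TO_CM) * (thickness_nm * NM_TO_CM)
current_ma = current_density_a_per_cm2 * area_cm2 * 1e3

print(f"cross-section: {area_cm2:.1e} cm^2")
print(f"required current: {current_ma:.1f} mA")
```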
One limitation of the early experimental devices was that the magnetic domains could be pushed only slowly through the wires, requiring current pulses on the order of microseconds to move them successfully. This was unexpected, and led to performance roughly equal to that of hard drives, as much as 1000 times slower than predicted. Research at the University of Hamburg traced this problem to microscopic imperfections in the crystal structure of the wires, which caused the domains to become "stuck". Using an X-ray microscope to directly image the boundaries between the domains, the researchers found that domain walls could be moved by pulses as short as a few nanoseconds when these imperfections were absent. This corresponds to a macroscopic performance of about 110 m/s.
The voltage required to drive the domains along the racetrack would be proportional to the length of the wire. The current density must be sufficiently high to push the domain walls (as in electromigration).
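The proportionality between drive voltage and wire length can be seen from Ohm's law. The derivation below is a sketch that assumes a uniform wire of length L and cross-sectional area A, with a resistivity ρ and an ohmic response; these symbols are introduced here for illustration and do not appear in the sources.

```latex
\begin{align}
  I &= J A                 && \text{current from current density } J \\
  R &= \frac{\rho L}{A}    && \text{resistance of a uniform wire} \\
  V &= I R = (J A)\,\frac{\rho L}{A} = J \rho L
\end{align}
```

For a fixed current density J, the cross-section cancels and the required voltage grows linearly with the wire length L.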
- Giant magnetoresistance (GMR) effect
- Magnetoresistive random-access memory (MRAM)
- Spin transistor
- Spintronics Devices Research, Magnetic Racetrack Memory Project
- Masamitsu Hayashi et al. (April 2008). "Current-Controlled Magnetic Domain-Wall Nanowire Shift Register". Science. 320 (5873): 209–211. doi:10.1126/science.1154587.
- "ITRS 2011". Retrieved 8 November 2012.
- Parkin; et al. (11 April 2008). "Magnetic Domain-Wall Racetrack Memory". Science. 320: 190. doi:10.1126/science.1145799.
- 1 Tbit/in² is approx. 650nm²/bit.
- "A Survey of Techniques for Architecting Processor Components using Domain Wall Memory", ACM JETC, 2016
- 'Racetrack' memory could gallop past the hard disk