HyperTransport

HyperTransport (HT), formerly known as Lightning Data Transport (LDT), is a technology for interconnection of computer processors. It is a bidirectional serial/parallel high-bandwidth, low-latency point-to-point link that was introduced on April 2, 2001.^[1] The HyperTransport Consortium is in charge of promoting and developing HyperTransport technology.

HyperTransport is best known as the system bus architecture of modern AMD central processing units (CPUs) and the associated Nvidia nForce motherboard chipsets. HyperTransport has also been used by IBM and Apple for the Power Mac G5 machines, as well as a number of modern MIPS systems.

The current specification, HTX 3.1, remained competitive in 2014 with high-speed DDR4 RAM (2666 and 3200 MT/s, or about 10.4 GB/s and 12.8 GB/s) and with slower terabyte-scale technology (around 1 GB/s, similar to high-end PCIe SSDs and ULLtraDIMM flash RAM). This is a wider range of RAM speeds on a common CPU bus than any Intel front-side bus supports. Intel technologies require each speed range of RAM to have its own interface, resulting in a more complex motherboard layout but fewer bottlenecks. HTX 3.1 at 26 GB/s can continue to serve as a unified bus for as many as four DDR4 sticks running at the fastest proposed speeds. Beyond that, DDR4 RAM may require two or more HTX 3.1 buses, diminishing its value as a unified transport.

Overview

Links and rates

HyperTransport comes in four versions (1.x, 2.0, 3.0, and 3.1), which run from 200 MHz to 3.2 GHz. It is a DDR or "double data rate" connection, meaning it sends data on both the rising and falling edges of the clock signal. This allows for a maximum data rate of 6400 MT/s when running at 3.2 GHz. The operating frequency is autonegotiated with the motherboard chipset (northbridge).

HyperTransport supports an autonegotiated bit width, ranging from 2 to 32 bits per link; there are two unidirectional links per HyperTransport bus. With version 3.1, using full 32-bit links at the specification's maximum operating frequency, the theoretical transfer rate is 25.6 GB/s (3.2 GHz × 2 transfers per clock cycle × 32 bits per link) per direction, or 51.2 GB/s aggregate throughput. This makes it faster than most existing bus standards for PC workstations and servers, as well as most bus standards for high-performance computing and networking.
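
The peak figures quoted above are a product of clock frequency, transfers per clock cycle, and link width. The following minimal sketch reproduces that arithmetic. The function name and parameters are illustrative only and are not taken from any HyperTransport specification; the result is a theoretical maximum that ignores packet overhead.

```python
def ht_peak_bandwidth_gb_s(clock_ghz: float, width_bits: int,
                           transfers_per_clock: int = 2) -> float:
    """Theoretical peak bandwidth per direction in GB/s (1 GB = 10**9 bytes).

    Hypothetical helper: clock_ghz is the link clock, width_bits the
    negotiated link width, and transfers_per_clock is 2 for a DDR link.
    """
    if not 2 <= width_bits <= 32:
        raise ValueError("link width must be between 2 and 32 bits")
    gigatransfers_per_s = clock_ghz * transfers_per_clock
    return gigatransfers_per_s * width_bits / 8  # 8 bits per byte


per_direction = ht_peak_bandwidth_gb_s(3.2, 32)
aggregate = 2 * per_direction  # two unidirectional links per bus
print(f"{per_direction:.1f} GB/s per direction, {aggregate:.1f} GB/s aggregate")
# prints "25.6 GB/s per direction, 51.2 GB/s aggregate"
```

The same function applied to a 16-bit link at 800 MHz gives 3.2 GB/s per direction.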

Links of various widths can be mixed together in a single system configuration, for example one 16-bit link to another CPU and one 8-bit link to a peripheral device. This allows a wider interconnect between CPUs and a lower-bandwidth interconnect to peripherals as appropriate. HyperTransport also supports link splitting, where a single 16-bit link can be divided into two 8-bit links. The technology also typically has lower latency than other solutions due to its lower overhead.

Electrically, HyperTransport is similar to low-voltage differential signaling (LVDS) operating at 1.2 V.^[2] HyperTransport 2.0 added post-cursor transmitter deemphasis. HyperTransport 3.0 added scrambling and receiver phase alignment as well as optional transmitter precursor deemphasis.

Packet-oriented

HyperTransport is packet-based, where each packet consists of a set of 32-bit words, regardless of the physical width of the link. The first word in a packet always contains a command field. Many packets contain a 40-bit address. An additional 32-bit control packet is prepended when 64-bit addressing is required. The data payload is sent after the control packet. Transfers are always padded to a multiple of 32 bits, regardless of their actual length.
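
The padding rule can be illustrated with a short calculation. This is a sketch of the rounding arithmetic only, not of the actual packet layout; the names `WORD_BITS`, `padded_words`, and `padded_bits` are hypothetical.

```python
WORD_BITS = 32  # packets are built from 32-bit words


def padded_words(payload_bits: int) -> int:
    """Number of 32-bit words needed to carry payload_bits, rounding up."""
    if payload_bits < 0:
        raise ValueError("payload size cannot be negative")
    return (payload_bits + WORD_BITS - 1) // WORD_BITS


def padded_bits(payload_bits: int) -> int:
    """Size on the link after padding to a multiple of 32 bits."""
    return padded_words(payload_bits) * WORD_BITS


print(padded_bits(40))  # a 40-bit quantity occupies two words: prints 64
print(padded_bits(32))  # already word-aligned: prints 32
```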

HyperTransport packets enter the interconnect in segments known as bit times. The number of bit times required depends on the link width. HyperTransport also supports system management messaging, signaling interrupts, issuing probes to adjacent devices or processors, I/O transactions, and general data transactions. There are two kinds of write commands supported: posted and non-posted. Posted writes do not require a response from the target. This is usually used for high bandwidth devices such as uniform memory access traffic or direct memory access transfers. Non-posted writes require a response from the receiver in the form of a "target done" response. Reads also require a response, containing the read data. HyperTransport supports the PCI consumer/producer ordering model.
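
The dependence of bit times on link width can be sketched as follows. The sketch assumes that one bit time moves one bit on every signal line of the link, so a link of width w carries w bits per bit time, and that the valid widths are 2, 4, 8, 16, and 32 bits; both points are assumptions of this example rather than statements from the text above, and the function name is hypothetical.

```python
WORD_BITS = 32  # packets consist of 32-bit words
ASSUMED_WIDTHS = (2, 4, 8, 16, 32)  # assumed set of valid link widths


def bit_times(packet_words: int, link_width_bits: int) -> int:
    """Bit times needed to send a packet of packet_words 32-bit words."""
    if packet_words < 1:
        raise ValueError("a packet has at least one word")
    if link_width_bits not in ASSUMED_WIDTHS:
        raise ValueError("unsupported link width in this sketch")
    # Every assumed width divides 32, so the division is exact.
    return packet_words * WORD_BITS // link_width_bits


for width in (2, 8, 16, 32):
    print(f"{width:2d}-bit link: {bit_times(1, width)} bit times per word")
```

Under these assumptions a single 32-bit word takes 16 bit times on a 2-bit link and one bit time on a 32-bit link.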

Power-managed

HyperTransport also facilitates power management as it is compliant with the Advanced Configuration and Power Interface specification. This means that changes in processor sleep states (C states) can signal changes in device states (D states), e.g. powering off disks when the CPU goes to sleep. HyperTransport 3.0 added further capabilities to allow a centralized power management controller to implement power management policies.

Applications

Front-side bus replacement

The primary use for HyperTransport is to replace the Intel-defined front-side bus, which is different for every type of Intel processor. For instance, a Pentium cannot be plugged into a PCI Express bus directly, but must first go through an adapter to expand the system. The proprietary front-side bus must connect through adapters for the various standard buses, like AGP or PCI Express. These are typically included in the respective controller functions, namely the northbridge and southbridge.

In contrast, HyperTransport is an open specification, published by a multi-company consortium. A single HyperTransport adapter chip will work with a wide spectrum of HyperTransport enabled microprocessors.

AMD uses HyperTransport to replace the front-side bus in their Opteron, Athlon 64, Athlon II, Sempron 64, Turion 64, Phenom, Phenom II and FX families of microprocessors.

Multiprocessor interconnect

Another use for HyperTransport is as an interconnect for NUMA multiprocessor computers. AMD uses HyperTransport with a proprietary cache coherency extension as part of their Direct Connect Architecture in their Opteron and Athlon 64 FX (Dual Socket Direct Connect (DSDC) Architecture) line of processors. The HORUS interconnect from Newisys extends this concept to larger clusters. The Aqua device from 3Leaf Systems virtualizes and interconnects CPUs, memory, and I/O.

Router or switch bus replacement

HyperTransport can also be used as a bus in routers and switches. Routers and switches have multiple network interfaces and must forward data between these ports as fast as possible. For example, a four-port, 1000 Mbit/s Ethernet router needs a maximum of 8000 Mbit/s of internal bandwidth (1000 Mbit/s × 4 ports × 2 directions); HyperTransport greatly exceeds the bandwidth this application requires. However, a 4 + 1 port 10 Gbit/s router would require 100 Gbit/s of internal bandwidth. Adding 8-antenna 802.11ac and the WiGig 60 GHz standard (802.11ad) raises the requirement further, and HyperTransport becomes more feasible (with anywhere between 20 and 24 lanes used for the needed bandwidth).
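
The internal bandwidth figures in this example come from multiplying port count, port rate, and the two directions of traffic. A minimal sketch of that worst-case estimate, with a hypothetical function name:

```python
def required_internal_bandwidth_mbit_s(ports: int, port_rate_mbit_s: int) -> int:
    """Worst-case internal bandwidth: every port sends and receives at line rate."""
    directions = 2
    return ports * port_rate_mbit_s * directions


print(required_internal_bandwidth_mbit_s(4, 1000))    # four 1000 Mbit/s ports: prints 8000
print(required_internal_bandwidth_mbit_s(5, 10_000))  # 4 + 1 ports at 10 Gbit/s: prints 100000
```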

Co-processor interconnect

The issue of latency and bandwidth between CPUs and co-processors has usually been the major stumbling block to their practical implementation. Recently, co-processors such as FPGAs have appeared that can access the HyperTransport bus and become first-class citizens on the motherboard. Current generation FPGAs from both main manufacturers (Altera and Xilinx) directly support the HyperTransport interface, and have IP Cores available. Companies such as XtremeData, Inc. and DRC take these FPGAs (Xilinx in DRC's case) and create a module that allows FPGAs to plug directly into the Opteron socket.

AMD started an initiative named Torrenza on September 21, 2006 to further promote the usage of HyperTransport for plug-in cards and coprocessors. This initiative opened their "Socket F" to plug-in boards such as those from XtremeData and DRC.

Add-on card connector (HTX and HTX3)

A connector specification that allows a slot-based peripheral to have direct connection to a microprocessor using a HyperTransport interface was released by the HyperTransport Consortium. It is known as HyperTransport eXpansion (HTX). Using a reversed instance of the same mechanical connector as a 16-lane PCI-Express slot (plus an x1 connector for power pins), HTX allows development of plug-in cards that support direct access to a CPU and DMA to the system RAM. The initial card for this slot was the QLogic InfiniPath InfiniBand HCA. IBM and HP, among others, have released HTX compliant systems.

The original HTX standard is limited to 16 bits and 800 MHz.^[3]

In August 2008, the HyperTransport Consortium released HTX3, which extends the clock rate of HTX to 2.6 GHz (5.2 GT/s) and retains backwards compatibility.^[4]

Testing

The "DUT" test connector^[5] is defined to enable standardized functional test system interconnection.

Implementations

AMD AMD64 and Direct Connect Architecture based CPUs
AMD chipsets
- AMD-8000 series
- AMD 480 series
- AMD 580 series
- AMD 690 series
- AMD 700 series
- AMD 800 series
- AMD 900 series
ATI chipsets
- ATI Radeon Xpress 200 for AMD Processor
- ATI Radeon Xpress 3200 for AMD Processor
Broadcom (then ServerWorks) HyperTransport SystemI/O Controllers
- HT-2000
- HT-2100
Cisco QuantumFlow Processors
ht_tunnel from OpenCores project (MPL licence)
IBM CPC925 and CPC945 (PowerPC 970 northbridges) chipsets
Loongson-3 MIPS processor
Nvidia nForce chipsets
- nForce Professional MCPs (Media and Communication Processor)
- nForce 3 series
- nForce 4 series
- nForce 500 series
- nForce 600 series
- nForce 700 series
- nForce 900 series
PMC-Sierra RM9000X2 MIPS CPU
Power Mac G5^[6]
Raza Thread Processors
SiByte MIPS CPUs from Broadcom
VIA chipsets K8 series

Frequency specifications

HyperTransport version	Year	Max. HT frequency	Max. link width	Max. aggregate bandwidth (bi-directional)	Max. bandwidth at 16-bit unidirectional (GB/s)	Max. bandwidth at 32-bit unidirectional* (GB/s)
1.0	2001	800 MHz	32-bit	12.8 GB/s	3.2	6.4
1.1	2002	800 MHz	32-bit	12.8 GB/s	3.2	6.4
2.0	2004	1.4 GHz	32-bit	22.4 GB/s	5.6	11.2
3.0	2006	2.6 GHz	32-bit	41.6 GB/s	10.4	20.8
3.1	2008	3.2 GHz	32-bit	51.2 GB/s	12.8	25.6

AMD Athlon 64, Athlon 64 FX, Athlon 64 X2, Athlon X2, Athlon II, Phenom, Phenom II, Sempron, Turion series and later use one 16-bit HyperTransport link. AMD Athlon 64 FX (1207) and Opteron use up to three 16-bit HyperTransport links. Common clock rates for these processor links are 800 MHz to 1 GHz (older single- and multi-socket systems on 754/939/940 links) and 1.6 GHz to 2.0 GHz (newer single-socket systems on AM2+/AM3 links, with most newer CPUs using 2.0 GHz). While HyperTransport itself is capable of 32-bit links, that width is not currently used by any AMD processor. Some chipsets do not even use the 16-bit width used by the processors. These include the Nvidia nForce3 150, nForce3 Pro 150, and the ULi M1689, which use a 16-bit HyperTransport downstream link but limit the HyperTransport upstream link to 8 bits.

Name

There has been some marketing confusion between the use of HT referring to HyperTransport and the later use of HT to refer to Intel's Hyper-Threading feature on some Pentium 4-based and the newer Nehalem and Westmere-based Intel Core microprocessors. Hyper-Threading is officially known as Hyper-Threading Technology (HTT) or HT Technology. Because of this potential for confusion, the HyperTransport Consortium always uses the written-out form: "HyperTransport."

References

[1] "API NetWorks Accelerates Use of HyperTransport Technology With Launch of Industry's First HyperTransport Technology-to-PCI Bridge Chip" (Press release). HyperTransport Consortium. 2001-04-02.
[2] Overview (PDF), Hyper transport.
[3] Emberson, David; Holden, Brian (2007-12-12). "HTX specification" (PDF): 4. Retrieved 2008-01-30.
[4] Emberson, David (2008-06-25). "HTX3 specification" (PDF): 4. Retrieved 2008-08-17.
[5] Holden, Brian; Meschke, Michael 'Mike'; Abu-Lebdeh, Ziad; D'Orfani, Renato. "DUT Connector and Test Environment for HyperTransport" (PDF).
[6] Steve Jobs, Apple (25 June 2003). "WWDC 2003 Keynote". YouTube. Retrieved 2009-10-16.

External links

HyperTransport Consortium (home).
Technology, HyperTransport.
Technical Specifications, HyperTransport.
Center of Excellence for HyperTransport, DE: Uni HD.

