Cray XMT: Difference between revisions

From Wikipedia, the free encyclopedia
m updated reference link from using http to https
Rescuing 3 sources and tagging 0 as dead. #IABot (v1.5beta)
Line 5:
The Threadstorm processors are plugged into systems otherwise identical to the [[Cray XT4]].


The primary advantage of these processors comes from efficiently masking memory access time. In a simplified model, at each clock cycle an instruction from one of the threads is executed and another memory request is queued, with the understanding that by the time the next round of execution is ready the requested data has arrived.<ref>{{cite journal |last1=Nieplocha |first1=Jarek |last2=Marquez |first2=Andres |last3=Petrini |first3=Fabrizio |last4=Chavarria-Miranda |first4=Daniel |date=2007 |title=Unconventional Architectures for High-Throughput Sciences |url=http://scidacreview.org/0703/pdf/hardware.pdf |journal=SciDAC Review |publisher=[[Pacific Northwest National Laboratory]] |issue=5, Fall 2007 |pages=46–50 |name-list-format=vanc |access-date=February 14, 2015 |deadurl=yes |archiveurl=https://web.archive.org/web/20150214174324/http://scidacreview.org/0703/pdf/hardware.pdf |archivedate=February 14, 2015 |df= }}</ref>
This is in contrast to many conventional architectures, which stall on memory access. The architecture excels in data-walking schemes where subsequent memory accesses cannot be easily predicted and thus would not be well suited to a conventional cache model.<ref name=yarcdatablog>{{cite web |url=http://www.cray.com/yarcdata/blog/?p=182 |title=Why is uRiKA So Fast on Graph-Oriented Queries? |author=<!--Staff writer; David.--> |date=November 14, 2012 |publisher=Cray, Inc. |access-date=February 14, 2015 |deadurl=yes |archiveurl=https://web.archive.org/web/20150214173827/http://www.cray.com/yarcdata/blog/?p=182 |archivedate=February 14, 2015 |df= }}</ref>


The Threadstorm processors only execute user code, on top of a simple [[BSD Unix]]-based [[microkernel]] called MTX; system [[Input/output|I/O]] is performed by Opteron processors running [[Linux]].<ref name=yarcdatablog /> This third-generation MTA system improves clock speed from 220 [[Frequency|MHz]] to 500&nbsp;MHz, the maximal processor count from 256 to 8192, and maximum memory to 512 [[Terabyte|TB]].
Line 20:
== External links ==
* [http://www.cray.com/Assets/PDF/products/xmt/CrayXMTBrochure.pdf Cray XMT product brochure]
* [https://archive.is/20121209180721/http://www-csag.ucsd.edu/teaching/cse294/20050711-eldorado.ppt Cray ''Eldorado'' presentation, 2005 (PowerPoint)]
* [http://www.cray.com/products/analytics Cray Urika product page]



Revision as of 06:57, 14 August 2017

The Cray XMT (codenamed Eldorado) is the third generation of the Cray MTA supercomputer architecture originally developed by Tera. The earlier generations were called the Cray MTA and the Cray MTA-2.[1] The XMT makes the MTA's multithreaded processors, now dubbed Threadstorm, compatible with the 1207-pin Socket F used by AMD Opteron processors.[2] The Threadstorm processors are plugged into systems otherwise identical to the Cray XT4.

The primary advantage of these processors comes from efficiently masking memory access time. In a simplified model, at each clock cycle an instruction from one of the threads is executed and another memory request is queued, with the understanding that by the time the next round of execution is ready the requested data has arrived.[3] This is in contrast to many conventional architectures, which stall on memory access. The architecture excels in data-walking schemes where subsequent memory accesses cannot be easily predicted and thus would not be well suited to a conventional cache model.[4]
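
The simplified model described above can be illustrated with a toy simulation. The sketch below is not a description of the actual Threadstorm scheduler: the function name `utilization`, its parameters, and the assumptions that every instruction issues one memory request and that threads are scanned round-robin are all hypothetical choices made for illustration. It only shows how the fraction of busy cycles grows as more threads become available to cover a fixed memory latency.

```python
def utilization(num_threads: int, memory_latency: int, cycles: int = 10_000) -> float:
    """Return the fraction of cycles in which some thread issues an instruction.

    Toy assumptions (not taken from the Threadstorm design):
    - every instruction queues one memory request,
    - a request takes `memory_latency` cycles to return,
    - a thread may issue again only once its request has returned,
    - at most one instruction issues per cycle, chosen round-robin.
    """
    ready_at = [0] * num_threads  # cycle at which each thread may issue again
    issued = 0
    next_thread = 0  # where the round-robin scan starts this cycle

    for cycle in range(cycles):
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                # Issue one instruction and queue its memory request.
                ready_at[t] = cycle + memory_latency
                issued += 1
                next_thread = (t + 1) % num_threads
                break
        # If no thread was ready, the processor stalls for this cycle.

    return issued / cycles


if __name__ == "__main__":
    for threads in (1, 50, 100):
        print(threads, utilization(threads, memory_latency=100))
```

In this model a single thread leaves the processor idle for nearly the whole latency of each request, while a number of threads at least equal to the latency (in cycles) keeps an instruction issuing on every cycle, without relying on a cache to predict the next access.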

The Threadstorm processors only execute user code, on top of a simple BSD Unix-based microkernel called MTX; system I/O is performed by Opteron processors running Linux.[4] This third-generation MTA system improves clock speed from 220 MHz to 500 MHz, the maximal processor count from 256 to 8192, and maximum memory to 512 TB.

The architecture has been reworked in CPUs such as Threadstorm4,[5] which is used by the XMT's technological successor: Cray's Urika line of big data appliances. As of 2016, the future of the line is unclear due to competition from commodity processors such as Intel's Xeon.[6]

See also

Barrel processor

References

  1. ^ "Cray History". Cray Inc. Archived from the original on July 12, 2014. Retrieved August 19, 2014.
  2. ^ Timothy Prickett Morgan (February 20, 2011). "Swiss boffins go nuts for Cray supers". The Register. Retrieved February 14, 2015.
  3. ^ Nieplocha, Jarek; Marquez, Andres; Petrini, Fabrizio; Chavarria-Miranda, Daniel (2007). "Unconventional Architectures for High-Throughput Sciences" (PDF). SciDAC Review (5, Fall 2007). Pacific Northwest National Laboratory: 46–50. Archived from the original (PDF) on February 14, 2015. Retrieved February 14, 2015.
  4. ^ a b "Why is uRiKA So Fast on Graph-Oriented Queries?". Cray, Inc. November 14, 2012. Archived from the original on February 14, 2015. Retrieved February 14, 2015.
  5. ^ Kopser, Andrew; Vollrath, Dennis (May 2011). Overview of the Next Generation Cray XMT (PDF). 53rd Cray User Group meeting, CUG 2011. Fairbanks, Alaska. Retrieved February 14, 2015.
  6. ^ "Cray CTO Connects The Dots On Future Interconnects". The Next Platform. 8 January 2016. Retrieved 2 May 2016. Steve Scott: You can do it just great with a Xeon. We are not planning on doing another ThreadStorm processor. But it does take some software technology that comes out of the ThreadStorm legacy.