
Talk:Non-uniform memory access

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 216.136.25.72 (talk) at 00:49, 9 October 2008 (Where are the Sun, IBM, or other _real_ NUMA systems?). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Why No Sun / Fireplane?

This architecture predates Intel/AMD NUMA by quite a bit and is still very common in many datacenters. This article almost makes NUMA sound like a new idea or one that Intel/AMD invented recently.
For that matter, why is Intel/AMD the only mention here anyway? There are many other, larger scale, proven examples of NUMA. Opteron/HT is a tiny baby in comparison. It's silly that we're even talking about it.
For reference:
<a href=http://www.sun.com/servers/x64/x4500/arch-wp.pdf>Example of Opteron/HT architecture</a>
<a href=http://docs.sun.com/source/817-4136-13/1_Introduction.html#98130>REAL NUMA</a>
<a href=http://www.repton.co.uk/newsletter/repton_pages/docs/v490_v890_wp.pdf>Sun Fireplane diagrams</a>
"Current implementations of ccNUMA systems are multi processor systems based on the AMD Opteron processor." Were you born yesterday?


Opteron integrated memory management versus a new chipset to handle memory more efficiently.

The Opteron has a good idea, but I don't think they took it as far as they could have. With multiple procs, CPU cycles are still spent communicating with ALL the other processors whenever a proc needs memory that is held by a different proc. The dedicated memory for each CPU takes care of a lot of these snoops, but not all of them. This is why the Opterons STILL require a dedicated HyperTransport link for inter-CPU communication. Wasted CPU cycles are minimized, yes, but not as far as they could be.

Why not just rearchitect the chipset to maintain a table of which proc has which memory, to virtually eliminate CPU snoops altogether? A one-stop memory shop. This would eliminate the need for L3 cache altogether and possibly eliminate a northbridge chip as well.
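
For what it's worth, the "table" idea above resembles what is usually called directory-based cache coherence. Here is a minimal sketch of the difference, not a model of any real chipset: the names `Directory`, `read`, `write` and `snoop_write_messages` are made up for illustration, and the message counts assume one probe or invalidation per target CPU.

```python
from collections import defaultdict


class Directory:
    """Toy directory: maps a memory line to the set of CPUs caching it."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.sharers = defaultdict(set)  # line address -> CPU ids holding a copy

    def read(self, cpu, line):
        # A read records the CPU as a sharer of the line.
        self.sharers[line].add(cpu)

    def write(self, cpu, line):
        # Only CPUs recorded as sharers need an invalidation message.
        targets = self.sharers[line] - {cpu}
        self.sharers[line] = {cpu}  # the writer becomes the sole holder
        return len(targets)


def snoop_write_messages(num_cpus):
    # Assumption: with broadcast snooping, every other CPU is probed per write.
    return num_cpus - 1


if __name__ == "__main__":
    d = Directory(num_cpus=8)
    d.read(0, 0x40)
    d.read(3, 0x40)
    print("directory invalidations:", d.write(0, 0x40))
    print("snoop probes:", snoop_write_messages(8))
```

In this toy setup a write to a line shared by two of eight CPUs costs one invalidation through the directory versus seven probes with broadcast. The sketch ignores the cost of the directory lookup itself, which is the trade-off a real design would have to weigh.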

SGI's processors?

The article says:

Current implementations of ccNUMA systems are multi processor systems based on the AMD Opteron processor. Earlier ccNUMA approaches were systems based on the Alpha processor EV7 of Digital Equipment Corporation (DEC).

What about the MIPS and Itanium processors used in SGI ccNUMA systems? Ericfluger 12:11, 16 October 2006 (UTC)

AFAIK SGI actually invented ccNUMA. This was the base of their SN series, starting with the Origin 200 and Origin 2000 (aka SN-0) in 1996. They used MIPS R10000 CPUs; the 200 had only 4 of them, the 2000 had up to 512. The Origin 3000 (aka SN-1 or SN-MIPS) and 3900 completed the MIPS-based ccNUMA systems. Then came the Altix (aka SN-IA) with Itanium 2 processors. Actually, the Silicon Graphics page has most of this information, and it matches conversations I had with former SGI engineers. But since I don't have references handy, I won't edit the page. jschrod 11:27, 7 December 2006 (UTC)

Sandra Memory Bandwidth Benchmarks

I don't know if this should really go into the article, but even with the highly superior Core 2 processors today, AMD with its NUMA memory capability still outperforms Intel in memory-bandwidth benchmarks. For example: http://www.gamepc.com/labs/view_content.asp?id=o2000&page=6 - The preceding unsigned comment was added by DonPMitchell (talk • contribs) 22:07, 26 February 2007 (UTC).

Cache coherence needed for NUMA?

I would take issue with the statement "Although simpler to design and build, non-cache-coherent NUMA systems become prohibitively complex to program in the standard von Neumann architecture programming model". The ICL 2900 Series implemented NUMA in its Series 39 incarnation and it was very successful, and commercial programmers found it quite friendly to program. Semaphore instructions were introduced in the original 2900s and were always needed when using shared memory, so there wasn't a pile of packages hanging around, as with Unix, trying to do tricks with normal reads and writes. Series 39 nodes could be up to 500 meters apart and were connected using fibre optic bundles. -Dmcq 14:27, 4 April 2007 (UTC)

I believe IBM's Blue Gene is non-cache coherent NUMA; inter-thread communication only works via MPI. Konrad Schwarz 192.35.17.11 10:09, 17 October 2007 (UTC)

Burroughs???

The article very correctly states that Burroughs was a NUMA pioneer, but Sperry-Univac bought Burroughs (called it a merger; the result was Unisys) in the 1980s. Hard to understand how 'Burroughs' was doing ANYTHING in the 1990s - Burroughs did not EXIST in 1990. - The preceding unsigned comment was added by 64.86.28.237 (talk) 08:41, 15 April 2007 (UTC).