DSPs aren't exactly past-tense. There are more of them in use now than there are processors with SIMD units.
Other examples of early SIMD machines were:
- Xplor supercomputer, from Pyxsys, Inc., circa 2001
- Connection Machine, models 1 and 2 (CM-1 and CM-2), from Thinking Machines Corporation, circa 1985
- Zephyr DTC computer from Wavetracer, circa 1991
- Massively Parallel Processor (MPP), from NASA/Goddard Space Flight Center, circa 1983-1991
There were many others from this era as well. At least some of them deserve mention on the SIMD page.
Maximum popularity:
- SIMD is heavily used by the slowest, most complex video codecs:
XviD, DivX, H.263+, AVC (the best of 2004 was AVC NeroDigital), ...
The reason is that many users are P2P users, movie pirates, porn viewers, ... who need to shrink a 4.7 GB MPEG-2 DVD down to a 700 MB XviD/DivX/AVC MPEG-4 CD (for slow Internet connections) using the tool VirtualDub. Even with SIMD, the compression typically takes between 2 hours on the fastest PC and 12 hours on the slowest PC, depending on the codec used.
Another SIMD/MIMD architecture has emerged in stream processors. I am not sure where to put this update in the page (maybe a previous editor has an idea). I believe it's quite important that this page (the only one which is not a stub) also mentions the new paradigm's functionality. MaxDZ8 09:52, 27 October 2005 (UTC)
- $1 = $2 + $3
- Single instruction ? Yes: +
- Multiple data ? Yes: $1, $2, $3
There's no definition of what SIMD means. Please write one in the text. --Hdante 11:25, 4 March 2006 (UTC)
There is a bad link: http://www.teranex.com/support/docs/TeranexParallelProc.pdf
I'm surprised the Z80 isn't mentioned. It was arguably the first mass-market CPU with SIMD instructions, albeit not terribly well implemented ones (the main purpose appeared to be to save memory by allowing block copies to be implemented with one two-byte instruction. The instruction was refetched on every cycle, meaning that it didn't offer significant performance advantages over coding the loops by hand.) 22.214.171.124 13:11, 13 August 2006 (UTC)
- It isn't included because that's not really SIMD in the modern sense: it's really operating on just one chunk of data per clock cycle. In other words, it's executing the instruction many times, with one word processed per instruction executed. Most modern engineers use SIMD to refer to an architecture that processes several orthogonal chunks of data in parallel with a single instruction executed. This is a vague concept on an out-of-order CISC processor like an Intel (because what the heck does a "clock" mean anyway when a given instruction can take anywhere between two and twenty clocks?) but a very real one on something like the PlayStation 2's vector unit, where you really can add two groups of four floating point numbers to each other every clock. Collabi 19:12, 14 August 2006 (UTC)
- Also, the Z80 had a single ALU for processing the LDIR and LDDR instructions. The ALU was only involved in the inc/dec-then-repeat part, I suspect. Typical of a SIMD architecture is that there is a processing unit of some kind (the ALU in the case of the Z80) in each of the parallel branches. The LDIR and LDDR instructions are microcoded loops, not parallelisation concepts. Rick 11:13, July 6, 2015 (UTC). Preceding unsigned comment added by 126.96.36.199 (talk)
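To make the distinction drawn in the replies above concrete, here is a rough Python model. It is only a sketch, not a description of real Z80 or vector-unit behaviour: the function names, the step counter, and the 4-lane width are assumptions chosen for illustration. The first function models a repeated instruction that pushes one element per step through a single processing unit; the second models a SIMD add in which every step handles several elements at once.

```python
def repeated_scalar_add(a, b):
    """Hypothetical model: one element per step, as in a microcoded
    repeat loop that reuses a single ALU."""
    out, steps = [], 0
    for x, y in zip(a, b):
        out.append(x + y)  # the single data path handles one pair
        steps += 1
    return out, steps


def simd_add(a, b, lanes=4):
    """Hypothetical model: one step handles `lanes` elements, as if each
    lane had its own adder (the lane width of 4 is an assumption)."""
    if len(a) != len(b) or len(a) % lanes != 0:
        raise ValueError("inputs must have equal length, a multiple of `lanes`")
    out, steps = [], 0
    for i in range(0, len(a), lanes):
        # all `lanes` additions belong to the same modelled instruction
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        steps += 1
    return out, steps
```

For two 8-element inputs, both functions return the same sums, but the first model counts 8 steps and the second counts 2. That difference in steps per element, not the presence of a repeat prefix, is what the replies treat as the mark of SIMD.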
Why is the speedup 75% in the graph?
I agree, if it can do four times as many operations in the same time it's a 300% speedup. Also, does anyone else think the font in those images is hideous and inappropriate? Jrmrjnck (talk) 19:54, 27 February 2013 (UTC)
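The two percentages can be reconciled with a little arithmetic. The sketch below assumes (this is a guess about the graph, not something stated in it) that the 75% figure measures the reduction in running time, while the 300% figure measures the increase in throughput. The function names are made up for illustration.

```python
def throughput_increase_percent(ratio):
    """Percent more work done in the same time, given a throughput ratio."""
    return (ratio - 1.0) * 100.0


def time_reduction_percent(ratio):
    """Percent less time needed for the same work, given a throughput ratio."""
    return (1.0 - 1.0 / ratio) * 100.0


# Four times as many operations in the same time:
print(throughput_increase_percent(4.0))  # 300.0
print(time_reduction_percent(4.0))       # 75.0
```

Under that assumption both numbers describe the same 4x improvement, so the graph's label would be better worded as "75% less time" than as a "75% speedup".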