
SSSE3

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by EduardoS (talk | contribs) at 23:56, 1 October 2006. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Supplemental Streaming SIMD Extensions 3 (SSSE3)[1] is Intel's name for the fourth iteration of the SSE instruction set; the name suggests that Intel considers it merely a revision of SSE3. Before Intel announced the official name, it was often referred to as SSE4. It has also been referred to by the code names Tejas New Instructions (TNI) and Merom New Instructions (MNI), after the first processor designs intended to support it. Introduced with Intel's Core microarchitecture, SSSE3 is available in the Xeon 5100 series (server and workstation) processors and the Intel Core 2 (notebook and desktop) processors. The earlier SIMD instruction sets on the x86 platform are, roughly from oldest to newest, MMX, 3DNow! (developed by AMD), 3DNow! Professional, SSE, SSE2, and SSE3.

SSSE3 contains 16 new instructions over SSE3; Intel's advertising material counts this as 32, since each instruction comes in forms that act on 64-bit MMX or 128-bit XMM registers.

CPUs with SSSE3

New Instructions

In the table below, satsw(X) (read as 'saturate to signed word') takes a signed integer X and converts it to -32768 if it is less than -32768, to +32767 if it is greater than +32767, and leaves it unchanged otherwise. As usual for the Intel architecture, bytes are 8 bits, words 16 bits, and dwords 32 bits; 'register' refers to an MMX or XMM vector register.
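The definition of satsw can be written out as a short scalar sketch. Python is used here purely for illustration; the function mirrors the notation above and is not an Intel intrinsic.

```python
def satsw(x: int) -> int:
    """Saturate a signed integer to the signed 16-bit word range."""
    # Clamp below at -32768 and above at +32767; values in range pass through.
    return max(-32768, min(32767, x))


if __name__ == "__main__":
    for value in (-40000, -32768, 123, 32767, 40000):
        print(value, "->", satsw(value))
```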

PSIGNB, PSIGNW, PSIGND (Packed Sign): negates each element of a register of bytes, words or dwords where the corresponding element of another register is negative, sets it to 0 where that element is zero, and leaves it unchanged where that element is positive.
PABSB, PABSW, PABSD (Packed Absolute Value): fills the elements of a register of bytes, words or dwords with the absolute values of the elements of another register.
PALIGNR (Packed Align Right): takes two registers, concatenates their values, and pulls out a register-length section starting at a byte offset given by an immediate value encoded in the instruction.
PSHUFB (Packed Shuffle Bytes): takes registers of bytes A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and replaces A with [a(b0) a(b1) a(b2) ...], where a(bi) is the byte of A selected by index bi; except that it replaces the ith entry with 0 if the top bit of bi is set.
PMULHRSW (Packed Multiply High with Round and Scale): treats the 16-bit words in registers A and B as signed fixed-point numbers with 15 fractional bits, between -1 and 1 (e.g. 0x4000 is treated as 0.5 and 0xa000 as -0.75), and multiplies them together with correct rounding.
PMADDUBSW (Multiply and Add Packed Signed and Unsigned Bytes): takes the bytes in registers A (treated as unsigned) and B (treated as signed), multiplies corresponding bytes, adds adjacent pairs of products, signed-saturates to words and stores. I.e. [a0 a1 a2 ...] pmaddubsw [b0 b1 b2 ...] = [satsw(a0*b0 + a1*b1) satsw(a2*b2 + a3*b3) ...]
PHSUBW, PHSUBD (Packed Horizontal Subtract, Words or Doublewords): takes registers A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and outputs [a0-a1 a2-a3 ... b0-b1 b2-b3 ...]
PHSUBSW (Packed Horizontal Subtract and Saturate Words): like PHSUBW, but outputs [satsw(a0-a1) satsw(a2-a3) ... satsw(b0-b1) satsw(b2-b3) ...]
PHADDW, PHADDD (Packed Horizontal Add, Words or Doublewords): takes registers A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and outputs [a0+a1 a2+a3 ... b0+b1 b2+b3 ...]
PHADDSW (Packed Horizontal Add and Saturate Words): like PHADDW, but outputs [satsw(a0+a1) satsw(a2+a3) ... satsw(b0+b1) satsw(b2+b3) ...]
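The word-arithmetic entries in the table can be illustrated with a scalar model. The following is a minimal sketch in Python, not real SIMD code: registers are modelled as Python lists with element 0 first, the function names simply reuse the mnemonics, and the helpers wrap16 and _pairs are hypothetical names introduced only for this example. It assumes that the non-saturating forms wrap around in 16 bits and that PMULHRSW rounds by adding 1 to the product shifted right by 14 bits, then shifting right once more.

```python
def satsw(x):
    """Saturate to the signed 16-bit range, as defined above the table."""
    return max(-32768, min(32767, x))


def wrap16(x):
    """Wrap to signed 16 bits (two's complement), as non-saturating ops do."""
    return (x + 0x8000) % 0x10000 - 0x8000


def _pairs(v):
    """Group [v0, v1, v2, v3, ...] into [(v0, v1), (v2, v3), ...]."""
    return [(v[i], v[i + 1]) for i in range(0, len(v), 2)]


def phaddw(a, b):
    return [wrap16(x + y) for x, y in _pairs(a) + _pairs(b)]


def phaddsw(a, b):
    return [satsw(x + y) for x, y in _pairs(a) + _pairs(b)]


def phsubw(a, b):
    return [wrap16(x - y) for x, y in _pairs(a) + _pairs(b)]


def phsubsw(a, b):
    return [satsw(x - y) for x, y in _pairs(a) + _pairs(b)]


def pmaddubsw(a, b):
    """a holds unsigned bytes (0..255), b holds signed bytes (-128..127)."""
    products = [x * y for x, y in zip(a, b)]
    return [satsw(p + q) for p, q in _pairs(products)]


def pmulhrsw(a, b):
    """a and b hold signed words read as fixed point with 15 fractional bits."""
    return [wrap16(((x * y >> 14) + 1) >> 1) for x, y in zip(a, b)]


if __name__ == "__main__":
    print(phaddw([1, 2, 3, 4], [10, 20, 30, 40]))
    print(phaddsw([32767, 1, 0, 0], [0, 0, 0, 0]))
    # 0.5 * 0.5 and -0.75 * 0.5 in fixed point, as in the PMULHRSW example
    print(pmulhrsw([0x4000, wrap16(0xA000)], [0x4000, 0x4000]))
```

In this model, phaddw and phaddsw differ only when a pair sum leaves the 16-bit range: the first wraps around, the second clamps.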

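The sign-manipulation and byte-rearrangement instructions in the table (PSIGN, PABS, PALIGNR, PSHUFB) can be modelled in the same scalar style. This is a sketch under stated assumptions rather than a definitive implementation: registers are Python lists with element 0 as the least significant element, the helper wrap is a hypothetical name, and for PALIGNR the first operand is assumed to form the upper half of the concatenation and the second operand the lower half.

```python
def wrap(x, bits):
    """Wrap x to a signed two's-complement integer of the given width."""
    half = 1 << (bits - 1)
    return (x + half) % (1 << bits) - half


def psign(a, b, bits=16):
    """Negate, zero or keep each element of a according to the sign of b."""
    out = []
    for x, s in zip(a, b):
        if s < 0:
            out.append(wrap(-x, bits))  # negating the minimum value wraps
        elif s == 0:
            out.append(0)
        else:
            out.append(x)
    return out


def pabs(a):
    """Absolute values; results are unsigned, so -128 as a byte yields 128."""
    return [abs(x) for x in a]


def palignr(a, b, imm):
    """Concatenate a (high) and b (low), then take len(a) bytes from offset imm."""
    n = len(a)
    padded = b + a + [0] * n      # zeros are shifted in past the top
    start = min(imm, 2 * n)       # offsets beyond both registers give all zeros
    return padded[start:start + n]


def pshufb(a, b):
    """Select bytes of a by the indices in b; a set top bit in b yields 0."""
    n = len(a)                    # 8 for an MMX register, 16 for XMM
    return [0 if (sel & 0x80) else a[sel & (n - 1)] for sel in b]


if __name__ == "__main__":
    print(psign([5, -7, 9], [-1, 0, 3]))
    print(palignr(list(range(8, 16)), list(range(0, 8)), 3))
    print(pshufb([10, 11, 12, 13, 14, 15, 16, 17],
                 [7, 0, 0x80, 1, 9, 2, 0xFF, 3]))
```

Only the low bits of each selector in pshufb are used as an index (3 bits for 8-byte registers, 4 bits for 16-byte registers), which is why the mask is derived from the register length.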
See also