FMA4 instruction set: Difference between revisions

From Wikipedia, the free encyclopedia
Conti (talk | contribs)
Removing Template:Future per its new guidelines. Feel free to leave comments at Template talk:Future#Guidelines.
Afog (talk | contribs)
No edit summary
{{future chip}}
The '''FMA4''' instruction set, announced by [[Advanced Micro Devices|AMD]] on May 1, 2009, is an extension to the 128-bit [[Streaming SIMD Extensions|SSE]] core instructions in the [[X86]] and [[AMD64]] instruction set for the [[Bulldozer (processor)|Bulldozer]] processor core, due to begin production in 2011<ref>{{cite web | url=http://support.amd.com/us/Processor_TechDocs/43479.pdf | title=AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions | date=[[May 1]] [[2009]] | publisher=[[AMD]]}}</ref>.


The '''FMA''' or '''FMA4''' instruction set is a future extension to the 128-bit [[Streaming SIMD Extensions|SIMD]] instructions in the [[X86]] instruction set. It was proposed by Intel in March 2008<ref>{{cite web | url=http://softwareprojects.intel.com/avx/ | title=Intel Software Network | publisher=Intel | accessdate=2008-04-05}}</ref> and later adopted by [[AMD]]<ref>{{cite web | url=http://support.amd.com/us/Processor_TechDocs/43479.pdf | title=AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions | date=[[May 1]] [[2009]] | publisher=[[AMD]]}}</ref>; in the meantime, Intel changed its plans to [[FMA3 instruction set|FMA3]].
The FMA4 instruction set, together with the [[XOP_instruction_set|XOP]] and [[CVT16_instruction_set|CVT16]] instruction sets, is a revision of the [[SSE5]] instruction set proposal announced on August 30, 2007. This revision makes the binary coding of the proposed new instructions compatible with the first published version of [[Intel|Intel's]] [[fused multiply-add]] instruction extensions. Intel later changed its specification, however, so compatibility between the FMA instructions on future AMD and Intel processors is currently uncertain.


The FMA4 instruction set, together with the [[XOP_instruction_set|XOP]] and [[CVT16_instruction_set|CVT16]] instruction sets, is a revision of the [[SSE5]] instruction set proposal announced on August 30, 2007. This revision makes the binary coding of the proposed new instructions compatible with the first published version of [[Intel|Intel's]] [[fused multiply-add]] instruction extensions. Intel later changed its specification, however, so compatibility between the FMA instructions on future AMD and Intel processors is currently uncertain.
The incompatibility concerns the issue of whether the instruction can have three or four different operands. The fused multiply-add operation has the form:

==New instructions==
The FMA4 instruction set contains [[Multiply-accumulate|fused multiply-and-add]] instructions for [[floating point]] scalar and [[SIMD]] operations.

==Compatibility issue==
The incompatibility between Intel and AMD concerns the issue of whether the instruction can have three or four different operands. The fused multiply-add operation has the form:


<math>d=a+b\times c</math>


The 4-operand form allows a, b, c and d to be four different registers, while the 3-operand form requires that d is the same register as either a, b or c. The 3-operand form makes the code shorter and the hardware implementation slightly simpler.
The 4-operand form (FMA4) allows a, b, c and d to be four different registers, while the 3-operand form ([[FMA3 instruction set|FMA3]]) requires that d is the same register as either a, b or c. The 3-operand form makes the code shorter and the hardware implementation slightly simpler.

==CPUs with FMA4==
* AMD
** [[Bulldozer (processor)|Bulldozer]] processor core, due to begin production in 2011<ref>{{cite web | url=http://support.amd.com/us/Processor_TechDocs/43479.pdf | title=AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions | date=[[May 1]] [[2009]] | publisher=[[AMD]]}}</ref>.
* Intel
** It is uncertain whether future Intel processors will support FMA4, due to Intel's announced change to FMA3.


==Timeline==
* August 2007: [[AMD]] announces the [[SSE5]] instruction set, which includes 3-operand fused multiply-add instructions. A new coding scheme (DREX) is introduced for allowing instructions to have three operands <ref>{{cite web | url=http://developer.amd.com/SSE5 | title=128-Bit SSE5 Instruction Set | publisher=[[Advanced Micro Devices|AMD]] Developer Central | accessdate=2008-01-28}}</ref>.
* April 2008: [[Intel]] announces their [[Advanced_Vector_Extensions|AVX]] and FMA instruction sets, including 4-operand fused multiply-add instructions. The coding of these instructions uses the new VEX coding scheme which is more flexible than AMD's DREX scheme <ref>{{cite web | url=http://softwarecommunity.intel.com/isn/downloads/intelavx/Intel-AVX-Programming-Reference-31943302.pdf | title=Intel Advanced Vector Extensions Programming Reference | publisher=[[Intel]] | accessdate=2008-04-05}}</ref>.
* April 2008: [[Intel]] announces their [[Advanced_Vector_Extensions|AVX]] and FMA instruction sets, including 4-operand fused multiply-add instructions. The coding of these instructions uses the new [[VEX prefix|VEX]] coding scheme which is more flexible than AMD's DREX scheme <ref>{{cite web | url=http://softwarecommunity.intel.com/isn/downloads/intelavx/Intel-AVX-Programming-Reference-31943302.pdf | title=Intel Advanced Vector Extensions Programming Reference | publisher=[[Intel]] | accessdate=2008-04-05}}</ref>.
* December 2008: Intel changes the specification for their FMA instructions from 4-operand to 3-operand instructions. The VEX coding scheme is still used <ref>{{cite web | url=http://software.intel.com/en-us/avx/ | title=Intel Advanced Vector Extensions Programming Reference | publisher=[[Intel]] | accessdate=2009-05-06}}</ref>.
* May 2009: AMD changes the specification of their FMA instructions from the 3-operand DREX form to the 4-operand VEX form, compatible with the April 2008 Intel specification rather than the December 2008 Intel specification.
* May 2009: AMD changes the specification of their FMA instructions from the 3-operand DREX form to the 4-operand VEX form, compatible with the April 2008 Intel specification rather than the December 2008 Intel specification<ref>{{cite web | url=http://forums.amd.com/devblog/blogpost.cfm?threadid=112934&catid=208 | title=Striking a balance | date=May 7, 2009 | publisher=Dave Christie, AMD Developer blogs | accessdate=2009-05-08}}</ref>.


It is currently uncertain whether the 3-operand VEX coded form (which we may call FMA3) or the 4-operand form (FMA4) will be the dominating standard in the future. It is also possible that future processors will support both forms.
It is currently uncertain whether the 3-operand VEX coded form (which we may call [[FMA3 instruction set|FMA3]]) or the 4-operand form (FMA4) will be the dominating standard in the future. It is also possible that future processors will support both forms.


==See also==
==See also==
* [[FMA3 instruction set|FMA3]]
* [[fused multiply-add]]
* [[SSE5]]

Revision as of 15:01, 4 June 2009

The FMA or FMA4 instruction set is a future extension to the 128-bit SIMD instructions in the X86 instruction set. It was proposed by Intel in March 2008[1] and later adopted by AMD[2]; in the meantime, Intel changed its plans to FMA3.

The FMA4 instruction set, together with the XOP and CVT16 instruction sets, is a revision of the SSE5 instruction set proposal announced on August 30, 2007. This revision makes the binary coding of the proposed new instructions compatible with the first published version of Intel's fused multiply-add instruction extensions. Intel later changed its specification, however, so compatibility between the FMA instructions on future AMD and Intel processors is currently uncertain.

New instructions

The FMA4 instruction set contains fused multiply-and-add instructions for floating point scalar and SIMD operations.
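As a rough illustration of what "fused" conventionally means, the sketch below contrasts a multiply-add rounded once with one rounded twice. It is not AMD's implementation: the helper names `fused_multiply_add` and `unfused_multiply_add` are hypothetical, the fused result is emulated with exact rational arithmetic, and it assumes finite double-precision inputs.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + b * c with a single rounding (assumes finite floats)."""
    # Fraction converts each float exactly, so the sum is exact;
    # float() then rounds the exact result once.
    return float(Fraction(a) + Fraction(b) * Fraction(c))


def unfused_multiply_add(a, b, c):
    """Compute a + b * c with two roundings: after the multiply and after the add."""
    product = b * c       # rounded to a double here
    return a + product    # rounded again here


# Inputs chosen so that b * c = 1 - 2**-60, which is not representable as a double.
a = -1.0
b = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30

fused = fused_multiply_add(a, b, c)
unfused = unfused_multiply_add(a, b, c)
```

With these inputs the separately rounded product becomes exactly 1.0, so the unfused sum cancels to zero, while the single-rounding version keeps the small residual.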

Compatibility issue

The incompatibility between Intel and AMD concerns whether the instruction can have three or four different operands. The fused multiply-add operation has the form:

    d = a + b × c

The 4-operand form (FMA4) allows a, b, c and d to be four different registers, while the 3-operand form (FMA3) requires that d is the same register as either a, b or c. The 3-operand form makes the code shorter and the hardware implementation slightly simpler.
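The difference between the two operand forms can be sketched with a toy register file. This is only a model under stated assumptions: the functions `fma4` and `fma3` and the register names `r0` to `r3` are hypothetical, the 3-operand version shows only the case where d is the same register as a, and ordinary Python arithmetic stands in for the hardware operation (the values are small enough that rounding plays no role).

```python
def fma4(regs, d, a, b, c):
    """4-operand form: d may be a register distinct from a, b and c."""
    regs[d] = regs[a] + regs[b] * regs[c]


def fma3(regs, d, b, c):
    """3-operand form, for the case d == a: the addend register is overwritten."""
    regs[d] = regs[d] + regs[b] * regs[c]


# 4-operand form: all three source registers survive the operation.
regs4 = {"r0": 2.0, "r1": 3.0, "r2": 4.0, "r3": 0.0}
fma4(regs4, "r3", "r0", "r1", "r2")

# 3-operand form: to preserve r0, it is first copied into the destination.
regs3 = {"r0": 2.0, "r1": 3.0, "r2": 4.0, "r3": 0.0}
regs3["r3"] = regs3["r0"]  # extra register move
fma3(regs3, "r3", "r1", "r2")
```

Both sequences leave 14.0 in `r3`. The 3-operand version needs the extra copy only when the overwritten source is still required afterwards; when it is not, the shorter encoding comes at no cost.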

CPUs with FMA4

  • AMD
    • Bulldozer processor core, due to begin production in 2011[3].
  • Intel
    • It is uncertain whether future Intel processors will support FMA4, due to Intel's announced change to FMA3.

Timeline

  • August 2007: AMD announces the SSE5 instruction set, which includes 3-operand fused multiply-add instructions. A new coding scheme (DREX) is introduced to allow instructions to have three operands[4].
  • April 2008: Intel announces its AVX and FMA instruction sets, including 4-operand fused multiply-add instructions. These instructions use the new VEX coding scheme, which is more flexible than AMD's DREX scheme[5].
  • December 2008: Intel changes the specification of its FMA instructions from 4-operand to 3-operand instructions. The VEX coding scheme is still used[6].
  • May 2009: AMD changes the specification of its FMA instructions from the 3-operand DREX form to the 4-operand VEX form, compatible with the April 2008 Intel specification rather than the December 2008 Intel specification[7].

It is currently uncertain whether the 3-operand VEX-coded form (here called FMA3) or the 4-operand form (FMA4) will become the dominant standard. It is also possible that future processors will support both forms.

See also

  • FMA3
  • fused multiply-add
  • SSE5

References

  1. ^ "Intel Software Network". Intel. Retrieved 2008-04-05.
  2. ^ "AMD64 Architecture Programmer's Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions" (PDF). AMD. May 1, 2009.
  3. ^ "AMD64 Architecture Programmer's Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions" (PDF). AMD. May 1, 2009.
  4. ^ "128-Bit SSE5 Instruction Set". AMD Developer Central. Retrieved 2008-01-28.
  5. ^ "Intel Advanced Vector Extensions Programming Reference" (PDF). Intel. Retrieved 2008-04-05.
  6. ^ "Intel Advanced Vector Extensions Programming Reference". Intel. Retrieved 2009-05-06.
  7. ^ "Striking a balance". Dave Christie, AMD Developer blogs. May 7, 2009. Retrieved 2009-05-08.