C++ AMP

Original author(s)	Microsoft
Type	Library
License	Inconclusive
Website	docs.microsoft.com/en-us/cpp/parallel/amp/cpp-amp-cpp-accelerated-massive-parallelism

C++ Accelerated Massive Parallelism (C++ AMP) is a native programming model that contains elements that span the C++ programming language and its runtime library. It provides an easy way to write programs that compile and execute on data-parallel hardware, such as graphics cards (GPUs).

C++ AMP is a library implemented on DirectX 11 and an open specification from Microsoft for implementing data parallelism directly in C++. It is intended to make programming GPUs easy for the developer by supporting a range of expertise from none (in which case the system does its best) to being more finely controllable, but still portable. In Microsoft's implementation, code that cannot be run on GPUs will fall back onto one or more CPUs instead and use SSE instructions.^{[citation needed]} The Microsoft implementation is included in Visual Studio 2012, including debugger and profiler support.

The initial C++ AMP release from Microsoft requires at least Windows 7 or Windows Server 2008 R2.^[1] As C++ AMP is an open specification it is expected that in time implementations outside Microsoft will appear; one early example of this is Shevlin Park, Intel's experimental implementation of C++ AMP on Clang/LLVM and OpenCL.^[2]

On November 12, 2013 the HSA Foundation announced a C++ AMP compiler that outputs to OpenCL, Standard Portable Intermediate Representation (SPIR), and HSA Intermediate Language (HSAIL) supporting the current C++ AMP specification.^[3] The source is available at https://github.com/RadeonOpenCompute/hcc

Features

Microsoft added the restrict(amp) feature, which can be applied to any function (including lambdas) to declare that the function can be executed on a C++ AMP accelerator. The compiler will automatically generate a compute kernel, saving the boilerplate of management and having to use a separate language. The restrict keyword instructs the compiler to statically check that the function uses only those language features that are supported by most GPUs, for example, void myFunc() restrict(amp) {…} Microsoft or other implementer of the open C++ AMP specification could add other restrict specifiers for other purposes, including for purposes that are unrelated to C++ AMP.

Beyond the new language feature, the rest of C++ AMP is available through the <amp.h> header file in the concurrency namespace. The key C++ AMP classes are: array (container for data on an accelerator), array_view (wrapper for data), index (N-dimensional point), extent (N-dimensional size), accelerator (computational resource, such as a GPU, on which to allocate memory and execute), and accelerator_view (view of an accelerator). There is also a global function, parallel_for_each, which you use to write a C++ AMP parallel loop.

References

^ C++ AMP One-page summary
^ Shevlin Park: Implementing C++ AMP with Clang/LLVM and OpenCL
^ "Bringing C++AMP Beyond Windows via CLANG and LLVM". Retrieved January 9, 2014.

External links

C++ AMP : Language and Programming Model — Version 1.0 : August 2012
Parallel Programming in Native Code - C++ AMP Team Blog
http://hsafoundation.com/bringing-camp-beyond-windows-via-clang-llvm/ C++ AMP Support in CLANG and LLVM compiler
https://github.com/RadeonOpenCompute/hcc C++ AMP Support in CLANG and LLVM compiler

[summary-1] C++ AMP One-page summary

[shevlinPark-2] Shevlin Park: Implementing C++ AMP with Clang/LLVM and OpenCL

[3] "Bringing C++AMP Beyond Windows via CLANG and LLVM". Retrieved January 9, 2014.

[1]

[2]

[3]

v t e Parallel computing
General	Distributed computing Parallel computing Massively parallel Cloud computing High-performance computing Multiprocessing Manycore processor GPGPU Computer network Systolic array
Levels	Bit Instruction Thread Task Data Memory Loop Pipeline
Multithreading	Temporal Simultaneous (SMT) Simultaneous and heterogenous Speculative (SpMT) Preemptive Cooperative Clustered multi-thread (CMT) Hardware scout
Theory	PRAM model PEM model Analysis of parallel algorithms Amdahl's law Gustafson's law Cost efficiency Karp–Flatt metric Slowdown Speedup
Elements	Process Thread Fiber Instruction window Array
Coordination	Multiprocessing Memory coherence Cache coherence Cache invalidation Barrier Synchronization Application checkpointing
Programming	Stream processing Dataflow programming Models Implicit parallelism Explicit parallelism Concurrency Non-blocking algorithm
Hardware	Flynn's taxonomy SISD SIMD Array processing (SIMT) Pipelined processing Associative processing MISD MIMD Dataflow architecture Pipelined processor Superscalar processor Vector processor Multiprocessor symmetric asymmetric Memory shared distributed distributed shared UMA NUMA COMA Massively parallel computer Computer cluster Beowulf cluster Grid computer Hardware acceleration
APIs	Ateji PX Boost Chapel HPX Charm++ Cilk Coarray Fortran CUDA Dryad C++ AMP Global Arrays GPUOpen MPI OpenMP OpenCL OpenHMPP OpenACC Parallel Extensions PVM pthreads RaftLib ROCm UPC TBB ZPL
Problems	Automatic parallelization Deadlock Deterministic algorithm Embarrassingly parallel Parallel slowdown Race condition Software lockout Scalability Starvation
Category: Parallel computing

C++ AMP

Features

See also

References

Further reading

External links