ROCm
Developer(s) | AMD |
---|---|
Initial release | November 14, 2016 |
Stable release | 5.7 / September 16, 2023[1] |
Repository | Meta-repository on GitHub |
Written in | C, C++, Python, Fortran, Julia |
Middleware | HIP |
Engine | AMDgpu kernel driver, HIPCC, an LLVM-based compiler |
Operating system | Linux, Windows[2] |
Platform | Supported GPUs |
Predecessor | Close to Metal, Stream, HSA |
Size | <2 GiB |
Type | GPGPU libraries and APIs |
License | MIT License |
Website | www |
ROCm[3] is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high-performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP (GPU-kernel-based programming), OpenMP/Message Passing Interface (MPI) (directive-based programming), and OpenCL.
ROCm is free, libre and open-source software (except the GPU firmware blobs[4]) and is distributed under various licenses. ROCm is short for Radeon Open Compute platform.
Background
The first GPGPU software stack from ATI/AMD was Close to Metal, which became Stream.
ROCm was launched around 2016[5] with the Boltzmann Initiative.[6] The ROCm stack builds upon previous AMD GPU stacks: some tools trace back to GPUOpen, others to the Heterogeneous System Architecture (HSA).
Heterogeneous System Architecture Intermediate Language
HSAIL[7] was aimed at producing a middle-level, hardware-agnostic intermediate representation that could be JIT-compiled to the eventual hardware (GPU, FPGA...) using the appropriate finalizer. This approach was dropped for ROCm, which now builds only GPU code, using LLVM and its upstreamed AMDGPU backend,[8] although there is still research on such enhanced modularity with LLVM MLIR.[9]
Programming abilities
ROCm as a stack ranges from the kernel driver to the end-user applications. AMD has introductory videos about AMD GCN hardware,[10] and ROCm programming[11] via its learning portal.[12]
One of the best technical introductions to the stack and to ROCm/HIP programming is still, to date, to be found on Reddit.[13]
High-level programming
HIP programming
HIP(HCC) kernel language
Memory allocation
NUMA
Heterogeneous Memory Model and Shared Virtual Memory
ROCm code objects
Compute/Graphics interop
Low-level programming
Hardware support
ROCm is primarily targeted at discrete professional GPUs,[14] but unofficial support includes Vega-family and RDNA 2 consumer GPUs.
Accelerated Processing Units (APUs) are "enabled", but not officially supported. Getting ROCm to work on them is involved.[15]
Professional-grade GPUs
AMD Instinct accelerators are the first-class ROCm citizens, alongside the prosumer Radeon Pro GPU series: they mostly see full support.
The only consumer-grade GPU that has relatively equal support is, as of January 2022, the Radeon VII (GCN 5 - Vega).
Consumer-grade GPUs
Name of GPU series | Southern Islands | Sea Islands | Volcanic Islands | Arctic Islands/Polaris | Vega | Navi 1X | Navi 2X | |
---|---|---|---|---|---|---|---|---|
Released | Jan 2012 | Sep 2013 | Jun 2015 | Jun 2016 | Jun 2017 | Jul 2019 | Nov 2020 | |
Marketing Name | Radeon HD 7000 | Radeon Rx 200 | Radeon Rx 300 | Radeon RX 400/500 | Radeon RX Vega/Radeon VII(7 nm) | Radeon RX 5000 | Radeon RX 6000 | |
AMD support | ||||||||
Instruction set | GCN instruction set | RDNA instruction set | ||||||
Microarchitecture | GCN 1st gen | GCN 2nd gen | GCN 3rd gen | GCN 4th gen | GCN 5th gen | RDNA | RDNA 2 | |
Type | Unified shader model | |||||||
ROCm[16] | [17] | [18] | ||||||
OpenCL | 1.2 (on Linux: 1.1 (no Image support) with Mesa 3D) | 2.0 (Adrenalin driver on Win7+) (on Linux: 1.1 (no Image support) with Mesa 3D, 2.0 with AMD drivers or AMD ROCm) | 2.0 | 2.1[19] | ||||
Vulkan | 1.0 (Win 7+ or Mesa 17+) | 1.2 (Adrenalin 20.1, Linux Mesa 3D 20.0) | ||||||
Shader model | 5.1 | 5.1, 6.3 | 6.4 | 6.5 | ||||
OpenGL | 4.6 (on Linux: 4.6 (Mesa 3D 20.0)) | |||||||
Direct3D | 11 (11_1), 12 (11_1) | 11 (12_0), 12 (12_0) | 11 (12_1), 12 (12_1) | 11 (12_1), 12 (12_2) | ||||
/drm/amdgpu[a] | Experimental[20] |
- ^ DRM (Direct Rendering Manager) is a component of the Linux kernel.
Software ecosystem
Learning resources
AMD ROCm product manager Terry Deem gave a tour of the stack.[21]
Third-party integration
The main consumers of the stack are machine learning and high-performance computing/GPGPU applications.
Machine learning
Various deep learning frameworks have a ROCm backend:[22]
- PyTorch
- TensorFlow
- ONNX
- MXNet
- CuPy[23]
- MIOpen
- Caffe
- IREE (which uses LLVM Multi-Level Intermediate Representation (MLIR))
- llama.cpp
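As a rough illustration of what a ROCm backend looks like from the user's side, the following sketch checks whether an installed PyTorch was built against ROCm/HIP. It relies on the `torch.version.hip` attribute, which is `None` on builds without ROCm support. The helper name `describe_pytorch_backend` is hypothetical, and the function falls back gracefully when PyTorch is not installed.

```python
def describe_pytorch_backend():
    """Return a short description of the PyTorch build, if any (hypothetical helper)."""
    try:
        import torch  # optional dependency: only needed for this check
    except ImportError:
        return "PyTorch is not installed"
    # torch.version.hip holds the HIP version string on ROCm builds, None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version:
        # ROCm builds reuse the torch.cuda namespace for AMD GPUs.
        return f"ROCm build (HIP {hip_version}), GPU available: {torch.cuda.is_available()}"
    return "PyTorch is installed, but not as a ROCm build"

print(describe_pytorch_backend())
```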
Supercomputing
ROCm is gaining significant traction in the TOP500.[24] ROCm is used with the exascale supercomputers El Capitan[25][26] and Frontier.
Some related software can be found on the AMD Infinity Hub.
Other acceleration & graphics interoperation
Since version 3.0, Blender can use HIP compute kernels for its Cycles renderer.[27]
Other Languages
Julia
Julia has the AMDGPU.jl package,[28] which integrates with LLVM and select components of the ROCm stack. Instead of compiling code through HIP, AMDGPU.jl uses Julia's compiler to generate LLVM IR directly, which is later consumed by LLVM to generate native device code. AMDGPU.jl uses ROCr's HSA implementation to upload native code onto the device and execute it, similar to how HIP loads its own generated device code.
AMDGPU.jl also supports integration with ROCm's rocBLAS (for BLAS), rocRAND (for random number generation), and rocFFT (for FFTs). Future integration with rocALUTION, rocSOLVER, MIOpen, and certain other ROCm libraries is planned.
Software distribution
Official
ROCm software is currently spread across dozens of public GitHub repositories. Within the main public meta-repository, there is an XML manifest for each official release. The recommended way to synchronize with the stack locally is git-repo, a version control tool built on top of Git.[29]
Stack area | Public GitHub organisation |
---|---|
Low-level (mostly) | https://github.com/radeonopencompute |
Mid-level (mostly) | https://github.com/rocm-developer-tools |
High-level (mostly) | https://github.com/rocmsoftwareplatform/ |
AMD has started distributing containerized applications for ROCm, notably scientific research applications gathered under AMD Infinity Hub.[30]
AMD itself distributes packages tailored to various Linux distributions.
Third-party
There is a growing third-party ecosystem packaging ROCm.
Several Linux distributions package ROCm officially (natively), at various stages of completeness: Arch,[31] Gentoo,[32] Debian and Fedora,[33] GNU Guix, and NixOS.
Spack packages are also available.[34]
Components
There is one kernel-space component, ROCk; the rest of the stack, roughly a hundred components, is made of user-space modules.
The unofficial typographic policy is to use uppercase ROC followed by lowercase letters for low-level libraries, e.g. ROCt, and the contrary for user-facing libraries, e.g. rocBLAS.[35]
AMD is actively developing with the LLVM community, but upstreaming is not instantaneous and, as of January 2022, is still lagging.[36] AMD still officially packages various LLVM forks[37][38][9] for parts that are not yet upstreamed: compiler optimizations destined to remain proprietary, debug support, OpenMP offloading...
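The capitalisation convention can be expressed as a small check. The sketch below is only an illustration: the function name `classify_component` and the two regular expressions are assumptions derived from the two examples given (ROCt and rocBLAS), not an official rule, and real component names may not all follow it.

```python
import re

def classify_component(name):
    """Guess the stack level of a ROCm component from its capitalisation.

    Hypothetical helper: the patterns only encode the informal convention
    (uppercase ROC + lowercase suffix, or lowercase roc + uppercase suffix).
    """
    if re.fullmatch(r"ROC[a-z]+", name):
        return "low-level"      # e.g. ROCt, ROCr
    if re.fullmatch(r"roc[A-Z][A-Za-z]*", name):
        return "user-facing"    # e.g. rocBLAS, rocSOLVER
    return "unknown"

for name in ("ROCt", "ROCr", "rocBLAS", "rocSOLVER", "HIPIFY"):
    print(name, "->", classify_component(name))
```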
Low-level
ROCk – Kernel driver
ROCm – Device libraries
Support libraries implemented as LLVM bitcode. These provide various utilities and functions for math operations, atomics, queries for launch parameters, on-device kernel launch, etc.
ROCt – Thunk
The thunk is the user-space interface to the ROCk kernel driver. It is responsible for the queuing that goes into the stack.
ROCr – Runtime
The ROC runtime is a set of APIs/libraries that allows host applications to launch compute kernels. It is AMD's implementation of the HSA runtime API.[39] It is different from the ROC Common Language Runtime.
ROCm – CompilerSupport
ROCm code object manager is in charge of interacting with LLVM intermediate representation.
Mid-level
ROCclr Common Language Runtime
The common language runtime is an indirection layer adapting calls to ROCr on Linux and PAL on Windows. It used to be able to route between different compilers, such as the HSAIL compiler. It is now being absorbed by the upper indirection layers (HIP, OpenCL).
OpenCL
ROCm ships its Installable Client Driver (ICD) loader and an OpenCL[40] implementation bundled together. As of January 2022, ROCm 4.5.2 ships OpenCL 2.2, and is lagging behind the competition.[41]
The AMD implementation of HIP for its GPUs is called HIPAMD. There is also a CPU implementation, mostly for demonstration purposes.
HIPCC
HIP builds a `HIPCC` compiler that either wraps Clang and compiles with LLVM's open AMDGPU backend, or redirects to the Nvidia compiler.[42]
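The wrap-or-redirect behaviour can be sketched as a simple selection step. The snippet below is a simplified illustration, not the actual HIPCC driver: it assumes the platform is chosen through the `HIP_PLATFORM` environment variable with the values `amd` and `nvidia`, the default of `amd` is an assumption made for the example, and the table `COMPILERS` and function `pick_compiler` are hypothetical names.

```python
import os

# Hypothetical lookup table for illustration; the real driver handles
# many more cases (flags, include paths, device libraries, ...).
COMPILERS = {
    "amd": "clang++",   # Clang with the LLVM AMDGPU backend
    "nvidia": "nvcc",   # redirect to the Nvidia compiler
}

def pick_compiler(env=None):
    """Return the compiler a HIPCC-like wrapper would delegate to."""
    env = os.environ if env is None else env
    platform = env.get("HIP_PLATFORM", "amd")  # assumed default for this sketch
    try:
        return COMPILERS[platform]
    except KeyError:
        raise ValueError(f"unsupported HIP_PLATFORM: {platform!r}") from None

print(pick_compiler({"HIP_PLATFORM": "amd"}))
print(pick_compiler({"HIP_PLATFORM": "nvidia"}))
```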
HIPIFY
HIPIFY is a source-to-source compiling tool. It translates CUDA to HIP and the reverse, using either a Clang-based tool or a sed-like Perl script.
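The sed-like approach amounts to renaming identifiers in the source text. The toy sketch below illustrates the idea with a handful of CUDA runtime names and their HIP counterparts. It is not the HIPIFY implementation: the table `RENAMES` covers only a few names, the function `toy_hipify` is hypothetical, and the real tools handle far more cases.

```python
import re

# A few CUDA-to-HIP renames; the real tools know many more.
RENAMES = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
}

# Match whole identifiers only, so cudaMemcpy does not clobber
# the prefix of cudaMemcpyHostToDevice.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in sorted(RENAMES, key=len, reverse=True)) + r")\b"
)

def toy_hipify(source):
    """Rewrite known CUDA names to their HIP equivalents (toy version)."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)

cuda_src = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaFree(d_x);
"""
print(toy_hipify(cuda_src))
```

A purely textual rewrite like this cannot understand scopes or macros, which is why a Clang-based variant also exists.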
GPUFORT
Like HIPIFY, GPUFORT is a tool that compiles source code into other third-generation-language sources, allowing users to migrate from CUDA Fortran to HIP Fortran. Even more so than HIPIFY, it remains a research project.[43]
High-level
ROCm high-level libraries are usually consumed directly by application software, such as machine learning frameworks. Most of the following libraries are in the General Matrix Multiply (GEMM) category, at which GPU architectures excel.
The majority of these user-facing libraries come in two forms: hip for the indirection layer that can route to Nvidia hardware, and roc for the AMD implementation.[44]
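For reference, GEMM in the BLAS sense computes alpha·A·B + beta·C. The sketch below is a plain-Python reference version meant only to show what the operation computes. It is not how a GPU library implements it, and the function name `gemm` and its argument order are choices made for this example.

```python
def gemm(alpha, A, B, beta, C):
    """Return alpha * (A @ B) + beta * C for matrices given as lists of rows."""
    m, k, n = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A) or len(C) != m or any(len(row) != n for row in C):
        raise ValueError("incompatible matrix shapes")
    return [
        [alpha * sum(A[i][p] * B[p][j] for p in range(k)) + beta * C[i][j]
         for j in range(n)]
        for i in range(m)
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
print(gemm(2, A, B, 1, C))  # [[39, 45], [87, 101]]
```

Every output element is an independent dot product, which is what makes the operation map well onto the many parallel lanes of a GPU.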
rocBLAS / hipBLAS
rocBLAS and hipBLAS are central among the high-level libraries: they are the AMD implementation of Basic Linear Algebra Subprograms (BLAS). rocBLAS uses the Tensile library internally.
rocSOLVER / hipSOLVER
This pair of libraries constitutes the LAPACK implementation for ROCm and is strongly coupled to rocBLAS.
Utilities
- ROCm developer tools: Debug, tracer, profiler, System Management Interface, Validation suite, Cluster management.
- GPUOpen tools: GPU analyzer, memory visualizer...
- External tools: radeontop (TUI overview)
Comparison with competitors
ROCm competes with other stacks aimed at GPU computing: Nvidia CUDA and Intel oneAPI. Support is offered only for professional hardware. The software stack requires an extremely narrow set of libraries and system configurations, which makes it quite fragile: any change to the kernel, the libraries or the configuration can render the entire stack unusable. There is also no single dedicated place on the internet with a definitive set of installation instructions. Many sites claim to be "official", and the majority of them offer outdated installation procedures that waste the end user's time and effort. As no official support is offered for this stack, there is no recourse for this problem.
Nvidia CUDA
Nvidia's stack is closed-source up to and including cuBLAS and similar high-level libraries.
Nvidia vendors the Clang frontend and its Parallel Thread Execution (PTX) LLVM GPU backend as the Nvidia CUDA Compiler (NVCC).
There are open-source layers above it, for example RAPIDS.
Intel oneAPI
oneAPI is open source, and all the corresponding libraries are published on its GitHub page.
See also
- AMD Software – a general overview of AMD's drivers, APIs, and development endeavors.
- GPUOpen – AMD's complementary graphics stack
- AMD Radeon Software – AMD's software distribution channel
References
- ^ "ROCm 5.7 release". GitHub. September 16, 2023. Retrieved September 27, 2023.
- ^ "New HIP SDK helps democratize GPU Computing".
- ^ "Question: What does ROCm stand for? · Issue #1628 · RadeonOpenCompute/ROCm". Github.com. Retrieved January 18, 2022.
- ^ "Debian -- Details of package firmware-amd-graphics in buster". Packages.debian.org. Retrieved January 18, 2022.
- ^ "AMD @ SC16: Radeon Open Compute Platform (ROCm) 1.3 Released, Boltzmann Comes to Fruition". anandtech.com. Retrieved January 19, 2022.
- ^ "AMD @ SC15: Boltzmann Initiative Announced - C++ and CUDA Compilers for AMD GPUs". anandtech.com. Retrieved January 19, 2022.
- ^ "HSA Programmer's Reference Manual: HSAIL Virtual ISA and Programming Model, Compiler Writer, and Object Format (BRIG)" (PDF). HSA Foundation. May 2, 2018. Retrieved August 1, 2023.
- ^ "User Guide for AMDGPU Backend — LLVM 13 documentation". Llvm.org. Retrieved January 18, 2022.
- ^ a b "The LLVM Compiler Infrastructure". GitHub. January 19, 2022.
- ^ "Introduction to AMD GPU Hardware" – via www.youtube.com.
- ^ "Fundamentals of HIP Programming". Archived from the original on February 7, 2023.
- ^ "ROCm™ Learning Center". AMD.
- ^ "AMD ROCm / HCC programming: Introduction". December 26, 2018.
- ^ "AMD Documentation - Portal".
- ^ "Here's something you don't see every day: PyTorch running on top of ROCm on a 6800M (6700XT) laptop! Took a ton of minor config tweaks and a few patches but it actually functionally works. HUGE!". December 10, 2021.
- ^ "ROCm Getting Started Guide v5.2.3".
- ^ "HOW-TO: Stable Diffusion on an AMD GPU".
- ^ "Any update on 5700 Xt support?".
- ^ "AMD Radeon RX 6800 XT Specs". TechPowerUp. Retrieved January 1, 2021.
- ^ Larabel, Michael (December 7, 2016). "The Best Features of the Linux 4.9 Kernel". Phoronix. Retrieved December 7, 2016.
- ^ "ROCm presentation". HPCwire.com. July 6, 2020. Retrieved January 18, 2022.
- ^ "AMD Introduces Its Deep-Learning Accelerator Instinct MI200 Series GPUs". Infoq.com. Retrieved January 18, 2022.
- ^ "Installation".
- ^ "AMD Chips Away at Intel in World's Top 500 Supercomputers as GPU War Looms". November 16, 2020.
- ^ "El Capitan Supercomputer Detailed: AMD CPUs & GPUs to Drive 2 Exaflops of Compute".
- ^ "Livermore's el Capitan Supercomputer to Debut HPE 'Rabbit' Near Node Local Storage". February 18, 2021.
- ^ "Blender 3.0 takes support for AMD GPUs to the next level. Beta support available now!". Gpuopen.com. November 15, 2021. Retrieved January 18, 2022.
- ^ "AMD ROCm ⋅ JuliaGPU". juliagpu.org.
- ^ "ROCm Installation v4.3 — ROCm 4.5.0 documentation". Rocmdocs.amd.com. Retrieved January 18, 2022.
- ^ "Running Scientific Applications on AMD Instinct Accelerators Just Got Easier". HPCwire.com. October 18, 2021. Retrieved January 25, 2022.
- ^ "ROCm for Arch Linux". Github.com. January 17, 2022. Retrieved January 18, 2022.
- ^ "Gentoo Linux Packages Up AMD ROCm, Makes Progress On RISC-V, LTO+PGO Python". Phoronix.com. Retrieved January 18, 2022.
- ^ "Fedora & Debian Developers Look At Packaging ROCm For Easier Radeon GPU Computing Experience". Phoronix.com. Retrieved January 18, 2022.
- ^ Gamblin, Todd; LeGendre, Matthew; Collette, Michael R.; Lee, Gregory L.; Moody, Adam; de Supinski, Bronis R.; Futral, Scott (November 15, 2015). "The Spack Package Manager: Bringing Order to HPC Software Chaos" – via GitHub.
- ^ Bloor, Cordell. "20211221 Packaging session notes and small update". debian-ai@lists.debian.org (Mailing list). Retrieved January 18, 2022.
- ^ "[Debian official packaging] How is ROCm LLVM fork still needed? · Issue #2449 · ROCm-Developer-Tools/HIP". GitHub.
- ^ "Aomp - V 14.0-1". GitHub. January 22, 2022.
- ^ "The LLVM Compiler Infrastructure". GitHub. January 10, 2022.
- ^ "HSA Runtime Programmer's Reference Manual" (PDF). HSA Foundation. May 2, 2018. Retrieved August 1, 2023.
- ^ "Khronos OpenCL Registry - The Khronos Group Inc". www.khronos.org.
- ^ "List of OpenCL Conformant Products - The Khronos Group Inc". www.khronos.org. February 3, 2022.
- ^ "Figure 3. HIPCC compilation process illustration. The clang compiler".
- ^ "AMD Publishes Open-Source "GPUFORT" as Newest Effort to Help Transition Away from CUDA".
- ^ Maia, Julio; Chalmers, Noel; T. Bauman, Paul; Curtis, Nicholas; Malaya, Nicholas; McDougall, Damon; van Oostrum, Rene; Wolfe, Noah (May 2021). ROCm Library Support & Profiling Tools (PDF). AMD.
External links
- "ROCm official documentation". AMD. February 10, 2022.
- "ROCm Learning Center". AMD. January 25, 2022.
- "ROCm official documentation on the github super-project". AMD. January 25, 2022.
- "ROCm official documentation - pre 5.0". AMD. January 19, 2022.
- "GPU-Accelerated Applications with AMD Instinct Accelerators & AMD ROCm Software" (PDF). AMD. January 25, 2022.
- "AMD Infinity Hub". AMD. January 25, 2022. — Docker containers for scientific applications.