DeepSpeed


DeepSpeed
Original author(s): Microsoft Research
Developer(s): Microsoft
Initial release: May 18, 2020
Stable release: v0.3.16 / April 30, 2021
Repository: github.com/microsoft/DeepSpeed
Written in: Python, CUDA, C++
Type: Software library
License: MIT License
Website: deepspeed.ai

DeepSpeed is an open-source deep learning optimization library for PyTorch.[1] The library is designed to reduce computing power and memory use and to train large distributed models with better parallelism on existing computer hardware.[2][3] DeepSpeed is optimized for low-latency, high-throughput training. It includes the Zero Redundancy Optimizer (ZeRO) for training models with 100 billion or more parameters.[4] Features include mixed-precision training; single-GPU, multi-GPU, and multi-node training; and custom model parallelism. The DeepSpeed source code is licensed under the MIT License and is available on GitHub.[5]
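Features such as ZeRO and mixed-precision training are typically switched on through a JSON configuration file rather than through changes to model code. The following is a minimal sketch of how such a file might be generated. The helper name `build_deepspeed_config` and the chosen values are illustrative assumptions, and the set of keys that is honored can vary between DeepSpeed versions, so the library's own configuration documentation should be treated as authoritative.

```python
import json


def build_deepspeed_config(train_batch_size=32, zero_stage=2, use_fp16=True):
    """Return a dict shaped like a DeepSpeed JSON config.

    Hypothetical helper for illustration only; the values are assumptions,
    not recommended settings.
    """
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    return {
        # Global batch size across all GPUs and accumulation steps.
        "train_batch_size": train_batch_size,
        # Mixed-precision (half-precision) training.
        "fp16": {"enabled": use_fp16},
        # Zero Redundancy Optimizer; higher stages partition more state.
        "zero_optimization": {"stage": zero_stage},
    }


config = build_deepspeed_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

The resulting JSON text can be saved to a file and supplied to DeepSpeed when a training job is launched. Keeping these settings outside the model code means the same script can be rerun with a different ZeRO stage or precision setting.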

The DeepSpeed team has reported up to 6.2x higher throughput, 2.8x faster convergence, and 4.6x less communication.[6]

References

  1. "Microsoft Updates Windows, Azure Tools with an Eye on The Future". PCMag UK. May 22, 2020.
  2. Yegulalp, Serdar (February 10, 2020). "Microsoft speeds up PyTorch with DeepSpeed". InfoWorld.
  3. "Microsoft unveils 'fifth most powerful' supercomputer in the world". Neowin.
  4. "Microsoft trains world's largest Transformer language model". February 10, 2020.
  5. "microsoft/DeepSpeed". July 10, 2020 – via GitHub.
  6. "DeepSpeed: Accelerating large-scale model inference and training via system optimizations and compression". Microsoft Research. May 24, 2021. Retrieved June 19, 2021.
