Slurm Workload Manager
| Stable release | 18.08.1, 17.11.10 |
|---|---|
| Written in | C |
| Operating system | Linux, BSDs |
| Type | Job scheduler for clusters and supercomputers |
| License | GNU General Public License |
| Website | slurm |
The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like operating systems, used by many of the world's supercomputers and computer clusters.
It provides three key functions, illustrated by the short session after this list:
- allocating exclusive or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work,
- providing a framework for starting, executing, and monitoring work (typically a parallel job such as an MPI application) on a set of allocated nodes, and
- arbitrating contention for resources by managing a queue of pending jobs.
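As a concrete illustration, these three functions map directly onto Slurm's core user commands. A minimal sketch of a session; the node count, time limit, and program name are placeholders:

```bash
# 1) Allocate resources: request two nodes for 30 minutes.
salloc -N 2 -t 00:30:00

# 2) Start and monitor work: launch 8 tasks of a (hypothetical) MPI program
#    on the allocated nodes.
srun -n 8 ./mpi_app

# 3) Arbitrate contention: inspect the queue of pending and running jobs.
squeue -u $USER
```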
Slurm is the workload manager on about 60% of the TOP500 supercomputers.[citation needed]
Slurm uses a best-fit algorithm based on Hilbert curve scheduling or fat-tree network topology to optimize the locality of task assignments on parallel computers.[1]
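Topology-aware placement is driven by configuration. A minimal sketch using Slurm's tree topology plugin, assuming hypothetical switch and node names:

```
# slurm.conf: select the tree topology plugin
TopologyPlugin=topology/tree

# topology.conf: describe a two-level fat tree (all names are placeholders)
SwitchName=leaf1 Nodes=node[01-16]
SwitchName=leaf2 Nodes=node[17-32]
SwitchName=spine Switches=leaf[1-2]
```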
History
Slurm began development as a collaborative effort primarily by Lawrence Livermore National Laboratory, SchedMD,[2] Linux NetworX, Hewlett-Packard, and Groupe Bull as a free-software resource manager. It was inspired by the closed-source Quadrics RMS and shares a similar syntax. The name is a reference to the soda in Futurama.[3] Over 100 people around the world have contributed to the project. It has since evolved into a sophisticated batch scheduler capable of satisfying the requirements of many large computer centers.
As of November 2017, the TOP500 list of the most powerful computers in the world indicates that Slurm is the workload manager on six of the top ten systems, including the number 1 system, Sunway TaihuLight, with 10,649,600 computing cores.
Structure
Slurm's design is highly modular, with about 100 optional plugins. In its simplest configuration (sketched below), it can be installed and configured in a couple of minutes. More sophisticated configurations provide database integration for accounting, management of resource limits, and workload prioritization.
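A minimal single-cluster setup needs little more than a control host plus node and partition definitions. A sketch of an abridged slurm.conf, with placeholder host and partition names (not a complete production configuration):

```
# slurm.conf (abridged; all names are placeholders)
ClusterName=demo
SlurmctldHost=head01          # ControlMachine on older releases
NodeName=node[01-04] CPUs=16 State=UNKNOWN
PartitionName=batch Nodes=node[01-04] Default=YES MaxTime=INFINITE State=UP
```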
Notable features
Notable Slurm features include the following:[citation needed]
- No single point of failure, backup daemons, fault-tolerant job options
- Highly scalable (schedules up to 100,000 independent jobs on the 100,000 sockets of IBM Sequoia)
- High performance (up to 1000 job submissions per second and 600 job executions per second)
- Free and open-source software (GNU General Public License)
- Highly configurable with about 100 plugins
- Fair-share scheduling with hierarchical bank accounts
- Preemptive and gang scheduling (time-slicing of parallel jobs)
- Integrated with database for accounting and configuration
- Resource allocations optimized for network topology and on-node topology (sockets, cores and hyperthreads)
- Advanced reservation
- Idle nodes can be powered down
- Different operating systems can be booted for each job
- Scheduling for generic resources (e.g., graphics processing units)
- Real-time accounting down to the task level (identify specific tasks with high CPU or memory usage)
- Resource limits by user or bank account
- Accounting for power usage by job
- Support of IBM Parallel Environment (PE/POE)
- Support for job arrays (see the example batch script after this list)
- Job profiling (periodic sampling of each task's CPU use, memory use, power consumption, network and file system use)
- Sophisticated multifactor job prioritization algorithms
- Support for MapReduce+
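As an example combining two of these features, a batch script can pair a job array with a generic-resource (GPU) request. A minimal sketch; the program and input file names are placeholders:

```bash
#!/bin/bash
#SBATCH --job-name=array-demo
#SBATCH --array=1-10              # job array: ten independent tasks
#SBATCH --gres=gpu:1              # generic resource: one GPU per array task
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --output=demo_%A_%a.out   # %A = array job ID, %a = array index

# Each array task processes its own (hypothetical) input file.
srun ./my_program input_${SLURM_ARRAY_TASK_ID}.dat
```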
The following features were announced for version 14.11 of Slurm, released in November 2014:[4]
- Improved job array data structure and scalability
- Support for heterogeneous generic resources
- User options to set the CPU governor (see the sketch after this list)
- Automatic job requeue policy based on exit value
- Report API use by user, type, count and time consumed
- Communication gateway nodes improve scalability
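The CPU governor can be requested per job step through srun's --cpu-freq option. A minimal sketch; the task count and program name are placeholders:

```bash
# Request the "performance" CPU frequency governor for this job step.
srun --cpu-freq=Performance -n 4 ./compute_kernel
```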
Supported platforms
Slurm is primarily developed to work alongside Linux distributions, although there is also support for a few other POSIX-based operating systems, including BSDs (FreeBSD, NetBSD and OpenBSD).[5] Slurm also supports several unique computer architectures, including:
- IBM BlueGene/Q models, including the 20 petaflop IBM Sequoia
- Cray XT, XE and Cascade
- Tianhe-2, a 33.9 petaflop system with 32,000 Intel Ivy Bridge chips and 48,000 Intel Xeon Phi chips, for a total of 3.1 million cores
- IBM Parallel Environment
- Anton
License
Slurm is available under the GNU General Public License V2.
Commercial support
In 2010, the developers of Slurm founded SchedMD, which maintains the canonical source, provides development, level 3 commercial support and training services. Commercial support is also available from Bright Computing, Bull, Cray, and Science + Computing.
See also
- Job Scheduler and Batch Queuing for Clusters
- Beowulf cluster
- Maui Cluster Scheduler
- Open Source Cluster Application Resources (OSCAR)
- TORQUE
- Univa Grid Engine
References
- ^ Pascual, Jose Antonio; Navaridas, Javier; Miguel-Alonso, Jose (2009). Effects of Topology-Aware Allocation Policies on Scheduling Performance. Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science. Vol. 5798. pp. 138–144. doi:10.1007/978-3-642-04633-9_8. ISBN 978-3-642-04632-2.
- ^ "Slurm Commercial Support, Development, and Installation". SchedMD. Retrieved 2014-02-23.
- ^ "SLURM: Simple Linux Utility for Resource Management" (PDF). 23 June 2003. Retrieved 11 January 2016.
- ^ "Slurm - What's New". SchedMD. Retrieved 2014-08-29.
- ^ Slurm Platforms
Further reading
- Balle, Susanne M.; Palermo, Daniel J. (2008). Enhancing an Open Source Resource Manager with Multi-core/Multi-threaded Support. Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science. Vol. 4942. p. 37. doi:10.1007/978-3-540-78699-3_3. ISBN 978-3-540-78698-6.
- Jette, M.; Grondona, M. (June 2003). "SLURM: Simple Linux Utility for Resource Management" (PDF). Proceedings of ClusterWorld Conference and Expo. San Jose, California.
- Layton, Jeffrey B. (5 February 2009). "Caos NSA and Perceus: All-in-one Cluster Software Stack". Linux Magazine.
- Yoo, Andy B.; Jette, Morris A.; Grondona, Mark (2003). SLURM: Simple Linux Utility for Resource Management. Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science. Vol. 2862. p. 44. doi:10.1007/10968987_3. ISBN 978-3-540-20405-3.
SLURM Commands
The following is a list of useful commands available for SLURM. Some of these were built by CCR to allow easier reporting for users.
For usage information for these commands, use the --help flag (example: sinfo --help).
Use the Linux command 'man' for more information about most of these commands (example: man sinfo).
Bold-italicized text in the commands below indicates user-supplied information. Brackets indicate optional flags.
| Task | Command |
|---|---|
| List SLURM commands | slurmhelp |
| View information about SLURM nodes & partitions | sinfo [-p ***partition_name*** or -M ***cluster_name***] |
| List example SLURM scripts | ls -p /util/slurm-scripts \| less |
| Submit a job script for later execution | sbatch ***script-file*** |
| Cancel a pending or running job | scancel ***jobid*** |
| Check the state of a user's jobs | squeue --user=***username*** |
| Allocate compute nodes for interactive use | salloc |
| Run a command on allocated compute nodes | srun |
| Display node information | snodes [***node*** ***cluster/partition*** ***state***] |
| Launch an interactive job | fisbatch [various sbatch options] |
| List priorities of queued jobs | sranks |
| Get the efficiency of a running job | sueff ***user-name*** |
| Get SLURM accounting information for a user's jobs from start date to now | suacct ***start-date*** ***user-name*** |
| Get SLURM accounting and node information for a job | slist ***jobid*** |
| Get resource usage and accounting information for a user's jobs from start date to now | slogs ***start-date*** ***user-list*** |
| Get estimated starting times for queued jobs | stimes [various squeue options] |
| Monitor performance of a SLURM job | /util/ccrjobvis/slurmjobvis ***jobid*** |
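A typical workflow ties several of these commands together. A sketch; the script name and job ID are placeholders:

```bash
sbatch myjob.sh            # submit a batch script; Slurm prints the job ID
squeue --user=$USER        # check the state of your queued and running jobs
stimes                     # CCR helper: estimated start times for queued jobs
scancel 123456             # cancel the job if it is no longer needed
```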