Comparison of cluster software

From Wikipedia, the free encyclopedia

The following tables compare general and technical information for notable computer cluster software. This software can be broadly divided into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).

General information

| Software | Maintainer | Category | Development status | Architecture | High-Performance/High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
|---|---|---|---|---|---|---|---|---|---|
| Amoeba | | | | | | MIT | | | |
| Base One Foundation Component Library | | | | | | Proprietary | | | |
| DIET | INRIA, SysFera, Open Source | All in one | | GridRPC, SPMD, Hierarchical and distributed architecture, CORBA | HTC/HPC | CeCILL | Unix-like, Mac OS X, AIX | Free | |
| Enduro/X | Mavimax, Ltd. | Job/Data Scheduler | actively developed | SOA Grid | HTC/HPC/HA | GPLv2 or Commercial | Linux, FreeBSD, MacOS, Solaris, AIX | Free / Cost | Yes |
| Ganglia | | Monitoring | actively developed | | | BSD | Unix, Linux, Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HP-UX | Free | |
| Globus Toolkit | Globus Alliance, Argonne National Laboratory | Job/Data Scheduler | actively developed | SOA Grid | | | Linux | Free | |
| Grid MP | Univa (formerly United Devices) | Job Scheduler | no active development | Distributed master/worker | HTC/HPC | Proprietary | Windows, Linux, Mac OS X, Solaris | Cost | |
| Apache Mesos | Apache | | actively developed | | | Apache license v2.0 | Linux | Free | Yes |
| Moab Cluster Suite | Adaptive Computing | Job Scheduler | actively developed | | HPC | Proprietary | Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms | Cost | Yes |
| NetworkComputer | Runtime Design Automation | | actively developed | | HTC/HPC | Proprietary | Unix-like, Windows | Cost | |
| OpenHPC | OpenHPC project | All in one | actively developed | | HPC | | Linux (CentOS) | Free | No |
| OpenLava | Teraproc | Job Scheduler | actively developed | Master/Worker, multiple admin/submit nodes | HTC/HPC | GPL | Linux | Free | Yes |
| PBS Pro | Altair | Job Scheduler | actively developed | Master/worker distributed with fail-over | HPC/HTC | AGPL or Proprietary | Linux, Windows | Free or Cost | Yes |
| Platform LSF | IBM Platform | Job Scheduler | actively developed | | HPC/HTC | Proprietary | Unix, Linux, Windows | Cost | |
| Rocks Cluster Distribution | Open Source/NSF grant | All in one | actively developed | | HTC/HPC | OpenSource | CentOS | Free | |
| Popular Power | | | | | | | | | |
| ProActive | INRIA, ActiveEon, Open Source | All in one | actively developed | Master/Worker, SPMD, Distributed Component Model, Skeletons | HTC/HPC | GPL | Unix-like, Windows, Mac OS X | Free | |
| RPyC | Tomer Filiba | | actively developed | | | MIT License | *nix/Windows | Free | |
| SLURM | SchedMD | Job Scheduler | actively developed | | HPC/HTC | GPL | Linux/*nix | Free | Yes |
| Oracle Grid Engine | Univa | Job Scheduler | active; development moved to Univa Grid Engine | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
| Son of Grid Engine | Open Source | Job Scheduler | | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Various | Linux | | |
| SynfiniWay | Fujitsu | | actively developed | | HPC/HTC | ? | Unix, Linux, Windows | Cost | |
| TORQUE Resource Manager | Adaptive Computing | Job Scheduler | actively developed | | | Proprietary | Linux, *nix | Cost | Yes |
| UniCluster | Univa | All in one | Functionality and development moved to UniCloud (see above) | | | | | Free | Yes |
| UNICORE | | | | | | | | | |
| Univa Grid Engine | Univa | Job Scheduler | actively developed | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
| Xgrid | Apple Computer | | | | | | | | |

Table explanation

  • Software: The name of the application that is described

Technical information

Software Implementation Language Authentication Encryption Integrity Global File System Global File System + Kerberos Heterogeneous/ Homogeneous exec node Jobs priority Group priority Queue type SMP aware Max exec node Max job submitted CPU scavenging Parallel job Job checkpointing
Enduro/X C/C++ OS Authentication GPG, AES-128, SHA1 None Any cluster Posix FS (gfs, gpfs, ocfs, etc.) Any cluster Posix FS (gfs, gpfs, ocfs, etc.) Heterogeneous OS Nice level OS Nice level SOA Queues, FIFO Yes OS Limits OS Limits Yes Yes No
HTCondor C++ GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous None, Triple DES, BLOWFISH None, MD5 None, NFS, AFS Not official, hack with ACL and NFS4 Heterogeneous Yes Yes Fair-share with some programmability basic (hard separation into different node) tested ~10000? tested ~100000? Yes MPI, OpenMP, PVM Yes
PBS Pro C/Python OS Authentication, Munge Any, e.g., NFS, Lustre, GPFS, AFS Limited availability Heterogeneous Yes Yes Fully configurable Yes tested ~50,000 Millions Yes MPI, OpenMP Yes
OpenLava C/C++ OS authentication None NFS Heterogeneous Linux Yes Yes Configurable Yes Yes, supports preemption based on priority Yes Yes
Platform LSF yes Yes, to start jobs (whether a job is suspended when the user returns is unclear) Yes
Slurm C Munge, None, Kerberos Heterogeneous Yes Yes Multifactor Fair-share yes tested 120k tested 100k No Yes Yes
Torque C SSH, munge None, any Heterogeneous Yes Yes Programmable Yes tested tested Yes Yes Yes
Univa Grid Engine C OS Authentication/Kerberos/Oauth2 Certificate Based Integrity Arbitrary, e.g. NFS, Lustre, HDFS, AFS AFS Fully heterogeneous Yes; automatically policy controlled (e.g. fair-share, deadline, resource dependent) or manual Yes; can be dependent on user groups as well as projects and is governed by policies Batch, interactive, checkpointing, parallel and combinations Yes, with core binding, GPU and Intel Xeon Phi support commercial deployments with many tens of thousands hosts >300K tested in commercial deployments Yes; can suspend job on interactive usage Yes, with support of arbitrary parallel environments such as OpenMPI, MPICH 1/2, MVAPICH 1/2, LAM, etc. Yes, with support for user, kernel or library level checkpointing environments

Table explanation

  • Software: The name of the application that is described
  • SMP aware:
    • basic: hard split into multiple virtual hosts
    • basic+: hard split into multiple virtual hosts, with some minimal/incomplete communication between virtual hosts on the same computer
    • dynamic: the resources of the computer (CPU/RAM) are split on demand
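
As a rough illustration of the difference between the basic and dynamic modes listed above, the sketch below models one node with a fixed number of cores. The class names, the slot layout, and the allocation policy are hypothetical and are not taken from any scheduler in the table; the basic+ mode is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class BasicNode:
    """'basic' mode (assumed model): the node is hard-split into equal virtual hosts."""
    cores: int
    virtual_hosts: int
    busy: list = field(init=False, default_factory=list)

    def __post_init__(self):
        # One flag per virtual host; each host runs at most one job here.
        self.busy = [False] * self.virtual_hosts

    @property
    def cores_per_host(self):
        return self.cores // self.virtual_hosts

    def submit(self, cores_needed):
        # A job must fit inside a single virtual host; hosts cannot be combined.
        if cores_needed > self.cores_per_host:
            return False
        for i, taken in enumerate(self.busy):
            if not taken:
                self.busy[i] = True
                return True
        return False


@dataclass
class DynamicNode:
    """'dynamic' mode (assumed model): cores are handed out on demand from one pool."""
    cores: int
    used: int = 0

    def submit(self, cores_needed):
        if self.used + cores_needed > self.cores:
            return False
        self.used += cores_needed
        return True


basic = BasicNode(cores=8, virtual_hosts=4)  # four 2-core virtual hosts
dynamic = DynamicNode(cores=8)

# The same three jobs (2, 4 and 2 cores) are offered to both nodes.
basic_results = [basic.submit(n) for n in (2, 4, 2)]
dynamic_results = [dynamic.submit(n) for n in (2, 4, 2)]
print(basic_results, dynamic_results)
```

In this toy model the 4-core job is rejected by the hard-split node, because no single virtual host is large enough, while the dynamic node accepts all three jobs until its 8 cores are exhausted.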

History and adoption

See also

Notes

External links