Ceph (software)

Original author(s): Inktank Storage (Sage Weil, Yehuda Sadeh Weinraub, Gregory Farnum, Josh Durgin, Samuel Just, Wido den Hollander)
Developer(s): Canonical, CERN, Cisco, Fujitsu, Intel, Red Hat, SanDisk, and SUSE[1]
Stable release: 14.2.0 "Nautilus"[2] / March 19, 2019
Preview release: 13.1.0 "Mimic"[3] / May 11, 2018
Written in: C++, Python[4]
Operating system: Linux, FreeBSD[5]
Type: Distributed object store

In computing, Ceph (pronounced /ˈsɛf/ or /ˈkɛf/) is a free-software storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-, block- and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure, scalability to the exabyte level, and free availability.

Ceph replicates data and makes it fault-tolerant,[7] using commodity hardware and requiring no specific hardware support. As a result of its design, the system is both self-healing and self-managing, aiming to minimize administration time and other costs.
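The idea of replicating each object on several commodity nodes, and re-replicating it when a node fails, can be illustrated with a small sketch. This is not Ceph's placement algorithm: it uses rendezvous (highest-random-weight) hashing as a simple stand-in, and the node names, the object name, and the `place` helper are all hypothetical.

```python
import hashlib


def place(object_name, nodes, replicas=3):
    """Pick `replicas` distinct nodes for an object.

    Stand-in technique: rendezvous hashing. Every node gets a score for the
    object, and the highest-scoring nodes hold the copies.
    """
    def score(node):
        digest = hashlib.sha256(f"{node}:{object_name}".encode()).hexdigest()
        return int(digest, 16)

    ranked = sorted(nodes, key=score, reverse=True)
    return ranked[:replicas]


# Hypothetical five-node cluster.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
before = place("photo.jpg", nodes)

# Simulate the failure of one replica holder and recompute the placement.
failed = before[0]
survivors = [n for n in nodes if n != failed]
after = place("photo.jpg", survivors)

print("before failure:", before)
print("after failure: ", after)
```

In this sketch the two surviving copies stay where they were, and only one new copy has to be created on another node, which is the kind of automatic repair the term "self-healing" refers to.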


A high-level overview of Ceph's internal organization[8]:4

Ceph employs five distinct kinds of daemons:[8]

  • Cluster monitors (ceph-mon), which keep track of active and failed cluster nodes, cluster configuration, and information about data placement and global cluster state.
  • Object storage devices (ceph-osd), which either use direct, journaled disk storage (named BlueStore,[9] since the v12.x release) or store the content of files in a filesystem (preferably XFS; this storage is named Filestore).[10]
  • Metadata servers (ceph-mds), which cache and broker access to inodes and directories inside a CephFS filesystem.
  • HTTP gateways (ceph-rgw), which expose the object storage layer through an interface compatible with the Amazon S3 or OpenStack Swift APIs.
  • Managers (ceph-mgr), which perform cluster monitoring, bookkeeping, and maintenance tasks, and interface with external monitoring and management systems (e.g. balancer, dashboard, Prometheus, Zabbix plugin).[11]

All of these are fully distributed, and may run on the same set of servers. Clients with different use cases directly interact with different subsets of them.[12]

Ceph stripes individual files across multiple nodes to achieve higher throughput, similar to how RAID 0 stripes partitions across multiple hard drives. Adaptive load balancing is supported, whereby frequently accessed objects are replicated over more nodes.[citation needed]

As of September 2017, BlueStore is the default and recommended storage type for production environments.[13] BlueStore is Ceph's own storage implementation; it provides better latency and configurability than the Filestore backend and avoids the shortcomings of filesystem-based storage, which involves additional processing and caching layers. The Filestore backend is still considered useful and very stable; XFS is the recommended underlying filesystem type for production environments, while Btrfs is recommended for non-production environments. ext4 filesystems are not recommended because of the resulting limitations on the maximum RADOS object length.[14]
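The RAID 0-style striping of a file across nodes can be illustrated with a minimal sketch. It assumes plain round-robin distribution of fixed-size stripe units; the function names and parameters are hypothetical, and Ceph's real layout and placement logic are more involved.

```python
def stripe(data: bytes, stripe_unit: int, node_count: int):
    """Split data into stripe units and deal them round-robin over nodes."""
    nodes = [[] for _ in range(node_count)]
    for start in range(0, len(data), stripe_unit):
        unit = data[start:start + stripe_unit]
        nodes[(start // stripe_unit) % node_count].append(unit)
    return nodes


def unstripe(nodes):
    """Reassemble the original bytes by reading units in round-robin order."""
    out = []
    depth = max((len(node) for node in nodes), default=0)
    for row in range(depth):
        for node in nodes:
            if row < len(node):
                out.append(node[row])
    return b"".join(out)


# Ten bytes, two-byte stripe units, three hypothetical nodes.
layout = stripe(b"ABCDEFGHIJ", stripe_unit=2, node_count=3)
print(layout)            # per-node lists of stripe units
print(unstripe(layout))  # the original byte string
```

Because consecutive stripe units land on different nodes, a large sequential read or write can be served by several nodes in parallel, which is where the throughput gain comes from.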

Object storage

An architecture diagram showing the relations between components of the Ceph storage platform

Ceph implements distributed object storage. Ceph's software libraries provide client applications with direct access to the reliable autonomic distributed object store (RADOS) object-based storage system, and also provide a foundation for some of Ceph's features, including RADOS Block Device (RBD), RADOS Gateway, and the Ceph File System.

The "librados" software libraries provide access in C, C++, Java, PHP, and Python. The RADOS Gateway also exposes the object store through a RESTful interface that is compatible with both the Amazon S3 and OpenStack Swift APIs.
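As an illustration of the Python access path, the sketch below writes one object and reads it back through the `rados` Python binding that ships with Ceph. The configuration path, the pool name `example-pool`, and the object name are assumptions made for the example; `main()` needs a reachable cluster and is therefore only defined, not called.

```python
def store_and_fetch(ioctx, name: str, payload: bytes) -> bytes:
    """Write one object and read it back through an I/O context."""
    ioctx.write_full(name, payload)  # replace the whole object
    return ioctx.read(name, length=len(payload))


def main():
    # Imported here so the helper above can be used without a Ceph install.
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # assumed path
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("example-pool")  # hypothetical pool name
        try:
            print(store_and_fetch(ioctx, "greeting", b"hello"))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

On a host with a working cluster and the binding installed, calling `main()` would run the round trip; the helper itself only relies on the I/O context's `write_full` and `read` methods.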

Block storage

Ceph's object storage system allows users to mount Ceph as a thin-provisioned block device. When an application writes data to Ceph using a block device, Ceph automatically stripes and replicates the data across the cluster. Ceph's RADOS Block Device (RBD) also integrates with the Kernel-based Virtual Machine (KVM).
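Thin provisioning means that the device advertises its full size up front but consumes backing storage only for regions that have actually been written. A toy model of that behaviour, with a hypothetical `ThinBlockDevice` class and an unrealistically small object size chosen for readability, might look like this:

```python
class ThinBlockDevice:
    """Toy block device: fixed virtual size, objects allocated on first write."""

    def __init__(self, size: int, object_size: int = 4):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray, created lazily

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside device")
        for i, byte in enumerate(data):
            index, within = divmod(offset + i, self.object_size)
            if index not in self.objects:
                # Allocate backing storage only when a region is first written.
                self.objects[index] = bytearray(self.object_size)
            self.objects[index][within] = byte

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside device")
        out = bytearray()
        for pos in range(offset, offset + length):
            index, within = divmod(pos, self.object_size)
            obj = self.objects.get(index)
            # Regions that were never written read back as zeros.
            out.append(obj[within] if obj is not None else 0)
        return bytes(out)


dev = ThinBlockDevice(size=1024)
dev.write(10, b"ceph")
print(dev.read(10, 4), len(dev.objects))
```

Here the device reports 1024 bytes of capacity, yet after one four-byte write only the two backing objects that the write touched exist.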

Ceph RBD interfaces with the same Ceph object storage system that provides the librados interface and the CephFS file system, and it stores block device images as objects. Since RBD is built on librados, RBD inherits librados's abilities, including read-only snapshots and reverting to a snapshot. By striping images across the cluster, Ceph improves read access performance for large block device images.

The block device can be virtualized, providing block storage to virtual machines in virtualization platforms such as Apache CloudStack, OpenStack, OpenNebula, Ganeti, and Proxmox Virtual Environment.

File system

Ceph's file system (CephFS) runs on top of the same object storage system that provides object storage and block device interfaces. The Ceph metadata server cluster provides a service that maps the directories and file names of the file system to objects stored within RADOS clusters. The metadata server cluster can expand or contract, and it can rebalance the file system dynamically to distribute data evenly among cluster hosts. This ensures high performance and prevents heavy loads on specific hosts within the cluster.
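The mapping from directories and file names to stored objects can be sketched with a toy metadata service. Everything here is an assumption made for illustration: the `ToyMetadataServer` class, the tiny object size, and the `<inode>.<index>` object naming scheme are not taken from Ceph's implementation.

```python
import itertools


class ToyMetadataServer:
    """Maps file paths to inode numbers and inodes to data object names."""

    def __init__(self, object_size: int = 4):
        self.object_size = object_size
        self.inodes = {}  # path -> inode number
        self._next_inode = itertools.count(1)

    def create(self, path: str) -> int:
        """Register a path and return its inode number (idempotent)."""
        if path not in self.inodes:
            self.inodes[path] = next(self._next_inode)
        return self.inodes[path]

    def objects_for(self, path: str, file_size: int):
        """Names of the objects that would hold a file of `file_size` bytes."""
        inode = self.inodes[path]
        count = -(-file_size // self.object_size)  # ceiling division
        # Illustrative naming: hexadecimal inode, then a zero-padded index.
        return [f"{inode:x}.{index:08x}" for index in range(count)]


mds = ToyMetadataServer()
mds.create("/home/a.txt")
print(mds.objects_for("/home/a.txt", file_size=10))
```

The point of the split is that the metadata service only answers "which objects belong to this path"; the file data itself lives in the object store and can be read from the storage nodes directly.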

Clients mount the POSIX-compatible file system using a Linux kernel client. On March 19, 2010, Linus Torvalds merged the Ceph client into Linux kernel version 2.6.34,[15] which was released on May 16, 2010. An older FUSE-based client is also available. The servers run as regular Unix daemons.


Ceph made its debut at the 2006 USENIX Symposium on Operating Systems Design and Implementation (OSDI 2006) in a paper by Weil, Brandt, Miller, Long and Maltzahn;[16] a more detailed description was published the following year in Sage Weil's doctoral dissertation.[17]

After his graduation in fall 2007, Weil continued to work on Ceph full-time, and the core development team expanded to include Yehuda Sadeh Weinraub and Gregory Farnum. In 2012, Weil created Inktank Storage for professional services and support for Ceph.[18][19]

In April 2014, Red Hat purchased Inktank, bringing the majority of Ceph development in-house.[20]

In October 2015, the Ceph Community Advisory Board was formed to assist the community in driving the direction of open source software-defined storage technology. The charter advisory board includes Ceph community members from global IT organizations that are committed to the Ceph project, including individuals from Canonical, CERN, Cisco, Fujitsu, Intel, Red Hat, SanDisk, and SUSE.[21]

  • Argonaut (v0.48) – on July 3, 2012, the Ceph development team released Argonaut, the first major "stable" release of Ceph. This release was to receive stability fixes and performance updates only, with new features scheduled for future releases.[22]
  • Bobtail (v0.56) – on January 1, 2013, the Ceph development team released Bobtail, the second major stable release of Ceph. This release focused primarily on stability, performance, and upgradability from the previous Argonaut stable series (v0.48.x).[23]
  • Cuttlefish (v0.61) – on May 7, 2013, the Ceph development team released Cuttlefish, the third major stable release of Ceph. This release included a number of feature and performance enhancements and was the first stable release to feature the 'ceph-deploy' deployment tool in place of the previous 'mkcephfs' method of deployment.[24]
  • Dumpling (v0.67) – on August 14, 2013, the Ceph development team released Dumpling, the fourth major stable release of Ceph. This release included a first pass at global namespace and region support, a REST API for monitoring and management functions, and improved support for Red Hat Enterprise Linux (RHEL)-based platforms.[25]
  • Emperor (v0.72) – on November 9, 2013, the Ceph development team released Emperor, the fifth major stable release of Ceph. This release brought several new features, including multi-datacenter replication for the radosgw and improved usability, and landed a lot of incremental performance and internal refactoring work to support upcoming features in Firefly.[26]
  • Firefly (v0.80) – on May 7, 2014, the Ceph development team released Firefly, the sixth major stable release of Ceph. This release brought several new features, including erasure coding, cache tiering, primary affinity, a key/value OSD backend (experimental), and a standalone radosgw (experimental).[27]
  • Giant (v0.87) – on October 29, 2014, the Ceph development team released Giant, the seventh major stable release of Ceph.[28]
  • Hammer (v0.94) – on April 7, 2015, the Ceph development team released Hammer, the eighth major stable release of Ceph. It was expected to form the basis of the next long-term stable series and was intended to supersede v0.80.x Firefly.[29]
  • Infernalis (v9.2.0) – on November 6, 2015, the Ceph development team released Infernalis, the ninth major stable release of Ceph. It was to be the foundation for the next stable series. It contained some major changes since v0.94.x Hammer, and the upgrade process was non-trivial.[30]
  • Jewel (v10.2.0) – on April 21, 2016, the Ceph development team released Jewel, the first Ceph release in which CephFS was considered stable. The CephFS repair and disaster recovery tools were feature-complete (bidirectional failover, active/active configurations), although some functionality was disabled by default. This release included a new experimental RADOS backend named BlueStore, which was planned to become the default storage backend in upcoming releases.[31]
  • Kraken (v11.2.0) – on January 20, 2017, the Ceph development team released Kraken. The new BlueStore storage format, introduced in Jewel, gained a stable on-disk format and became part of the test suite. Although still marked as experimental, BlueStore was near production-ready and was expected to be marked as such in the next release, Luminous.[32]
  • Luminous (v12.2.0) – on August 29, 2017, the Ceph development team released Luminous.[13] Among other features, the BlueStore storage format (which uses the raw disk instead of a filesystem) was declared stable and recommended for use.
  • Mimic (v13.2.0) – on June 1, 2018, the Ceph development team released Mimic.[33] With the release of Mimic, snapshots became stable when combined with multiple MDS daemons, and the RESTful gateway frontend Beast was declared stable and ready for production use.
  • Nautilus (v14.2.0) – on March 19, 2019, the Ceph development team released Nautilus.[34]


The name "Ceph" is an abbreviation of "cephalopod", a class of molluscs that includes the octopus. The name (emphasized by the logo) suggests the highly parallel behavior of an octopus and was chosen to connect the file system with UCSC's mascot, a banana slug called "Sammy".[8] Both cephalopods and banana slugs are molluscs.


  1. ^ "Ceph Community Forms Advisory Board". 2015-10-28. Retrieved 2016-01-20.
  2. ^ "v14.2.0 Nautilus released".
  3. ^ "v13.1.0 Mimic RC1 released".
  4. ^ "GitHub Repository".
  5. ^ "FreeBSD Quarterly Status Report".
  6. ^ "LGPL2.1 license file in the Ceph sources". 2014-10-24. Retrieved 2014-10-24.
  7. ^ Jeremy Andrews (2007-11-15). "Ceph Distributed Network File System". KernelTrap. Archived from the original on 2007-11-17. Retrieved 2007-11-15.
  8. ^ a b c M. Tim Jones (2010-06-04). "Ceph: A Linux petabyte-scale distributed file system" (PDF). IBM. Retrieved 2014-12-03.
  9. ^ "BlueStore". Ceph. Retrieved 2017-09-29.
  10. ^ "Hard Disk and File System Recommendations". Retrieved 2017-03-17.
  11. ^ "Ceph Manager Daemon — Ceph Documentation". docs.ceph.com. Retrieved 2019-01-31.
  12. ^ Jake Edge (2007-11-14). "The Ceph filesystem". LWN.net.
  13. ^ a b Sage Weil (2017-08-29). "v12.2.0 Luminous Released". Ceph Blog.
  14. ^ "Hard Disk and File System Recommendations". ceph.com. Retrieved 2017-06-26.
  15. ^ Sage Weil (2010-02-19). "Client merged for 2.6.34". ceph.newdream.net.
  16. ^ S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, C. Maltzahn. "Ceph: A Scalable, High-Performance Distributed File System". Proc. OSDI 2006.
  17. ^ Sage Weil (2007-12-01). "Ceph: Reliable, Scalable, and High-Performance Distributed Storage" (PDF). University of California, Santa Cruz.
  18. ^ Bryan Bogensberger (2012-05-03). "And It All Comes Together". Inktank Blog. Archived from the original on 2012-07-19. Retrieved 2012-07-10.
  19. ^ Joseph F. Kovar (July 10, 2012). "The 10 Coolest Storage Startups Of 2012 (So Far)". CRN. Retrieved July 19, 2013.
  20. ^ Red Hat Inc (2014-04-30). "Red Hat to Acquire Inktank, Provider of Ceph". Red Hat. Retrieved 2014-08-19.
  21. ^ "Ceph Community Forms Advisory Board". 2015-10-28. Retrieved 2016-01-20.
  22. ^ Sage Weil (2012-07-03). "v0.48 "Argonaut" Released". Ceph Blog.
  23. ^ Sage Weil (2013-01-01). "v0.56 Released". Ceph Blog.
  24. ^ Sage Weil (2013-05-17). "v0.61 "Cuttlefish" Released". Ceph Blog.
  25. ^ Sage Weil (2013-08-14). "v0.67 Dumpling Released". Ceph Blog.
  26. ^ Sage Weil (2013-11-09). "v0.72 Emperor Released". Ceph Blog.
  27. ^ Sage Weil (2014-05-07). "v0.80 Firefly Released". Ceph Blog.
  28. ^ Sage Weil (2014-10-29). "v0.87 Giant Released". Ceph Blog.
  29. ^ Sage Weil (2015-04-07). "v0.94 Hammer Released". Ceph Blog.
  30. ^ Sage Weil (2015-11-06). "v9.2.0 Infernalis Released". Ceph Blog.
  31. ^ Sage Weil (2016-04-21). "v10.2.0 Jewel Released". Ceph Blog.
  32. ^ Abhishek L (2017-01-20). "v11.2.0 Kraken Released". Ceph Blog.
  33. ^ Abhishek L (2018-06-01). "v13.2.0 Mimic Released". Ceph Blog.
  34. ^ "v14.2.0 Nautilus released".
