dbm (computing)
dbm was the first of a family of simple database engines, originally written by Ken Thompson and released by AT&T in 1979 [1]. The name is a three-letter acronym for database manager.
dbm stores arbitrary data under a single key (a primary key) in fixed-size buckets and uses hashing to retrieve the data quickly by key.
The hashing scheme is a form of extensible hashing: the hash table expands as buckets are added. A new database starts with a single bucket, which is split in two when it becomes full. Each of the two resulting child buckets is split in turn when it fills, so the database grows as keys are added.
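The bucket-splitting behaviour described above can be illustrated with a small in-memory sketch. This is a textbook form of extendible hashing, not dbm's actual on-disk format; the class names, the choice of BLAKE2 as the hash function, and the bucket capacity of 4 are all assumptions made for the example.

```python
import hashlib


class Bucket:
    """A fixed-capacity bucket; local_depth is the number of hash bits it owns."""

    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.items = {}


class ExtendibleHash:
    """Illustrative only: hypothetical names, in memory, no persistence."""

    HASH_BITS = 32  # width of the hash value computed in _hash()

    def __init__(self, bucket_capacity=4):
        self.bucket_capacity = bucket_capacity
        self.global_depth = 0         # the directory has 2**global_depth slots
        self.directory = [Bucket(0)]  # start with a single bucket

    @staticmethod
    def _hash(key):
        # Assumed hash function: a 32-bit BLAKE2b digest of the key bytes.
        digest = hashlib.blake2b(key, digest_size=4).digest()
        return int.from_bytes(digest, "big")

    def _index(self, key):
        # The low global_depth bits of the hash select the directory slot.
        return self._hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.bucket_capacity:
                bucket.items[key] = value
                return
            if bucket.local_depth >= self.HASH_BITS:
                raise RuntimeError("too many keys share one hash value")
            self._split(bucket)  # bucket is full: split it and retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; slot i + 2**g mirrors slot i.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        new_bit = 1 << (bucket.local_depth - 1)
        # Slots that pointed at the old bucket and have the new bit set
        # are redirected to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old bucket's records between the two children.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v

    def bucket_count(self):
        return len({id(b) for b in self.directory})
```

Only the bucket that overflows is split, so most of the table is untouched when a key is added; the directory doubles only when the overflowing bucket already uses every bit the directory distinguishes.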
Although dbm and its derivatives are pre-relational databases (effectively a hash table stored on disk), in practice they can be a more practical choice for high-speed storage that is looked up by key, because they avoid the overhead of connecting and preparing queries. The trade-off is that a database can generally be opened for writing by only one process at a time. An agent daemon that takes requests from multiple processes can work around this, although doing so adds back some, but not all, of the overhead. In short, the technology is old but fast.
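As an illustration of lookup by key without any connection or query step, the following sketch uses dbm.dumb from Python's standard library. dbm.dumb is a portable pure-Python implementation with a dbm-style interface, not the original AT&T dbm, and the file name and records are made up for the example.

```python
import dbm.dumb
import os
import tempfile


def demo(path):
    # Flag "c" opens the database for reading and writing,
    # creating it if it does not exist yet.
    with dbm.dumb.open(path, "c") as db:
        db[b"alice"] = b"alice@example.org"
        db[b"bob"] = b"bob@example.org"

    # Reopen read-only and fetch records directly by key.
    with dbm.dumb.open(path, "r") as db:
        found = db[b"alice"]        # existing key
        missing = db.get(b"carol")  # absent key yields None
        keys = sorted(db.keys())
    return found, missing, keys


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(os.path.join(tmp, "users")))
```

Keys and values are stored as bytes, and the only operations are store, fetch, delete and iteration over keys; there is no query language to parse or prepare.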
Successors
dbm has had many successors:
- Ndbm: In 1986 Berkeley produced ndbm (standing for New Database Manager). This added support for having multiple databases open concurrently.
- Sdbm: Some versions of Unix excluded ndbm because of licensing issues, so in 1987 Ozan Yigit produced this public-domain clone [2].
- Gdbm: Standing for 'GNU Database Manager', this open source version was written by Philip A. Nelson for the GNU project. It added support for arbitrary-length data; previously all data had a fixed maximum length.
- Tdbm: Provided support for atomic transactions.
- TDB: Released by the Samba team under the GPL (the source files in the Samba package state the LGPL for TDB). From the SourceForge page: TDB is a Trivial Database. In concept, it is very much like GDBM, and BSD's DB except that it allows multiple simultaneous writers and uses locking internally to keep writers from trampling on each other. TDB is also extremely small.
- QDBM: 'Quick Database Manager'. It claims to be faster than earlier dbm implementations and was released under the LGPL by Mikio Hirabayashi in 2000.
- Berkeley DB: A version available under a dual license, both copyleft and commercial. It has been supported and maintained by Oracle since February 2006.
- JDBM: A transactional persistence engine for Java. It aims to be for Java what GDBM is for other languages (C/C++, Python, Perl, etc.).
- Tokyo Cabinet: A modern reimplementation of QDBM, also by Mikio Hirabayashi.
- VSDB: A dbm-like database written by John Meacham that supports full ACID semantics and places data safety above all else. It provides transactions and rollbacks without any locking, relying instead on atomic filesystem operations.
References
- Seltzer & Yigit. "A New Hashing Package for Unix".
- Brachman & Neufeld. "TDBM: A DBM Library With Atomic Transactions" (PDF).
- Olsen, Bostic & Seltzer. "Berkeley DB" (PDF).