|Original author(s)||Margo Seltzer and Keith Bostic of Sleepycat Software|
|Developer(s)||Sleepycat Software, later Oracle Corporation|
|Stable release||18.1.40 / May 29, 2020|
|Operating system||Windows, Unix-like|
|Size||~1244 kB compiled on Windows x86|
|Type||Embedded database, NoSQL Database|
|License||Dual-licensed: GNU Affero General Public License and commercial (versions 6.x and upwards); Sleepycat license (versions 2.0-5.x); 4-clause BSD license (versions 1.x)|
Berkeley DB (BDB) is a software library intended to provide a high-performance embedded database for key/value data. Berkeley DB is written in C with API bindings for C++, C#, Java, Perl, PHP, Python, Ruby, Smalltalk, Tcl, and many other programming languages. BDB stores arbitrary key/data pairs as byte arrays, and supports multiple data items for a single key. Berkeley DB is not a relational database, although it has advanced database features including database transactions, multiversion concurrency control and write-ahead logging.
BDB can support thousands of simultaneous threads of control or concurrent processes manipulating databases as large as 256 terabytes, on a wide variety of operating systems including most Unix-like and Windows systems, and real-time operating systems.
BDB was commercially supported and developed by Sleepycat Software from 1996 to 2006. Oracle Corporation acquired Sleepycat Software in February 2006 and continues to develop and sell the C Berkeley DB library. In 2013 Oracle re-licensed BDB under the AGPL. As of 2020, Bloomberg LP continues to develop a fork of BDB within its Comdb2 database, under the original Sleepycat permissive license.
Berkeley DB originated at the University of California, Berkeley as part of BSD, Berkeley's version of the Unix operating system. After 4.3BSD (1986), the BSD developers attempted to remove or replace all code originating in the original AT&T Unix from which BSD was derived. In doing so, they needed to rewrite the Unix database package. Seltzer and Yigit created a new database, unencumbered by any AT&T patents: an on-disk hash table that outperformed the existing dbm libraries. Berkeley DB itself was first released in 1991 and later included with 4.4BSD. In 1996 Netscape requested that the authors of Berkeley DB improve and extend the library, then at version 1.86, to suit Netscape's requirements for an LDAP server and for use in the Netscape browser. That request led to the creation of Sleepycat Software. Oracle Corporation acquired the company in February 2006 and continues to develop and sell Berkeley DB.
Since its initial release, Berkeley DB has gone through various versions. Each major release cycle has introduced a single major new feature, generally layered on top of the earlier features. The 1.x releases focused on managing key/value data storage and are referred to as "Data Store" (DS). The 2.x releases added a locking system enabling concurrent access to data, known as "Concurrent Data Store" (CDS). The 3.x releases added a logging system for transactions and recovery, called "Transactional Data Store" (TDS). The 4.x releases added the ability to replicate log records and create a distributed, highly available, single-master, multi-replica database; this is called the "High Availability" (HA) feature set. Berkeley DB's evolution has sometimes led to minor API changes or log format changes, but database formats have very rarely changed. Berkeley DB HA supports online upgrades from one version to the next by maintaining the ability to read and apply the prior release's log records.
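The idea behind the logging and replication layers can be illustrated with a minimal sketch. This is a toy model written for this article, not Berkeley DB's implementation or API: the class `ToyStore`, the function `replay`, and the JSON log record format are all hypothetical. It shows a store that appends a log record before applying each change, and a second copy of the state rebuilt purely by applying those log records in order, which is the general principle behind both recovery and log-based replication.

```python
import json


class ToyStore:
    """Toy key/value store with a write-ahead log (illustrative only)."""

    def __init__(self):
        self.data = {}  # current in-memory state
        self.log = []   # append-only list of serialized log records

    def put(self, key, value):
        # Write-ahead rule: record the change in the log before applying it.
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        self.data[key] = value

    def delete(self, key):
        self.log.append(json.dumps({"op": "del", "key": key}))
        self.data.pop(key, None)


def replay(log_records):
    """Rebuild state by applying log records in order (recovery or a replica)."""
    state = {}
    for raw in log_records:
        rec = json.loads(raw)
        if rec["op"] == "put":
            state[rec["key"]] = rec["value"]
        else:
            state.pop(rec["key"], None)
    return state


primary = ToyStore()
primary.put("a", "1")
primary.put("b", "2")
primary.delete("a")

# A replica (or a recovering process) only needs the log to reach the same state.
replica_state = replay(primary.log)
```

In this sketch the replica never reads the primary's data directly; it depends only on the log format, which is also why being able to read a prior release's log records is enough to support an online upgrade.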
The FreeBSD and OpenBSD operating systems continue to use Berkeley DB 1.8x for compatibility reasons; Linux-based operating systems commonly include several versions to accommodate applications still using older interfaces or files.
Starting with the 6.0.21 (Oracle 12c) release, all Berkeley DB products are licensed under the GNU AGPL. Previously, Berkeley DB was redistributed under the 4-clause BSD license (before version 2.0) and then under the Sleepycat Public License, which is an OSI-approved open-source license as well as an FSF-approved free software license. The product ships with complete source code, build script, test suite, and documentation. Its comprehensive feature set, along with the licensing terms, has led to its use in a multitude of free and open-source software. Those who do not wish to abide by the terms of the GNU AGPL, or who use an older version under the Sleepycat Public License, have the option of purchasing a proprietary license for redistribution from Oracle Corporation. This technique is called dual licensing.
Berkeley DB has an architecture notably simpler than that of other database systems such as relational database management systems. For example, like SQLite, it is not based on a server/client model and does not provide support for network access; programs access the database using in-process API calls. Oracle added support for SQL in the 11g R2 release, based on the popular SQLite API, by including a version of SQLite in Berkeley DB (it uses Berkeley DB for storage). There is third-party support for PL/SQL in Berkeley DB via a commercial product named Metatranz StepSqlite.
A program accessing the database is free to decide how the data is to be stored in a record. Berkeley DB puts no constraints on the record's data. The record and its key can both be up to four gigabytes long.
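Because the library treats keys and records as opaque bytes, the application chooses its own record layout. The following minimal sketch shows one such choice. It does not use the Berkeley DB API: a plain Python `dict` stands in for the database, and the record layout, `HEADER`, `encode_record`, and `decode_record` are hypothetical names invented for illustration.

```python
import struct

# Hypothetical application record: (user_id, balance, name).
# "<Id" = little-endian unsigned 32-bit int followed by a 64-bit float,
# with no padding; the name follows as UTF-8 bytes.
HEADER = struct.Struct("<Id")


def encode_record(user_id: int, balance: float, name: str) -> bytes:
    """Serialize an application record into an opaque byte string."""
    return HEADER.pack(user_id, balance) + name.encode("utf-8")


def decode_record(blob: bytes) -> tuple[int, float, str]:
    """Reverse of encode_record: the store itself never interprets the bytes."""
    user_id, balance = HEADER.unpack_from(blob)
    return user_id, balance, blob[HEADER.size:].decode("utf-8")


# Stand-in for a key/value store: opaque bytes in, opaque bytes out.
store: dict[bytes, bytes] = {}

# Big-endian key encoding, so a byte-wise comparison of keys would match
# numeric order in a store that sorts keys.
key = struct.pack(">I", 42)
store[key] = encode_record(42, 12.5, "Ada")

record = decode_record(store[key])
```

The store only ever sees `bytes`; any schema, versioning, or validation is the responsibility of the program that writes and reads the records.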
Oracle Corporation's use of the name "Berkeley DB"
The name "Berkeley DB" is used by Oracle Corporation for three different products, two of which are not BDB:
- Berkeley DB, the C database library that is the subject of this article
- Berkeley DB Java Edition, a pure Java library whose design is modelled after the C library but is otherwise unrelated
- Berkeley DB XML, a C++ program that supports XQuery and includes a legacy version of the C database library
Programs that use Berkeley DB
Berkeley DB provides the underlying storage and retrieval system of several LDAP servers, database systems, and many other proprietary and free/open source applications. Notable software that uses Berkeley DB for data storage includes:
- Bitcoin Core – The first implementation of the Bitcoin cryptocurrency, which still uses Berkeley DB 4.8 (released in 2009) for one feature
- Bogofilter – A free/open source spam filter that saves its wordlists using Berkeley DB by default
- Citadel – A free/open source groupware platform that keeps all of its data stores, including the message base, in Berkeley DB. Citadel is licensed under the GPLv3, which is compatible with Oracle BDB licensing
- Sendmail – A popular MTA for Linux/Unix systems
- SpamAssassin – An anti-spam application
Berkeley DB V2.0 and higher is available under a dual license:
- Oracle commercial license with professional support
- Open source license
The switch to the AGPL has caused major Linux distributions such as Debian to completely phase out their use of Berkeley DB, with a preference for Lightning Memory-Mapped Database (LMDB). The rationale is that shipping AGPL code to commercial users would be unacceptable, because a simple software upgrade would then oblige them to provide their source code to their own users.