Universally unique identifier

From Wikipedia, the free encyclopedia

A universally unique identifier (UUID) is an identifier standard used in software construction. A UUID is simply a 128-bit value. The meaning of each bit is defined by any of several variants.

For human-readable display, many systems use a canonical format of hexadecimal text with inserted hyphen characters. For example:

de305d54-75b4-431b-adb2-eb6b9e546013

The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. In this context the word unique should be taken to mean "practically unique" rather than "guaranteed unique". Since the identifiers have a finite size, it is possible for two differing items to share the same identifier. The identifier size and generation process need to be selected so as to make this sufficiently improbable in practice. Anyone can create a UUID and use it to identify something with reasonable confidence that the same identifier will never be unintentionally created by anyone to identify something else. Information labeled with UUIDs can therefore be later combined into a single database without needing to resolve identifier (ID) conflicts.

Adoption of UUIDs is widespread, with many computing platforms providing support for generating UUIDs and for parsing and generating their textual representation.

Definition

A UUID is a 16-octet (128-bit) number.

In its canonical form, a UUID is represented by 32 lowercase hexadecimal digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12 for a total of 36 characters (32 alphanumeric characters and four hyphens). For example:

123e4567-e89b-12d3-a456-426655440000

The first three groups are interpreted as complete hexadecimal numbers, while the final two are treated as a plain sequence of bytes. The byte order is "most significant byte first (known as network byte order)"[1](sec. 4.1.2) (note that the byte order of GUIDs is different). This form is defined in the RFC[1](sec. 3) and simply reflects the UUID's division into fields,[1](sec. 4.1.2) which apparently originates from the structure of the initial time- and MAC-based version.
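The field layout and byte order can be checked with a short sketch. It assumes Python's standard uuid module and reuses the example value above; bytes_le is that module's name for a layout in which the first three fields are stored little-endian, which is offered here only as an illustration of how a differing byte order changes the stored octets.

```python
import uuid

# Example value from the text above; any canonical UUID string would do.
u = uuid.UUID("123e4567-e89b-12d3-a456-426655440000")

canonical = str(u)  # 36 characters in groups of 8-4-4-4-12
group_lengths = [len(g) for g in canonical.split("-")]

# The first three groups are read as whole numbers (fields).
time_low, time_mid, time_hi_version = u.fields[:3]

# The 16 octets, most significant byte first (network byte order).
network_order = u.bytes.hex()

# The same UUID with the first three fields byte-swapped (little-endian).
little_endian_fields = u.bytes_le.hex()

print(group_lengths)
print(hex(time_low), hex(time_mid), hex(time_hi_version))
print(network_order)
print(little_endian_fields)
```

Only the first three fields differ between the two octet strings; the last eight octets are a plain byte sequence in both.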

The number of possible UUIDs is 340,282,366,920,938,463,463,374,607,431,768,211,456 (16^32 or 2^128), or about 3.4 × 10^38.

Variants and versions

The variant indicates the layout of the UUID. The UUID specification covers one particular variant. Other variants are reserved or exist for backward compatibility reasons (e.g., for values assigned before the UUID specification was produced). An example of a UUID that is a different variant is the nil UUID, which is a UUID that has all 128 bits set to zero.

In the canonical representation, xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, the most significant bits of N indicate the variant (depending on the variant, one, two, or three bits are used). The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e., the hexadecimal N will always be 8, 9, A, or B).

The variant covered by the UUID specification has five versions. For this variant, the four bits of M indicate the UUID version (i.e., the hexadecimal M will be 1, 2, 3, 4, or 5).
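Reading M and N out of the canonical text can be sketched in a few lines of Python. The function name variant_and_version is hypothetical and introduced only for this illustration; it reports whether the two most significant bits of N are 1 0 and, if so, returns M as the version.

```python
def variant_and_version(text: str):
    """Return (is_spec_variant, version) for a canonical UUID string.

    Hypothetical helper for illustration: version is None when the
    variant is not the one covered by the UUID specification.
    """
    groups = text.split("-")
    m = int(groups[2][0], 16)  # first hex digit of the third group
    n = int(groups[3][0], 16)  # first hex digit of the fourth group
    is_spec_variant = (n >> 2) == 0b10  # top two bits of N are 1 0
    return is_spec_variant, (m if is_spec_variant else None)


print(variant_and_version("de305d54-75b4-431b-adb2-eb6b9e546013"))
print(variant_and_version("00000000-0000-0000-0000-000000000000"))
```

The second call uses the nil UUID, whose N is 0 and which therefore falls outside the variant covered by the specification.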

Version 1 (MAC address & date-time)

Conceptually, the original (version 1) generation scheme for UUIDs was to concatenate the UUID version with the MAC address of the computer that is generating the UUID, and with the number of 100-nanosecond intervals since the adoption of the Gregorian calendar in the West. By representing a single point in space (the computer) and time (the number of intervals), the chance of a collision in values is effectively nil.

This scheme has been criticized in that it is not sufficiently "opaque"; it reveals both the identity of the computer that generated the UUID and the time at which it did so. Its uniqueness across computers is guaranteed as long as MAC addresses are not duplicated (which can happen, for instance, due to manual setting or "spoofing" of the MAC address); however, given the speed of modern processors, successive invocations on the same machine of a naive implementation of a generator of version 1 UUIDs may produce the same UUID, violating the uniqueness property. (Non-naive implementations can avoid this problem by, for example, remembering the most recently generated UUID, "pocketing" unused UUIDs, and using pocketed UUIDs in case a duplicate is about to be generated.)
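The lack of opacity is easy to demonstrate. The sketch below assumes Python's standard uuid module, whose time and node attributes expose the 60-bit timestamp and the 48-bit node of a version 1 UUID. The helper name v1_timestamp is hypothetical, and the offset constant is the number of 100-nanosecond intervals between 15 October 1582 and the Unix epoch. The sample value is the well-known DNS namespace UUID, which happens to be a version 1 UUID.

```python
import datetime
import uuid

# 100-nanosecond intervals between 1582-10-15 and 1970-01-01 (Unix epoch).
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000


def v1_timestamp(u: uuid.UUID) -> datetime.datetime:
    """Recover the UTC creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    unix_100ns = u.time - GREGORIAN_TO_UNIX_100NS
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    # timedelta has microsecond resolution, so drop the last decimal digit.
    return epoch + datetime.timedelta(microseconds=unix_100ns // 10)


sample = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")
print(v1_timestamp(sample).isoformat())  # when the UUID was generated
print(format(sample.node, "012x"))       # node field (MAC address)
```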

Version 2 (DCE Security)

Version 2 UUIDs are similar to Version 1 UUIDs, with the first 4 bytes of the timestamp replaced by the user's POSIX UID or GID (with the "local domain" identifier indicating which it is) and the upper byte of the clock sequence replaced by the identifier for a "local domain" (typically either the "POSIX UID domain" or the "POSIX GID domain").[2][3]

Version 3 (MD5 hash & namespace)

Version 3 UUIDs use a scheme deriving a UUID via MD5 from a URL, a fully qualified domain name, an object identifier, a distinguished name (DN, as used in Lightweight Directory Access Protocol), or a name in an unspecified namespace. Version 3 UUIDs have the form xxxxxxxx-xxxx-3xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, A, or B.

To determine the version 3 UUID of a given name, the UUID of the namespace (e.g., 6ba7b810-9dad-11d1-80b4-00c04fd430c8 for a domain) is transformed to the string of 16 bytes corresponding to its hexadecimal digits, concatenated with the input name, and hashed with MD5, yielding 128 bits. Six bits are then replaced by fixed values: four of these bits indicate the version (0011 for version 3), and the other two indicate the variant (1 0). Finally, the fixed hash is transformed back into the hexadecimal form, with hyphens separating the parts relevant in other UUID versions.
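The steps above can be followed by hand in a minimal sketch, assuming Python's standard hashlib and uuid modules and a UTF-8 encoding of the name. The function name uuid3_by_hand and the name "example.org" are hypothetical choices for illustration; the result is compared against the module's own uuid3 function.

```python
import hashlib
import uuid


def uuid3_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 3 UUID step by step (illustrative sketch)."""
    # Namespace UUID as 16 raw bytes, followed by the name.
    data = namespace.bytes + name.encode("utf-8")
    digest = bytearray(hashlib.md5(data).digest())  # 128 bits
    # Replace six bits with fixed values.
    digest[6] = (digest[6] & 0x0F) | 0x30  # version: 0011
    digest[8] = (digest[8] & 0x3F) | 0x80  # variant: 1 0
    return uuid.UUID(bytes=bytes(digest))


by_hand = uuid3_by_hand(uuid.NAMESPACE_DNS, "example.org")
print(by_hand)
print(by_hand == uuid.uuid3(uuid.NAMESPACE_DNS, "example.org"))
```

Because the input is hashed rather than randomized, the same namespace and name always produce the same UUID.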

Version 4 (random)

Version 4 UUIDs use a scheme relying only on random numbers. This algorithm sets the version number (4 bits) as well as two reserved bits. All other bits (the remaining 122 bits) are set using a random or pseudorandom data source. Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, A, or B (e.g., f47ac10b-58cc-4372-a567-0e02b2c3d479).
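A minimal sketch of this scheme, assuming Python and using os.urandom as the random source (the function name uuid4_by_hand is hypothetical): 16 random bytes are drawn, then the four version bits and the two reserved bits are overwritten, leaving 122 random bits.

```python
import os
import uuid


def uuid4_by_hand() -> uuid.UUID:
    """Build a version 4 UUID from 16 random bytes (illustrative sketch)."""
    raw = bytearray(os.urandom(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # version number: 0100
    raw[8] = (raw[8] & 0x3F) | 0x80  # two reserved bits: 1 0
    return uuid.UUID(bytes=bytes(raw))


u = uuid4_by_hand()
text = str(u)
print(text)
print(text[14], text[19])  # the fixed "4" and the y digit
```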

Version 5 (SHA-1 hash & namespace)

Version 5 UUIDs use a scheme with SHA-1 hashing; otherwise it is the same idea as in version 3. RFC 4122 states that version 5 is preferred over version 3 name-based UUIDs, as MD5's security has been compromised. Note that the 160-bit SHA-1 hash is truncated to 128 bits to make the length work out. An erratum addresses the example in appendix B of RFC 4122.
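The truncation can be made explicit in a short sketch, assuming Python's standard hashlib and uuid modules and a UTF-8 encoded name. The function name uuid5_by_hand and the name "example.org" are hypothetical; only the first 16 of the 20 digest bytes are kept before the version and variant bits are set.

```python
import hashlib
import uuid


def uuid5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 5 UUID step by step (illustrative sketch)."""
    full = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(full[:16])       # 160-bit hash truncated to 128 bits
    raw[6] = (raw[6] & 0x0F) | 0x50  # version: 0101
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant: 1 0
    return uuid.UUID(bytes=bytes(raw))


by_hand = uuid5_by_hand(uuid.NAMESPACE_DNS, "example.org")
print(by_hand)
print(by_hand == uuid.uuid5(uuid.NAMESPACE_DNS, "example.org"))
```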

Implementations

4D 
The 4D programming language offers a Generate UUID command which generates a hex string in non-canonical form. While lacking native (128-bit values) support, the database can store UUIDs as text and display using UUID Format. Another way to generate UUIDs is the "hmFree_GenerateUUID" command in the plugin HMFree.
ActionScript 
CASA Lib provides a Version 4 UUID function as part of the StringUtil class.[4] Adobe Flex also provides a UUID implementation with the UIDUtil class.[5]
Apache Solr 
Solr contains an uuid data type.
C 
On Linux, libuuid is part of the util-linux package since version 2.15.1 (previously in the e2fsprogs package, but this implementation is being phased out as not even e2fsprogs uses its internal implementation any more when possible[6]). The OSSP project provides a UUID library.[7]
C++ 
Object Oriented ID provides a C++ concrete type, i.e. designed to behave much like a built-in type. QUuid is part of the C++ Qt framework. Boost.Uuid is a header-only implementation under a non-reciprocal Open Source license.
Caché ObjectScript 
UUID Version 4 implementation for Caché ObjectScript.
CakePHP 
CakePHP will automatically generate UUIDs for new records if the table's primary key data type is set to CHAR(36).[8]
Cassandra 
Cassandra uses version 1 UUIDs for a data type called 'timeuuid' for use in applications requiring conflict-free timestamps. A standard 'uuid' data type is also provided.[9]
Cocoa/Carbon (Mac OS X/iOS) 
The Core Foundation class CFUUIDRef is used to produce and store UUIDs, as well as to convert them to and from CFString/NSString representations. Since Mac OS X 10.8 and iOS 6.0, the NSUUID class is available.[10]
CFML 
The createUUID() function provides a UUID in all versions; however, the generated format has four segments instead of five: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxx (8-4-4-16).[11]
CodeGear RAD Studio (Delphi/C++ Builder) 
A new GUID can be generated by pressing Ctrl+Shift+G. For runtime functions see the "Free Pascal & Lazarus IDE" section.
Common Lisp 
Two libraries are available to create UUIDs according to RFC 4122. uuid creates v1, v3, v4 and v5 UUIDs. Unicly creates v3, v4, and v5 UUIDs.
CouchDB 
If not provided, CouchDB sets the document ID for each document to be a UUID.[12]
D 
The Tango standard library includes a module to create UUIDs (v3, v4, and v5) according to RFC 4122.[13]
Eiffel 
A library is available that generates UUIDs according to RFC 4122, variant 1 0, version 4. Source is available at Eiffel UUID library.
Erlang 
erlang-uuid[14] and uuid[15] implement UUID generation for versions 1, 3, 4, and 5 from RFC 4122. The v1 UUIDs generated are Erlang pid specific.
ExtJS 
ExtJS implements a UUID class as a generator for data identifiers.[16]
Firebird Server 
Firebird has gen_uuid() from version 2.1[17] and uuid_to_char() and char_to_uuid() from version 2.5[18] as built-in functions.
Free Pascal & Lazarus IDE 
In Free Pascal there is a class called TGUID that holds the structure of a UUID. The SysUtils.pas unit also provides methods to create, compare and convert UUIDs: CreateGUID(), GUIDToString() and IsEqualGUID().[19] In the Lazarus IDE you can also generate a UUID by pressing Ctrl+Shift+G.
Go 
The gouuid[20] package provides, in pure Go, immutable UUID structs and the functions NewV3, NewV4, NewV5 and Parse() for generating versions 3, 4 and 5 UUIDs as specified in RFC 4122.
Haskell 
The package uuid[21] directly implements most of RFC 4122. The package supports generation (v1, v3, v4 and v5) as well as serialization to and from string and binary formats. The package system-uuid[22] provides bindings to the native UUID generators on Windows, Linux, FreeBSD and Mac OS X.
Haxe 
Haxe functions are available that generate version 4 UUIDs as defined in the RFC 4122 specification.
iOS 
CFUUID and NSUUID can both be used to create UUIDs in Objective-C.
Java 
The J2SE 5.0 release of Java provides a class that will produce 128-bit UUIDs, although it only implements version 3 (via the nameUUIDFromBytes(byte[] name) method) and 4 (via UUID.randomUUID()) generation methods, not the original version 1 (due to lack of means to access MAC addresses using pure Java before version 6). The API documentation for the java.util.UUID class refers to ISO/IEC 11578:1996. Alternative open source libraries supporting MAC addresses on several common operating systems include UUID – generate UUIDs (or GUIDs) in Java and Java Uuid Generator (JUG).
JavaScript 
Broofa.com has implemented a JavaScript function which generates version 1 and version 4 UUIDs as defined in the RFC 4122 specification. It is available through the npm and ComponentJS package managers. Another open source library UUID.js, which is available under the MIT license, generates version 4 and version 1 UUIDs according to RFC 4122.
KohanaPHP 
The Kohana PHP Framework supports the generation of version 3, 4, and 5 UUIDs according to RFC 4122 specifications using the UUID module.[23]
Lasso 
Available implementations include a custom tag for Lasso 8+ by Douglas Burchard and an LJAPI module by Steffan A. Cline, also for Lasso 8+. Lasso 9's implementation of Lasso_UniqueID also returns a UUID.
Linden Scripting Language 
The scripting language used by Second Life has a built-in key datatype that is used to represent UUID4 values. However, it is implemented as a string encoded in UTF-8 (LSO engine) or UTF-16 (Mono engine), and does not prevent general string data from being stored in place of a UUID.
LiveCode
Lua 
There is a Lua module by Luiz Henrique de Figueiredo.
Mac OS X 
Command line utility uuidgen is available.
Microsoft SQL Server 
Transact-SQL (2000 and 2005) provides a function called NEWID() to generate unique identifiers. SQL Server 2005 provides an additional function called NEWSEQUENTIALID() which generates a new GUID that is greater than any GUID previously created by the NEWSEQUENTIALID() function on a given computer.
MySQL 
MySQL provides a UUID() function.[24]
.NET Framework 
The .NET Framework provides the System.Guid structure to generate and manipulate 128-bit UUIDs.[25]
Node.js
Several Node.js packages generate UUIDs; node-uuid is a popular one.[26]
OCaml 
The uuidm library implements universally unique identifiers version 3, 5 (name based with MD5, SHA-1 hashing) and 4 (random based) according to RFC 4122.
Oracle Database 
The Oracle Database provides a function SYS_GUID() to generate unique identifiers.[27]
Perl
The Data::UUID and Data::GUID modules from CPAN can be used to create UUIDs.[28] The UUID::Tiny module is a lightweight, low-dependency pure-Perl module for UUID creation and testing.[29] The OSSP project provides an OSSP::uuid module.[7]
PHP 
In PHP there are several modules for creating UUIDs.[30]
PostgreSQL 
PostgreSQL supports UUID as a native data type. Functions for generating UUID values can be added through the commonly available 'uuid-ossp' extension, which is based on the OSSP library.
Progress OpenEdge ABL 
The GENERATE-UUID function in OpenEdge 10 provides a UUID which can be made printable using the GUID() or BASE64-ENCODE() functions.[31]
Python 
The uuid module[32] (included in the standard library since Python 2.5) creates UUIDs according to RFC 4122.
Revolution/RunRev 
The libUUID library[33] generates UUIDs of type 1 (time-based), type 3 (name-based) and type 4 (random-based). Version 1.0, by Mark Smith, is released under OSL 3.0.
Ruby 
There are several RFC 4122 implementations for Ruby, the most updated ones being Ruby-UUID, UUID and UUIDTools. Ruby 1.9 includes a built-in version 4 UUID generator (SecureRandom.uuid).
SAP BusinessObjects Data Services 
The ETL tool SAP BusinessObjects Data Services contains a function to generate a UUID: gen_uuid().[34]
Tcl 
A Tcl implementation is provided in the TclLib package.[35]
Unix 
Command line utility uuidgen may be provided by default. There is also a tool called simply "uuid" available, which has the same functionality. The FreeBSD and Linux kernels have a built-in UUID v4 generator too. To use this on Linux, you have to read the file /proc/sys/kernel/random/uuid. On FreeBSD there is a simple system call uuidgen(2). FreeBSD also has /compat/linux/proc/sys/kernel/random/uuid as part of its Linux emulation.
Web 
Many web sites provide a UUID generator as a service. They typically construct one or more UUID values displayed as hexadecimal strings to be copied by the user and pasted to another application. For example, Online UUID Generator by TransparenTech LLC.
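
As one concrete illustration of the library support listed above, the following sketch exercises the uuid module from the Python standard library. The name "example.org" is an arbitrary choice for the name-based versions, and the last lines show that the parser also accepts upper-case text wrapped in braces.

```python
import uuid

time_based = uuid.uuid1()                                   # version 1
md5_named = uuid.uuid3(uuid.NAMESPACE_DNS, "example.org")   # version 3
random_id = uuid.uuid4()                                    # version 4
sha1_named = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")  # version 5

for u in (time_based, md5_named, random_id, sha1_named):
    print(u.version, u)

# Parsing normalizes the text to the lowercase canonical form.
parsed = uuid.UUID("{F47AC10B-58CC-4372-A567-0E02B2C3D479}")
print(parsed)
```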

Random UUID probability of duplicates

Out of a total of 128 bits, two bits indicate an RFC 4122 ("Leach-Salz") UUID and four bits the version (0100 indicating "randomly generated"), so randomly generated UUIDs have 122 random bits. The chance of two such UUIDs having the same value can be calculated using probability theory (birthday paradox). Using the approximation

p(n) \approx 1 - e^{-\frac{n^2}{2x}},

these are the probabilities of an accidental clash after calculating n UUIDs, with x = 2^122:

n                            probability
68,719,476,736 = 2^36        0.0000000000000004 (4 × 10^−16)
2,199,023,255,552 = 2^41     0.0000000000004 (4 × 10^−13)
70,368,744,177,664 = 2^46    0.0000000004 (4 × 10^−10)
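
The tabulated values can be reproduced from the approximation with a few lines of Python. This is a sketch; the function name collision_probability is hypothetical, and math.expm1 is used so that the very small results are not lost to floating-point rounding.

```python
import math


def collision_probability(n: int, random_bits: int = 122) -> float:
    """Approximate p(n) = 1 - exp(-n^2 / (2x)) with x = 2^random_bits."""
    x = 2 ** random_bits
    # -expm1(-t) equals 1 - exp(-t) but stays accurate for tiny t.
    return -math.expm1(-(n * n) / (2 * x))


for exponent in (36, 41, 46):
    p = collision_probability(2 ** exponent)
    print(f"n = 2^{exponent}: p is about {p:.1e}")
```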

To put these numbers into perspective, the annual risk of a given person being hit by a meteorite is estimated to be one chance in 17 billion,[36] which means the probability is about 0.00000000006 (6 × 10^−11), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for roughly 86 years would the probability of creating at least one duplicate reach about 50%.

However, these probabilities only hold when the UUIDs are generated using sufficient entropy. Otherwise, the probability of duplicates could be significantly higher, since the statistical dispersion might be lower. Where unique identifiers are required for distributed applications, so that UUIDs do not clash even when data from many devices is merged, the randomness of the seeds and generators used on every device must be reliable for the life of the application. Where this is not feasible, RFC 4122 recommends using a name-based (namespace) version instead.

Standards

UUIDs are standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE).

UUIDs are documented as part of ISO/IEC 11578:1996 "Information technology – Open Systems Interconnection – Remote Procedure Call (RPC)" and more recently in ITU-T Rec. X.667 | ISO/IEC 9834-8:2005.

The IETF has published the Standards-Track RFC 4122, which is technically equivalent to ITU-T Rec. X.667 | ISO/IEC 9834-8.

History

UUIDs were originally used in the Apollo Network Computing System and later in the Open Software Foundation's (OSF) Distributed Computing Environment (DCE). The initial design of DCE UUIDs was based on UUIDs as defined in the Apollo Computer Network Computing System,[37] whose design was in turn inspired by the (64-bit) unique identifiers defined and used pervasively in Domain/OS, an operating system also designed by Apollo Computer.

Later, the Microsoft Windows platforms adopted that design as globally unique identifiers (GUIDs).

Other significant uses include ext2/ext3/ext4 filesystem userspace tools (e2fsprogs uses libuuid provided by util-linux), LUKS encrypted partitions, GNOME, KDE, and Mac OS X,[38] most of which either use the libuuid library now provided by the util-linux package or implementations derived from it or from the original implementation by Theodore Ts'o in the e2fsprogs[39] package (the latter has been moved to the util-linux[40] package in version 2.15.1[41] for consistency).

References

  1. ^ a b c P. Leach et al. (July 2005). "RFC 4122 - A Universally Unique IDentifier (UUID) URN Namespace". Internet Engineering Task Force. 
  2. ^ The Open Group (1997). "DCE 1.1: Remote Procedure Call". 
  3. ^ The Open Group (1997). "DCE 1.1: Authentication and Security Services". 
  4. ^ Aaron Clinger and the CASA Lib Team. "CASA Lib's StringUtil Documentation". 
  5. ^ Adobe Systems Incorporated. "mx.utils.UIDUtil". 
  6. ^ change that prevents the internal implementation from being used when an external implementation is available
  7. ^ a b Open Source Software Project. "Universally Unique Identifier (UUID)". 
  8. ^ "Cake version 1.2 manual". 
  9. ^ "Cassandra version 1.2 documentation". 
  10. ^ Apple Computer, Inc. "CFUUID Reference". 
  11. ^ Adobe Systems Inc. "ColdFusion Functions:CreateUUID". 
  12. ^ "Couch DB Core API Documentation". 
  13. ^ "D/Tango UUID API document". 
  14. ^ Per Andersson. "erlang-uuid". 
  15. ^ Michael Truog. "uuid". 
  16. ^ Sencha Inc. "ExtJS 5 API - Ext.data.identifier.Uuid". 
  17. ^ "Firebird 2.1 Release Notes". 
  18. ^ "Firebird 2.5 Release Notes". 
  19. ^ Free Pascal Documentation. "Reference for 'sysutils' unit". 
  20. ^ Krzysztof Kowalik. "gouuid". 
  21. ^ Antoine Latter. "uuid". 
  22. ^ Jason Dusek. "system-uuid". 
  23. ^ Gilk, Woody. "Kohana UUID module". 
  24. ^ MySQL AB. "MySQL 5.0 Reference Manual". 
  25. ^ "Guid Structure". MSDN Library. 
  26. ^ "NodeJs package node-uuid". 
  27. ^ "SYS_GUID". Oracle Database SQL Reference. Oracle Corporation. 
  28. ^ Signes, Ricardo (16 January 2009). "Data-GUID". CPAN. 
  29. ^ Augustin, Christian (31 January 2010). "UUID-Tiny". CPAN. 
  30. ^ Holzgraefe, Hartmut (1 April 2008). "uuid". PECL. 
  31. ^ http://www.psdn.com/library/servlet/KbServlet/download/1927-102-2537/dvref.pdf[dead link]
  32. ^ "Python Library Reference: uuid". 
  33. ^ "Revolution Stuff: libUUID". 
  34. ^ "SAP BusinessObjects Data Services XI 4.0 features". 
  35. ^ "Tcl Standard Library: uuid". 
  36. ^ Old Farmer's Almanac 1994, 220–222, Taking your Chances: An Explanation of Risk
  37. ^ Zahn, Lisa (1990). Network Computing Architecture. Prentice Hall. p. 10. ISBN 0-13-611674-4. 
  38. ^ gen_uuid.c in Apple's Libc-391, corresponding to Mac OS X 10.4
  39. ^ gen_uuid.c in e2fsprogs
  40. ^ gen_uuid.c in util-linux
  41. ^ according to util-linux's man 3 uuid manual page, section AVAILABILITY
