Talk:Universally unique identifier

This is the talk page for discussing improvements to the Universally unique identifier article.
This is not a forum for general discussion of the article's subject.

Systems

This article is within the scope of WikiProject Systems, which collaborates on articles related to systems and systems science.
This article has not yet received a rating on the project's importance scale.
This article is not associated with a particular field. Fields are listed on the template page.

Computing: Software

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's importance scale.
This article is supported by WikiProject Software.

Software: Computing

This article is within the scope of WikiProject Software, a collaborative effort to improve the coverage of software on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's importance scale.
This article is supported by WikiProject Computing.

Bring back the stat!

I am saddened that this article no longer includes this stat in the opening paragraph: "A UUID is a 16-byte (128-bit) number. The number of theoretically possible UUIDs is therefore 216*8 = 2128 = 25616 or about 3.4 × 1038. This means that 1 trillion UUIDs would have to be created every nanosecond for 10 billion years to exhaust the number of UUIDs." It made the article several times more readable, and gave the context straight away, even if it is not genuinely of mathematical use... — Preceding unsigned comment added by 87.127.211.206 (talk) 13:37, 25 August 2011 (UTC)[reply]

Doesn't a 128-bit number have 2^128 possible values (where ^ means "raised to the power")? I don't understand the phrase "216*8 = 2128 = 25616 or about 3.4 × 1038". 2128 is not equal to 25616, and 216*8 is not 2128.

I have read, elsewhere, that there are enough UUIDs to assign one to every atom in the known universe. On the other hand, http://en.wikipedia.org/wiki/Observable_universe says that the known universe has about 10^80 atoms. I think that 10^80 is about 2^265, which is larger than 2^128. I'm confused. 75.146.141.142 (talk) 22:19, 27 April 2012 (UTC)[reply]
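A note on the arithmetic in this thread: the quoted figures look as if the superscripts were lost in copying, so "216*8 = 2128 = 25616" was presumably 2^(16*8) = 2^128 = 256^16, and "3.4 × 1038" was 3.4 × 10^38. The sketch below just checks those numbers, the "trillion per nanosecond" claim, and the comparison with 10^80 atoms. The year length (365.25 days) is my own assumption; everything else comes from the comments above.

```python
import math

BITS = 128
total = 2 ** BITS  # number of distinct 128-bit values

# With the exponents restored, the three forms are the same number.
assert 2 ** (16 * 8) == total == 256 ** 16

print(f"{total:,}")      # full decimal expansion with digit grouping
print(f"{total:.2e}")    # scientific notation, about 3.4e+38

# "1 trillion UUIDs every nanosecond for 10 billion years"
per_second = 10**12 * 10**9                # a trillion per nanosecond
seconds = 10**10 * 365.25 * 24 * 3600      # assumes 365.25-day years
generated = per_second * seconds
print(f"{generated:.2e}")                  # same order of magnitude as 2^128

# 10^80 atoms expressed as a power of two
atoms_log2 = 80 * math.log2(10)
print(round(atoms_log2, 1))
```

So the claim about exhausting the space roughly holds, while 10^80 (a bit under 2^266) is far larger than 2^128, which means there are not enough UUIDs for one per atom.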

We need more material in this article

As a Linux user, I understand the importance and awesomeness of UUIDs. However, I'd like to know more. For instance, when did the idea of UUID come into creation? When was it created as a set standard? What I'm trying to say is that this article needs more dates, years, names of people related to those things. --Cyberman (talk) 02:31, 14 June 2008 (UTC)[reply]

ETC

What is Leach-Salz? --Abdull 15:16, 4 April 2007 (UTC)[reply]

The UUID variant described in RFC 4122, "A Universally Unique IDentifier (UUID) URN Namespace", by Paul Leach, Michael Mealling, and Rich Salz. Guy Harris 17:44, 4 April 2007 (UTC)[reply]

Perfect answer, thank you very much! I first thought it had something to do with salt... --Abdull 18:55, 4 May 2007 (UTC)[reply]

private-use UUIDs?

The article doesn't make clear if there is a "private-use" address space for UUIDs, i.e., some rules to create UUIDs not intended to be exported from a closed system, like we have IPv4 and IPv6 private networks, Unicode's Private Use Area, the .local domain name, local-use EAN-13 barcodes and so on... The most logical idea is to use a specific reserved version number, but the RFC 4122 doesn't seem to mention anything other than the 5 versions (clock, dce, md5, random and sha1). --Juliano (T) 01:39, 20 August 2007 (UTC)[reply]

AFAIK there are no such "private use" UUIDs. The inventors of these UUIDs claim that the collision risk is small enough that in practice there are no collisions, so there is no need for "private use" UUIDs. --81.27.124.161 (talk) 21:22, 9 April 2008 (UTC) (RokerHRO)[reply]

AFAIK?? Seems like that acronym is a good example of private use UUIDs. The acronym is unintelligible, its meaning known only to a select few ... as the link provided demonstrates.

K. Kellogg-Smith (talk) 13:09, 20 October 2011 (UTC)[reply]

Collisions

"This means that 1 trillion UUIDs would have to be created every nanosecond for 10 billion years to exhaust the number of UUIDs." this leads to misunderstanding. While the number space is that big, chance for a collision are much higher, indeed! I.e. see http://en.wikipedia.org/wiki/Birthday_paradox —Preceding unsigned comment added by 84.129.161.7 (talk) 03:16, August 30, 2007 (UTC)

Duplicate UUID's on hosts

Is it possible to have the same valid UUID on multiple hosts in a network? —Preceding unsigned comment added by 128.222.37.20 (talk) 05:15, 13 September 2007 (UTC)[reply]

It is possible, but, as explained in the article, very unlikely if the UUIDs are chosen randomly. --81.27.124.161 (talk) 21:24, 9 April 2008 (UTC) (RokerHRO)[reply]
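To put a rough number on "very unlikely", and on the birthday-paradox point raised under Collisions, here is a small sketch using the usual birthday approximation. It assumes random (version 4) UUIDs, which have 122 random bits because 6 of the 128 bits are fixed by the version and variant fields; the function names are made up for illustration.

```python
import math

RANDOM_BITS = 122  # version 4: 128 bits minus 6 fixed version/variant bits

def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one duplicate among n random IDs."""
    space = 2 ** bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-n * (n - 1) / (2 * space))

def ids_for_probability(p: float, bits: int = RANDOM_BITS) -> float:
    """Approximate number of IDs needed to reach collision probability p."""
    return math.sqrt(2 * (2 ** bits) * math.log(1 / (1 - p)))

print(collision_probability(10**12))   # a trillion IDs: still a tiny probability
print(f"{ids_for_probability(0.5):.2e}")  # IDs needed for a 50% chance
```

The second figure is on the order of 10^18, far fewer than the 2^122 values in the space, which is exactly the birthday effect; it is still a very large number of identifiers.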

"Well Known" UUID's

Perhaps a section should be added to list some well-known UUIDs. The article already uses the MS IUnknown UUID as an example of such a UUID; however, a more exhaustive list of "known" (i.e. documented) UUIDs that are for specific uses should be added. Myrdred (talk) 16:43, 9 February 2008 (UTC)[reply]

Am I the only one horrified by this idea?

What a terrible idea! So basically because we can't be bothered to create a central listing of things, we're just generating an enormous gillion-character string and "hoping" that since it's so big it's not going to collide?! This is used in ext3?! Wow I feel great knowing that the only reason my filesystem isn't catastrophically corrupted is because I'm lucky. This type of thing might have limited applications where it's impossible to safely merge IDs that could potentially collide (can't imagine why it would be impossible but maybe for some crazy deep space probe where energy can't be spared for merging processing and high bandwidth is critical), but for labelling articles on E's website?! This is terrible programming! :D\=< (talk) 02:11, 2 March 2008 (UTC)[reply]

It is not enormous, it is not gillion-character, and it is not a string. It is a 128-bit number. The purpose of a UUID is to be used when you simply don't have any means of using a central registry. Two autonomous systems, with no previously established communication between them and no access to the Internet, may create lots of objects identified by UUIDs; then they can be merged into one, and their objects will still be unique.

Or do you really want to force everyone making a filesystem on a newly-installed system to have an Internet connection, in order for mke2fs to connect to a central registry to create the UUID of your partition? And even in the remote possibility of two objects in the universe having the same UUID, it is easier to believe in the existence of the Invisible Pink Unicorn than in their happening to coexist in the same context.

Your filesystem won't be catastrophically corrupted just because it has the same UUID as a news article on some website. It won't be corrupted even if there were another object on the same computer with the same UUID. You have to be extremely unlucky to get two partitions with the same UUID on the same computer, and even so, your biggest problem will be referencing one of them by UUID when mounting, something very simple to fix.

The principle is the same as that of hashing. For mission-critical systems, you either won't depend on UUIDs, or will handle UUIDs properly so that they don't collide, or will treat collisions accordingly.

--Juliano (T) 17:00, 9 March 2008 (UTC)[reply]
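As an illustration of the "two autonomous systems" scenario described above, here is a minimal sketch: two simulated sites mint identifiers independently with no shared registry, and their objects are merged afterwards. The site labels, object counts and helper name are made up for the example; `uuid.uuid4()` is the standard-library random UUID generator.

```python
import uuid

def make_objects(label: str, count: int) -> dict:
    """Simulate one isolated system minting IDs with no central registry."""
    return {uuid.uuid4(): f"{label}-{i}" for i in range(count)}

site_a = make_objects("a", 10_000)
site_b = make_objects("b", 10_000)

# Merge later. A key collision would silently overwrite an object,
# so compare the sizes to detect one.
merged = {**site_a, **site_b}
collisions = len(site_a) + len(site_b) - len(merged)
print(collisions)
```

A system that cannot tolerate even a vanishingly unlikely overwrite could perform the same size check at merge time and reissue an identifier if it ever fires.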

And how is a filesystem not mission-critical? And give one example where you can't have a central registry. I don't understand why partitions have UUIDs- they're already registered in the drive's partition table, and each hard drive's partitions are numbered and made available by the BIOS. No need for an internet connection of course. Obviously there are cases where data needs to be merged and still have a unique identifier but if you actually give some processing to sorting through the data and preventing collisions, reissuing identifiers, and generally doing what a merge should be doing, then you have no problems. It's a lot slower than just issuing huge UUIDs to everything and crossing your fingers, which does work, but come on! How ugly! This is not the right thing. :D\=< (talk) 15:19, 10 March 2008 (UTC)[reply]

Yes, sure, a filesystem is mission-critical... but what is critical is the data contained in it, not its identification among all the partitions of the system. Once you build a mission-critical system, the UUIDs of the partitions are set in stone and won't change anymore during its mission-critical activity. You use UUIDs only to identify each partition among the others. Once mounted, the kernel uses its (major, minor) tuple to access it. And you are thinking too strictly... forget the BIOS and the drive's partition table. A computer is not restricted to having a single drive.

Suppose you have 2 hard drives on your computer, each one with a few partitions, and each partition identified by its UUID. If you consider the possibility of changing their internal connections, referring to them by their /dev/sdXY path becomes inherently broken. Using UUIDs, you can buy a new SATA PCI card and move one of the disks to it, and the system will boot with all mount points in place, as if nothing had changed. Or you buy a USB enclosure, put the other disk into it and plug it into the USB port; the system will still boot and mount all the partitions properly on their mount points, since they still have the same UUIDs they had when created. You bring a hard drive from a friend and plug it anywhere on your system, reassured that it would be easier to win the Lottery's full prize three consecutive times than to have his partitions collide with yours (unless he knew beforehand the UUID of one of your system's partitions and set it on his drive, but this is another problem).

With dynamic and redundant disk systems, like LVM and RAID, where disks and partitions may be freely moved and new ones added, UUIDs are even more important. You ask for one example where you can't have a central registry... disk partitioning is pretty much an excellent example. Creating and formatting RAID partitions on hardware RAID implementations is done by the BIOS, before the operating system is loaded. How can you think of using a central registry in this case?

I work with distributed computing, process migration and some grid computing. We use UUIDs extensively to identify nodes, tasks and all sort of objects being processed and passed around. Objects must be unique on a system that is not fully connected. Sections of the system get disconnected from the rest and get reconnected a few hours later. The idea of a central registry for UUIDs is simply crazy and stupid. Half the objects created don't even live for more than a few seconds. A central registry not only would slow everything down, it would break its mobility and create a single point of failure, for no added value. Such a distributed system is inherently designed with fault-tolerance in mind, since you may spawn a task to a node, the node crashes and you won't ever receive its answer back. UUID collision is not even considered, it is a non-issue.

And talking about BIOS, your worst nightmare is already becoming true. The BIOS is obsolete and is currently being replaced with the Extensible Firmware Interface, which in turn replaces the old and obsolete Master Boot Record with the new GUID Partition Table, heavily based on UUIDs. Sorry.

--Juliano (T) 17:19, 10 March 2008 (UTC)[reply]

Froth, I thought about this problem too. I guess I would feel safer if all programs that use UUIDs were written to handle UUID collisions safely, but I guess the probability of collision is known to be too small to bother with. I wonder if there are many life-and-death applications that blindly depend on uniqueness of UUIDs, e.g. avionics, railways, medical equipment. Perhaps only the most hardnosed engineer/mathematician-types would feel safe. Glueball (talk) 11:47, 21 April 2008 (UTC)[reply]

xxxxxxxx-xxxx-3xxx-xxxxxxxxxx

This pattern is supplied in the section on Version 3 UUIDs. Isn't there a group of four hex digits missing here? I hesitate to correct it myself in case I'm missing a key point about version 3. Jimgawn (talk) 18:36, 11 April 2008 (UTC)[reply]

Fixed. You are right. It should be (and now is) xxxxxxxx-xxxx-3xxx-xxxx-xxxxxxxxxxxx . --68.0.124.33 (talk) 06:07, 15 January 2009 (UTC)[reply]
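For anyone wanting to check the corrected pattern, here is a small sketch using Python's standard `uuid` module. The name "example.org" is just a placeholder; `uuid.NAMESPACE_DNS` is the predefined DNS namespace. The regular expression encodes the 8-4-4-4-12 grouping with the version digit 3 leading the third group.

```python
import re
import uuid

# Version 3 UUIDs are derived from an MD5 hash of a namespace and a name.
u = uuid.uuid3(uuid.NAMESPACE_DNS, "example.org")  # placeholder name
text = str(u)

# xxxxxxxx-xxxx-3xxx-xxxx-xxxxxxxxxxxx
PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-3[0-9a-f]{3}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

print(text)
print(bool(PATTERN.match(text)), u.version)
```

Because version 3 is name-based, calling `uuid.uuid3` again with the same namespace and name gives the same UUID.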

Also, it would be better to write "340 282 366 920 938 463 463 374 607 431 768 211 456 possible UUIDs" instead of "There are 340,282,366,920,938,463,463,374,607,431,768,211,456 possible UUIDs". — Preceding unsigned comment added by Palladipeloarancione (talk • contribs) 13:47, 5 April 2012 (UTC)[reply]

Where UUID stored?

Please say where the UUID number is stored in flash memory cards.

I have a CF card that shows up as

/dev/disk/by-uuid/2004-1223 -> ../../sdb1

and an SD card that doesn't seem to have a UUID. Can one use e.g., the GNU/Linux dd command to see where the UUID is stored? Jidanni (talk) 01:24, 13 July 2008 (UTC)[reply]

It is not a property of the SD card, but of the partitions contained in it. It is part of the filesystem, depending on which filesystem you have in it. This number (2004-1223) is most likely a 32-bit FAT volume serial number, which is not a UUID as described in this article.

This number is chosen when you create the filesystem on (i.e., format) the SD card. On Linux, you may force a given number by passing the -i xxxxxxxx parameter to mkdosfs.

The FAT article tells exactly where the serial number is stored inside the partition, but this is out of the scope of this article. UUIDs are not used by Windows to identify its partitions.

--Juliano (T) 17:35, 13 July 2008 (UTC)[reply]
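As a rough sketch of where that serial number lives, the snippet below decodes it from a boot sector held in memory. The offsets are my understanding of the FAT boot sector layout (0x27 for FAT12/FAT16, 0x43 for FAT32) and should be checked against the FAT article or specification; the function name and the synthetic sector are made up for the example.

```python
import struct

# Assumed offsets of the 4-byte, little-endian volume serial number.
FAT16_SERIAL_OFFSET = 0x27  # FAT12/FAT16 extended boot record
FAT32_SERIAL_OFFSET = 0x43  # FAT32 extended boot record

def fat_serial(boot_sector: bytes, fat32: bool = False) -> str:
    """Return the volume serial number in the XXXX-XXXX form shown by udev."""
    offset = FAT32_SERIAL_OFFSET if fat32 else FAT16_SERIAL_OFFSET
    (serial,) = struct.unpack_from("<I", boot_sector, offset)
    return f"{serial >> 16:04X}-{serial & 0xFFFF:04X}"

# Synthetic 512-byte sector carrying the serial from the question above.
sector = bytearray(512)
struct.pack_into("<I", sector, FAT16_SERIAL_OFFSET, 0x20041223)
print(fat_serial(bytes(sector)))  # 2004-1223
```

On a real system the first 512 bytes of the partition (for example read with dd, or by opening /dev/sdb1 in binary mode with sufficient privileges) would take the place of the synthetic sector.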

Is there an error or am I a bad counter?

I think that 8+4+4+4+12=32, not 36? —Preceding unsigned comment added by BartekBl (talk • contribs) 20:19, 14 November 2008 (UTC)[reply]

32 hexadecimal digits plus 4 dashes equals 36 characters in the textual representation. --Juliano (T) 11:31, 15 November 2008 (UTC)[reply]
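The count is easy to confirm with any UUID library. A minimal check with Python's standard `uuid` module (the variable names are only for illustration):

```python
import uuid

u = uuid.uuid4()
text = str(u)                 # canonical textual representation

groups = text.split("-")
hex_digits = sum(len(g) for g in groups)
dashes = text.count("-")

print([len(g) for g in groups])        # group sizes: 8, 4, 4, 4, 12
print(hex_digits, dashes, len(text))   # 32 digits + 4 dashes = 36 characters
print(len(u.bytes))                    # the underlying value is 16 bytes
```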

NCS UUID

When I first came across UUIDs they always seemed to be referred to as NCS UUIDs, but there is no mention of NCS in the article. I don't know where NCS fits in, but I would like to. —Preceding unsigned comment added by 170.148.215.156 (talk) 14:16, 5 March 2009 (UTC)[reply]

Too many implementations

Am I the only one who thinks that mentioning implementations of UUIDs in every possible language/environment (29 currently) only clutters the article and doesn't belong in an encyclopedia? I think the section should either be removed or shortened a lot. Svick (talk) 18:55, 7 October 2009 (UTC)[reply]

I think it's useful to continue to include them. They might be moved to List of UUID implementations, but that seems like overkill. Maybe change the list to a toggled table? Ant (talk) 15:29, 19 November 2010 (UTC)[reply]

Why do you think it's useful? Could you elaborate on that? Keep in mind that Wikipedia is not a repository of links. Svick (talk) 17:51, 20 November 2010 (UTC)[reply]

"Sufficient entropy" requirement for collision avoidance

However, these probabilities only hold when the UUIDs are generated using sufficient entropy. [1]

This is true, but it seems to gloss over an important factor. Some UUID allocation schemes deliberately run with reduced entropy. On the same page, for example, SQL Server's NEWSEQUENTIALID() is mentioned, and there's the practice of overwriting a portion of the GUID with predictable information, for example the wFormatTag/GUID conversion used in WAVE_FORMAT_EXTENSIBLE headers. I don't assert that either of these examples drastically reduces the safety of the system, but similar implementations may be more greedy with the number of predictable bits, and substantially increase the risk of collision. --ToobMug (talk) 13:36, 10 February 2010 (UTC)[reply]

Such schemes generally use the MAC address based form of UUIDs. Those do not depend on entropy for uniqueness, only on the improbability of generating two UUIDs exactly 2^60 / 10^7 seconds (which is more than 3600 years) apart and expecting them to differ. Everything else is by central authority: the first digits of the MAC address are allocated by the IEEE to the hardware maker, the remaining digits of the MAC are allocated by the hardware maker (one per card made), the 60-bit date/time is allocated by the clock on the computer with that card, the 14-bit reboot counter is allocated by the OS of that computer, and the variant and type bits are fixed by the standard. To generate sequential UUIDs with this scheme, simply reserve a continuous time interval of the needed number of 100 ns units, and allocate them all as a block from the system daemon/facility that ensures only one program gets the UUID for a given moment in time. 77.215.46.17 (talk) 23:38, 18 April 2011 (UTC)[reply]
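The wrap-around figure and the field widths mentioned above can be checked with a short sketch. It uses Python's standard `uuid.uuid1()`, which, as far as I know, may substitute a random node value when no MAC address can be determined, so the node printed here is not guaranteed to be a real hardware address. The 365.25-day year is my own assumption.

```python
import uuid

TICKS_PER_SECOND = 10**7      # the timestamp counts 100 ns units
TIMESTAMP_BITS = 60

# How long until a 60-bit count of 100 ns ticks wraps around.
wrap_seconds = 2**TIMESTAMP_BITS / TICKS_PER_SECOND
wrap_years = wrap_seconds / (365.25 * 24 * 3600)
print(round(wrap_years))      # a little over 3600 years

# Fields of a time-based (version 1) UUID.
u = uuid.uuid1()
print(u.version)              # 1
print(hex(u.time))            # 60-bit timestamp
print(u.clock_seq)            # 14-bit clock sequence
print(f"{u.node:012x}")       # 48-bit node (MAC address or substitute)
```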

Compress the Implementations section?

Would anybody mind if I replaced the entire Implementations section with just a list of languages in which implementations exist (to give a feel for how widespread adoption is) and all the citations? I think that would improve overall readability of the article and detract almost nothing from the content. Anybody who is looking for a specific implementation need only google "uuid <language name>". -- RoySmith (talk) 18:05, 26 June 2011 (UTC)[reply]

Yes, I would mind. Implementations are not always conformant (e.g. CouchDB's) or may offer differing features and licensing, target different compilers, etc. Lambda-mon key (talk) 01:17, 24 August 2011 (UTC)[reply]

"In perspective"

Regarding the paragraph "To put these numbers into perspective, one's annual risk of being hit by a meteorite...", this is bad exposition. People are notoriously bad at having an intuition for these types of odds, so if your goal is to assess risk, this isn't helpful. — Preceding unsigned comment added by 17.209.4.116 (talk) 06:19, 23 February 2012 (UTC)[reply]

entropy

The author wrote: "However, these probabilities only hold when the UUIDs are generated using sufficient entropy. Otherwise the probability of duplicates may be significantly higher, since the statistical dispersion may be lower."

So what is required to provide "sufficient entropy?" — Preceding unsigned comment added by Eostermueller (talk • contribs) 15:37, 24 February 2012 (UTC)[reply]
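One way to see what goes wrong without sufficient entropy is the sketch below. Two simulated hosts seed an ordinary pseudo-random generator from the same guessable value (a made-up boot timestamp) and build version-4-shaped UUIDs from it; the helper name and the seed are hypothetical. For contrast, `uuid.uuid4()` in CPython draws from the operating system's random source, to the best of my knowledge.

```python
import random
import uuid

def weak_uuid(rng: random.Random) -> uuid.UUID:
    """Build a version-4-shaped UUID from a caller-supplied PRNG (illustration only)."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Two "hosts" seeding from the same low-entropy value, e.g. a boot time in seconds.
host_a = random.Random(1330000000)
host_b = random.Random(1330000000)

ids_a = [weak_uuid(host_a) for _ in range(5)]
ids_b = [weak_uuid(host_b) for _ in range(5)]
print(ids_a == ids_b)  # True: identical seeds give identical "random" UUIDs

# A generator backed by the OS entropy source does not share this failure mode.
strong = uuid.uuid4()
print(strong)
```

The UUIDs from the weak generator look perfectly valid, which is the problem: the collision odds quoted in the article depend on the number of unpredictable bits that actually went in, not on the 128-bit width of the output.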

Citation for "e2fsprogs is used by all these people"

The current version of this article (2012-05-07T13:56:32) says that "Linux's ext2/ext3 filesystem, LUKS encrypted partitions, GNOME, KDE, and Mac OS X" all generate UUIDs (or GUIDs? paragraph is unclear) by using the "e2fsprogs" software. The citation for this links to what appears to be the e2fsprogs website, but (as of this writing) this webpage doesn't make any claims regarding what other software uses it to create UUIDs/GUIDs. Bowmanjj (talk) 16:11, 9 May 2012 (UTC)[reply]

I was bored so I dug around and confirmed that e2fsprogs does indeed implement UUID. It's under lib/uuid. A direct link that I hope works is http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=tree;f=lib/uuid;h=8b3114ef4e04e05248251da519633bcc982021ae;hb=HEAD Now, how to cite this? I have no idea. 68.190.112.86 (talk) 10:47, 6 July 2012 (UTC)[reply]

I've added some links, though they're a bit ugly. For Mac OS X, e2fsprogs' implementation appears to have been introduced in 10.4; 10.3.9's CFUUIDCreate() instead calls _CFUUIDGenerate() which appears to originate with DEC (via HP and the OSF). However, I find the sentence problematic because it conflates what UUIDs are used for (e.g. for identifying filesystems) with what they are used by (e.g. by the kernel, for finding the root FS) and what they are generated by (e.g. gen_uuid.c). I think the anecdote that a single implementation has been copied into many projects is worth mentioning, but could use a little rephrasing. I also changed the wording WRT ext2/ext3 — UUIDs are not used by "the filesystem" (the filesystem driver almost certainly doesn't care!), they're used to identify the filesystem across reboots/hotplugging/etc. ⇌Elektron 02:24, 11 July 2012 (UTC)[reply]

LVM UUID / Non-standard UUIDs

The Logical Volume Manager (Linux) seems to use a non-standard form of UUID using digits and upper and lowercase letters, for a space of size 62^32 = 2.27x10^57 or 2^190. An LVM UUID of "VexQRf-qHxg-dQ8N-AM6r-Xtf0-WvItBa" does not fit the standard referenced in this article. The LVM code at http://git.fedorahosted.org/git/?p=lvm2.git;a=tree;f=lib/uuid;hb=HEAD documents the implementation. Should the page cover non-standard UUIDs like this? Drf5n (talk) 20:15, 30 July 2012 (UTC)[reply]
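The size estimate above can be checked with a few lines. This sketch assumes, as the comment does, 32 characters drawn from digits plus upper and lowercase letters; it also counts the alphanumeric characters in the sample string exactly as quoted.

```python
import math
import string

ALPHABET = string.digits + string.ascii_letters   # 10 + 52 = 62 symbols
LENGTH = 32                                        # characters, per the comment above

space = len(ALPHABET) ** LENGTH
bits = LENGTH * math.log2(len(ALPHABET))
print(f"{space:.2e}")      # about 2.27e+57
print(round(bits, 1))      # between 190 and 191 bits

# Alphanumeric characters in the sample as quoted (dashes excluded).
sample = "VexQRf-qHxg-dQ8N-AM6r-Xtf0-WvItBa"
sample_chars = sum(c in ALPHABET for c in sample)
print(sample_chars)
```

The quoted sample contains 28 alphanumeric characters rather than 32, so it may have lost a group when it was copied; the 62^32 figure itself checks out.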

Not the only one horrified...

..but perhaps for a different reason, namely that UUIDs involve, in principle, the computer telling lies to the user about what an object's real name or identifier is. A disk partition is not /dev/sda1, it is actually 87937597593793753.... A user in Active Directory, likewise. This is obfuscation of the worst possible kind, and ought to be avoided if at all possible. It creates numerous operational difficulties where legitimate and normal changes to a computer give rise to unexpected and unpredictable results, and in some circumstances can cause backups to be unusable or data to be lost. --Anteaus (talk) 19:10, 13 September 2012 (UTC)[reply]

Fortunately, in most of the examples you mention, the computer is not lying. The thing has one or more names and one or more numbers; some of those numbers happen to be UUIDs. A disk partition is /dev/sda1 (meaning the first partition on the first disk that uses a libscsi driver with your current kernel and boot timing), partition # xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx in the GPT partition table on disk # yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy, the partition named "MyHomes" inside its superblock, the partition numbered zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz inside its own superblock and the partition currently mounted as /home until you change your mind. A user in Active Directory is user RID #1011 in domain foo.example (which has # S-1-5-21-yyyyyyyyy-yyyyyyyyy); it is also SID S-1-5-21-yyyyyyyyy-yyyyyyyyy-1011, user "John Doe", login username jd@foo.example and AD object {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}. Users can (with some difficulty) see all these numbers and names, but tend to prefer some to others.

The operational difficulties are mostly caused by the other numbers, not the UUIDs, or by trying to use configuration files that refer to UUIDs that were not restored (such as creating a new filesystem with a new UUID and then restoring an /etc/fstab that refers to the old UUIDs, or upgrading to a kernel that renames /dev/hda1 to /dev/sda1, as happened recently to all Linux users).

I will admit, though, that some recent operating systems (from both camps) tend to go out of their way to hide the real names of things from users, resulting in much operational difficulty. But the UUIDs are the least of the problems here; using pretty names such as "John Doe" while hiding the more specific names such as "jd" is a much bigger problem.

77.215.46.17 (talk) 03:43, 26 November 2012 (UTC)[reply]