[guid] How unique is UUID?

How safe is it to use UUID to uniquely identify something (I'm using it for files uploaded to the server)? As I understand it, it is based off random numbers. However, it seems to me that given enough time, it would eventually repeat itself, just by pure chance. Is there a better system or a pattern of some type to alleviate this issue?


For UUID4 I make it that there are approximately as many IDs as there are grains of sand in a cube-shaped box with sides 369,000km long. That's a box with sides ~2 1/2 times as long as Jupiter's diameter.

Working so someone can tell me if I've messed up units:

  • volume of grain of sand 0.00947mm^3 (Guardian)
  • UUID4 has 122 random bits -> 5.3e36 possible values (wikipedia)
  • volume of that many grains of sand = 5.0191e34 mm^3 or 5.0191e+25m^3
  • side length of cubic box with that volume = 3.69E8m or 369,000km
  • diameter of Jupiter: 139,820km (google)
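
The working above can be re-checked with a few lines of JavaScript. This is only a sketch of the same arithmetic, using the grain volume and Jupiter diameter quoted in the list; the variable names are mine.

```javascript
// Inputs taken from the working above.
const grainVolumeMm3 = 0.00947;        // volume of one grain of sand, mm^3
const uuid4Values = Math.pow(2, 122);  // 122 random bits, about 5.3e36 values

// Total volume of that many grains, converted from mm^3 to m^3 (1 m^3 = 1e9 mm^3).
const totalVolumeM3 = (uuid4Values * grainVolumeMm3) / 1e9;

// Side of a cube with that volume, in km.
const sideKm = Math.cbrt(totalVolumeM3) / 1000;

const jupiterDiameterKm = 139820;
const ratio = sideKm / jupiterDiameterKm;

console.log(uuid4Values.toExponential(2) + " possible values");
console.log(Math.round(sideKm) + " km per side");
console.log(ratio.toFixed(2) + " Jupiter diameters");
```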

Here's a testing snippet for you to test its uniqueness, inspired by @scalabl3's comment:

Funny thing is, you could generate 2 in a row that were identical, of course at mind-boggling levels of coincidence, luck and divine intervention, yet despite the unfathomable odds, it's still possible! :D Yes, it won't happen. just saying for the amusement of thinking about that moment when you created a duplicate! Screenshot video! – scalabl3 Oct 20 '15 at 19:11

If you feel lucky, check the checkbox; it then only compares the two IDs generated in the current run. If you want a check against the full history, leave it unchecked. Please note that you might run out of RAM at some point if you leave it unchecked. I tried to make it CPU-friendly so you can abort quickly when needed: just hit the run snippet button again or leave the page.

// Polyfill Math.log2 for older browsers
Math.log2 = Math.log2 || function(n){ return Math.log(n) / Math.log(2); };
Math.trueRandom = (function() {
  var crypt = window.crypto || window.msCrypto;

  if (crypt && crypt.getRandomValues) {
      // if we have a crypto library, use it
      var random = function(min, max) {
          var rval = 0;
          var range = max - min;
          if (range < 2) {
              return min;
          }

          var bits_needed = Math.ceil(Math.log2(range));
          if (bits_needed > 53) {
            throw new Error("We cannot generate numbers larger than 53 bits.");
          }
          var bytes_needed = Math.ceil(bits_needed / 8);
          var mask = Math.pow(2, bits_needed) - 1;
          // e.g. range 7776 -> 13 bits -> mask 2^13 - 1 = 8191 (binary 00011111 11111111)

          // Create byte array and fill with N random numbers
          var byteArray = new Uint8Array(bytes_needed);
          crypt.getRandomValues(byteArray);

          var p = (bytes_needed - 1) * 8;
          for(var i = 0; i < bytes_needed; i++ ) {
              rval += byteArray[i] * Math.pow(2, p);
              p -= 8;
          }

          // Use & to apply the mask and reduce the number of recursive lookups
          // (& works on 32-bit integers, which is enough for the 30-bit range used below)
          rval = rval & mask;

          if (rval >= range) {
              // Integer out of acceptable range
              return random(min, max);
          }
          // Return an integer that falls within the range
          return min + rval;
      }
      return function() {
          var r = random(0, 1000000000) / 1000000000;
          return r;
      };
  } else {
      // From http://baagoe.com/en/RandomMusings/javascript/
      // Johannes Baagøe <[email protected]>, 2010
      function Mash() {
          var n = 0xefc8249d;

          var mash = function(data) {
              data = data.toString();
              for (var i = 0; i < data.length; i++) {
                  n += data.charCodeAt(i);
                  var h = 0.02519603282416938 * n;
                  n = h >>> 0;
                  h -= n;
                  h *= n;
                  n = h >>> 0;
                  h -= n;
                  n += h * 0x100000000; // 2^32
              }
              return (n >>> 0) * 2.3283064365386963e-10; // 2^-32
          };

          mash.version = 'Mash 0.9';
          return mash;
      }

      // From http://baagoe.com/en/RandomMusings/javascript/
      function Alea() {
          return (function(args) {
              // Johannes Baagøe <[email protected]>, 2010
              var s0 = 0;
              var s1 = 0;
              var s2 = 0;
              var c = 1;

              if (args.length == 0) {
                  args = [+new Date()];
              }
              var mash = Mash();
              s0 = mash(' ');
              s1 = mash(' ');
              s2 = mash(' ');

              for (var i = 0; i < args.length; i++) {
                  s0 -= mash(args[i]);
                  if (s0 < 0) {
                      s0 += 1;
                  }
                  s1 -= mash(args[i]);
                  if (s1 < 0) {
                      s1 += 1;
                  }
                  s2 -= mash(args[i]);
                  if (s2 < 0) {
                      s2 += 1;
                  }
              }
              mash = null;

              var random = function() {
                  var t = 2091639 * s0 + c * 2.3283064365386963e-10; // 2^-32
                  s0 = s1;
                  s1 = s2;
                  return s2 = t - (c = t | 0);
              };
              random.uint32 = function() {
                  return random() * 0x100000000; // 2^32
              };
              random.fract53 = function() {
                  return random() +
                      (random() * 0x200000 | 0) * 1.1102230246251565e-16; // 2^-53
              };
              random.version = 'Alea 0.9';
              random.args = args;
              return random;

          }(Array.prototype.slice.call(arguments)));
      };
      return Alea();
  }
}());

// Build a version 4 style id: 'x' is any hex digit, 'y' is one of 8, 9, a, b
Math.guid = function() {
    return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c)    {
      var r = Math.trueRandom() * 16 | 0,
          v = c == 'x' ? r : (r & 0x3 | 0x8);
      return v.toString(16);
  });
};
function logit(item1, item2) {
    console.log("Do "+item1+" and "+item2+" equal? "+(item1 == item2 ? "OMG! take a screenshot and you'll be epic on the world of cryptography, buy a lottery ticket now!":"No they do not. shame. no fame")+ ", runs: "+window.numberofRuns);
}
numberofRuns = 0;
function test() {
   window.numberofRuns++;
   var x = Math.guid();
   var y = Math.guid();
   var test = x == y || historyTest(x,y);

   logit(x,y);
   return test;

}
historyArr = [];
historyCount = 0;
function historyTest(item1, item2) {
    if(window.luckyDog) {
       return false;
    }
    // compare both new ids against every id generated so far
    for(var i = historyCount - 1; i > -1; i--) {
        logit(item1,window.historyArr[i]);
        if(item1 == window.historyArr[i]) {
            return true;
        }
        logit(item2,window.historyArr[i]);
        if(item2 == window.historyArr[i]) {
            return true;
        }

    }
    window.historyArr.push(item1);
    window.historyArr.push(item2);
    window.historyCount+=2;
    return false;
}
luckyDog = false;
document.body.onload = function() {
document.getElementById('runit').onclick  = function() {
window.luckyDog = document.getElementById('lucky').checked;
var val = document.getElementById('input').value;
if(val.trim() == '0') {
    // 0 means: keep going until a duplicate shows up
    var intervaltimer = window.setInterval(function() {
         var test = window.test();
         if(test) {
            window.clearInterval(intervaltimer);
         }
    },0);
}
else {
   var num = parseInt(val, 10);
   if(num > 0) {
        var intervaltimer = window.setInterval(function() {
         var test = window.test();
         num--;
         // stop after the requested number of runs or on a duplicate
         if(num <= 0 || test) {
            window.clearInterval(intervaltimer);
         }
    },0);
   }
}
};
};

Please input how often the calculation should run. Set to 0 for forever. Check the checkbox if you feel lucky.<BR/>
<input type="text" value="0" id="input"><input type="checkbox" id="lucky"><button id="runit">Run</button><BR/>


Depending on the version, UUID schemes use not only a pseudo-random element but also the current system time and some sort of often-unique hardware ID if available, such as a network MAC address.

The whole point of using UUID is that you trust it to do a better job of providing a unique ID than you yourself would be able to do. This is the same rationale behind using a 3rd party cryptography library rather than rolling your own. Doing it yourself may be more fun, but it's typically less responsible to do so.
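
In practice, trusting the UUID implementation usually means calling whatever generator the platform already ships. A minimal sketch, assuming Node.js, where `crypto.randomUUID()` returns a version 4 UUID; `newUploadId` and `UUID_V4` are hypothetical names used only for illustration.

```javascript
const { randomUUID } = require("node:crypto");

// Ask the runtime for a version 4 UUID instead of assembling one by hand.
// newUploadId is a hypothetical helper name, not part of any library.
function newUploadId() {
  return randomUUID();
}

// Shape of a version 4 UUID: version digit 4, variant digit 8, 9, a or b.
const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

const id = newUploadId();
console.log(id, UUID_V4.test(id));
```

Recent browsers expose the same call as `crypto.randomUUID()` in secure contexts.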


Quoting from Wikipedia:

Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else

It goes on to explain in pretty good detail how safe it actually is. So to answer your question: Yes, it's safe enough.


I concur with the other answers. UUIDs are safe enough for nearly all practical purposes [1], and certainly for yours.

But suppose (hypothetically) that they aren't.

Is there a better system or a pattern of some type to alleviate this issue?

Here are a couple of approaches:

  1. Use a bigger UUID. For instance, instead of 128 bits (122 of them random in a type-4 UUID), use 256 or 512 or ... Each random bit you add to a type-4 style UUID will reduce the probability of a collision by a half, assuming that you have a reliable source of entropy [2].

  2. Build a centralized or distributed service that generates UUIDs and records each and every one it has ever issued. Each time it generates a new one, it checks that the UUID has never been issued before. Such a service would be technically straightforward to implement (I think) if we assumed that the people running the service were absolutely trustworthy, incorruptible, et cetera. Unfortunately, they aren't ... especially when there is the possibility of governments' security organizations interfering. So, this approach is probably impractical, and may be [3] impossible in the real world.
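
Both approaches can be sketched in a few lines, assuming Node.js. `wideId` and `IdRegistry` are hypothetical names, and the in-memory `Set` only stands in for the durable, shared store a real service would need; none of the trust problems described above are addressed here.

```javascript
const { randomBytes } = require("node:crypto");

// Approach 1: a wider random identifier (256 bits by default), hex encoded.
function wideId(bits = 256) {
  return randomBytes(bits / 8).toString("hex");
}

// Approach 2: a registry that remembers everything it has issued.
// The Set is an in-memory stand-in for a real persistent store.
class IdRegistry {
  constructor(generate = wideId) {
    this.generate = generate;
    this.issued = new Set();
  }

  issue() {
    let id = this.generate();
    // Regenerate on the (vanishingly unlikely) chance of a repeat.
    while (this.issued.has(id)) {
      id = this.generate();
    }
    this.issued.add(id);
    return id;
  }
}

const registry = new IdRegistry();
console.log(registry.issue());
```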


1 - If uniqueness of UUIDs determined whether nuclear missiles got launched at your country's capital city, a lot of your fellow citizens would not be convinced by "the probability is extremely low". Hence my "nearly all" qualification.

2 - And here's a philosophical question for you. Is anything ever truly random? How would we know if it wasn't? Is the universe as we know it a simulation? Is there a God who might conceivably "tweak" the laws of physics to alter an outcome?

3 - If anyone knows of any research papers on this problem, please comment.


Been doing it for years. Never run into a problem.

I usually set up my DBs to have one table that contains all the keys, the modified dates and such. I have never run into a problem of duplicate keys.

The only drawback is that when you are writing queries to find some information quickly, you end up doing a lot of copying and pasting of the keys. You don't have short, easy-to-remember IDs anymore.


There is more than one type of UUID, so "how safe" depends on which type (which the UUID specifications call "version") you are using.

  • Version 1 is the time-based plus MAC address UUID. The 128 bits contain 48 bits for the network card's MAC address (which is uniquely assigned by the manufacturer) and a 60-bit clock with a resolution of 100 nanoseconds. A full 60-bit count of 100-nanosecond ticks spans about 3,653 years, so that clock wraps around 5236 A.D. and these UUIDs are safe at least until then (unless you need more than 10 million new UUIDs per second or someone clones your network card). I say "at least" because the clock starts at 15 October 1582, so you have about 400 years after the clock wraps before there is even a small possibility of duplications.

  • Version 4 is the random number UUID. There are six fixed bits and the rest of the UUID is 122 bits of randomness. See Wikipedia or other analyses that describe how very unlikely a duplicate is.

  • Version 3 uses MD5 and Version 5 uses SHA-1 to create those 122 bits, instead of a random or pseudo-random number generator. So in terms of safety it is like Version 4, a statistical issue (as long as you make sure what the digest algorithm is processing is always unique).

  • Version 2 is similar to Version 1, but with a smaller clock so it is going to wrap around much sooner. But since Version 2 UUIDs are for DCE, you shouldn't be using these.

So for all practical problems they are safe. If you are uncomfortable with leaving it up to probabilities (e.g. you are the type of person worried about the earth getting destroyed by a large asteroid in your lifetime), just make sure you use a Version 1 UUID and it is guaranteed to be unique (in your lifetime, unless you plan to live past 5236 A.D.).

So why doesn't everyone simply use Version 1 UUIDs? That is because Version 1 UUIDs reveal the MAC address of the machine they were generated on and they can be predictable -- two things which might have security implications for the application using those UUIDs.
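
To find out which version a given UUID is, look at the first hex digit of the third group in its canonical string form. A small sketch; `uuidVersion` is a hypothetical helper name, and the two sample values are a well-known version 1 namespace UUID from the UUID specification and an arbitrary version 4 example.

```javascript
// Return the version number encoded in a canonical UUID string, or null if
// the string does not have the 8-4-4-4-12 hex layout.
function uuidVersion(uuid) {
  const match = /^[0-9a-f]{8}-[0-9a-f]{4}-([0-9a-f])[0-9a-f]{3}-[0-9a-f]{4}-[0-9a-f]{12}$/i.exec(uuid);
  return match ? parseInt(match[1], 16) : null;
}

console.log(uuidVersion("6ba7b810-9dad-11d1-80b4-00c04fd430c8")); // 1
console.log(uuidVersion("f47ac10b-58cc-4372-a567-0e02b2c3d479")); // 4
console.log(uuidVersion("not-a-uuid"));                           // null
```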


I don't know if this matters to you, but keep in mind that GUIDs are globally unique, but substrings of GUIDs aren't.
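
This is easy to see by truncating freshly generated UUIDs. A sketch assuming Node.js; `countDuplicates` is a hypothetical helper. With 5000 UUIDs and only 16^3 = 4096 possible three-character prefixes, repeated prefixes are guaranteed, while the full UUIDs stay distinct for all practical purposes.

```javascript
const { randomUUID } = require("node:crypto");

// Count how many values in the list repeat an earlier value.
function countDuplicates(values) {
  const seen = new Set();
  let duplicates = 0;
  for (const value of values) {
    if (seen.has(value)) duplicates++;
    seen.add(value);
  }
  return duplicates;
}

const ids = Array.from({ length: 5000 }, () => randomUUID());
const prefixes = ids.map((id) => id.slice(0, 3)); // only 4096 possible prefixes

console.log("full UUID duplicates:", countDuplicates(ids));
console.log("3-character prefix duplicates:", countDuplicates(prefixes));
```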


If by "given enough time" you mean 100 years and you're creating them at a rate of a billion a second, then yes, you have a 50% chance of having a collision after 100 years.
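
That figure comes from the birthday problem. A rough check, assuming 122 random bits and the usual approximation p ≈ 1 - exp(-n^2 / (2N)); the names are mine. At a billion UUIDs per second it comes out at roughly 86 years, the same ballpark as the 100 years quoted above.

```javascript
const N = Math.pow(2, 122); // number of possible version 4 UUIDs

// Approximate probability of at least one collision among n random UUIDs:
// p ≈ 1 - exp(-n^2 / (2N)). expm1 keeps precision when p is tiny.
function collisionProbability(n) {
  return -Math.expm1(-(n * n) / (2 * N));
}

// Number of UUIDs needed for a 50% chance of a collision: n ≈ sqrt(2 N ln 2).
const nHalf = Math.sqrt(2 * N * Math.LN2);

const perSecond = 1e9;
const secondsPerYear = 365.25 * 24 * 3600;
const years = nHalf / perSecond / secondsPerYear;

console.log(nHalf.toExponential(3) + " UUIDs for a 50% collision chance");
console.log(years.toFixed(1) + " years at a billion per second");
console.log(collisionProbability(1e12).toExponential(2) + " chance after a trillion UUIDs");
```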


The answer to this may depend largely on the UUID version.

Many UUID generators use a version 4 random number. However, many of these use a Pseudo-Random Number Generator (PRNG) to generate them.

If a poorly seeded PRNG with a small period is used to generate the UUID, I would say it's not very safe at all. Some random number generators also have a poor distribution, i.e. they favour certain numbers more often than others. This isn't going to work well.

Therefore, it's only as safe as the algorithms used to generate it.

On the flip side, if you know the answer to these questions then I think a version 4 UUID should be very safe to use. In fact I'm using it to identify blocks on a network block file system and so far have not had a clash.

In my case, the PRNG I'm using is a Mersenne Twister and I'm being careful with the way it's seeded, which is from multiple sources including /dev/urandom. Mersenne Twister has a period of 2^19937 - 1. It's going to be a very, very long time before I see a repeat UUID.

So pick a good library or generate it yourself and make sure you use a decent PRNG algorithm.
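
If you do generate it yourself, the layout is simple: take 16 random bytes and overwrite the six fixed bits. A sketch assuming Node.js, with the operating system's cryptographic generator (`crypto.randomBytes`) swapped in as the random source rather than the Mersenne Twister mentioned above; `uuid4FromBytes` is a hypothetical name.

```javascript
const { randomBytes } = require("node:crypto");

// Build a version 4 UUID from 16 bytes, by default taken from the
// OS-backed cryptographic random number generator.
function uuid4FromBytes(bytes = randomBytes(16)) {
  const b = Buffer.from(bytes);   // copy so the caller's bytes are untouched
  b[6] = (b[6] & 0x0f) | 0x40;    // 4 fixed bits: version 4
  b[8] = (b[8] & 0x3f) | 0x80;    // 2 fixed bits: variant 10xx
  const hex = b.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}

console.log(uuid4FromBytes());
```

The quality of the result is exactly the quality of the 16 input bytes, which is the point made above about choosing the generator carefully.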