Bug 19836 – Excessive probability of UUID collisions in std.uuid.randomUUID

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P1
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2019-04-28T19:04:44Z
Last change time
2019-05-18T00:39:15Z
Keywords
pull
Assigned to
No Owner
Creator
Nathan S.

Comments

Comment #0 by n8sh.secondary — 2019-04-28T19:04:44Z
`std.uuid.randomUUID` defaults to using `rndGen`. Because every `rndGen` starts in one of 2^^32 states then if 77000 independent programs each generate a single UUID there is a 50% chance that at least two of them generate the same initial UUID (and all subsequent UUIDs from those programs would be identical as well). This problem is shared by C++ boost::uuids::random_generator which also generates UUIDs using a PRNG initialized with a 32-bit seed.
Comment #1 by dlang-bot — 2019-04-28T19:10:22Z
@n8sh created dlang/phobos pull request #6985 "Fix Issue 19836 - Excessive probability of UUID collisions in std.uuid.randomUUID" fixing this issue: - Fix Issue 19836 - Excessive probability of UUID collisions in std.uuid.randomUUID On 64-bit architectures use 64 bits of entropy to initialize thread-local `rndGen`. The motivation for this change is std.uuid defaults to using `rndGen` to generate UUIDs. If every `rndGen` starts in one of 2^^32 states then if 77000 independent programs each generate a single UUID there is a 50% chance that at least two of them generate the same initial UUID (and all subsequent UUIDs would be identical as well). Not just Phobos but also C++ boost::uuids::random_generator defaults to generating UUIDs with a Mersenne Twister initialized from a 32-bit seed, exacerbating the collision problem further. If instead there are 2^^64 possible initial states of `rndGen` there can be over 5 billion independent `rndGen`s before there is a 50% chance of two having identical initial states. This change is limited to 64-bit architectures to avoid a measurable performance decrease, because many programs are not generating UUIDs. https://github.com/dlang/phobos/pull/6985
Comment #2 by dlang-bot — 2019-04-29T07:50:16Z
dlang/phobos pull request #6985 "Fix Issue 19836 - Excessive probability of UUID collisions in std.uuid.randomUUID" was merged into master: - 5d0f1d3471b95d6d22406001f3e7ba52b4691aa2 by Nathan Sashihara: Fix Issue 19836 - Excessive probability of UUID collisions in std.uuid.randomUUID On 64-bit architectures use 64 bits of entropy to initialize thread-local `rndGen`. The motivation for this change is std.uuid defaults to using `rndGen` to generate UUIDs. If every `rndGen` starts in one of 2^^32 states then if 77000 independent programs each generate a single UUID there is a 50% chance that at least two of them generate the same initial UUID (and all subsequent UUIDs would be identical as well). Not just Phobos but also C++ boost::uuids::random_generator defaults to generating UUIDs with a Mersenne Twister initialized from a 32-bit seed, exacerbating the collision problem further. If instead there are 2^^64 possible initial states of `rndGen` there can be over 5 billion independent `rndGen`s before there is a 50% chance of two having identical initial states. This change is limited to 64-bit architectures to avoid a measurable performance decrease, because many programs are not generating UUIDs. https://github.com/dlang/phobos/pull/6985
Comment #3 by dlang-bot — 2019-05-18T00:39:15Z
dlang/phobos pull request #6994 "Issue 19836 followup - also fix on 32 bit machines" was merged into master: - 8d499765142b2bb7c97ca07d7f8b9afc17cb5776 by Nathan Sashihara: Issue 19836 followup - also fix on 32 bit machines Changed the private enhanced seeding method for `rndGen` to something that is fast on both 64 bit and 32 bit machines so can be enabled regardless of architecture. When compiled with LDC it is about 1.35x the speed of public `Mt19937.seed(uint)`. https://github.com/dlang/phobos/pull/6994