
Bug 10550 – Xorshift32 and Xorshift160 do not generate uniformly-distributed random numbers

Status: RESOLVED
Resolution: FIXED
Severity: major
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2013-07-05T07:11:03Z
Last change time: 2018-03-27T22:46:45Z
Assigned to: Masahiro Nakagawa
Creator: Joseph Rushton Wakeling

Comments

Comment #0 by joseph.wakeling — 2013-07-05T07:11:03Z

The output of Xorshift32 and Xorshift160 departs strongly from uniformity.

The issue for Xorshift32 can be seen clearly even in this very simple piece of code:

    auto rng = Xorshift32(unpredictableSeed);
    foreach (r; rng.take(20))
    {
        writeln(r, "\t", rng.min, "\t", rng.max);
    }

... where we can see that the generated values are always much closer to rng.min than rng.max.

A more sophisticated (but still very simple) example is provided by the test of uniformity in https://github.com/WebDrake/std.random.test -- specifically, in:

    https://github.com/WebDrake/std.random.test/blob/master/uniform.d
    https://github.com/WebDrake/std.random.test/blob/master/test/stats.d

... which generates millions of random numbers using uniform(0.0, 1.0, rng), divides them up into bins of width 0.05, and outputs the histogram and cumulative distribution values. Each row gives the upper edge of the bin (as a percentage of the range), the fraction of samples in that bin, and the cumulative fraction.

For Xorshift32, this gives:

    Generating 10000000 random numbers in [0, 1) with XorshiftEngine!(uint, 32, 13, 17, 5)
      5    1    1
     10    0    1
     15    0    1
     20    0    1
     25    0    1
     30    0    1
     35    0    1
     40    0    1
     45    0    1
     50    0    1
     55    0    1
     60    0    1
     65    0    1
     70    0    1
     75    0    1
     80    0    1
     85    0    1
     90    0    1
     95    0    1
    100    0    1

... while for Xorshift160 it gives:

    Generating 10000000 random numbers in [0, 1) with XorshiftEngine!(uint, 160, 2, 1, 4)
      5    0.0535713    0.0535713
     10    0.0208334    0.0744047
     15    0.0446429    0.119048
     20    0.047619     0.166667
     25    0.0714286    0.238095
     30    0.0238096    0.261905
     35    0.0208333    0.282738
     40    0.0416667    0.324405
     45    0.0595238    0.383929
     50    0.0446429    0.428572
     55    0.0476191    0.476191
     60    0.0595239    0.535714
     65    0.0714285    0.607143
     70    0.0535714    0.660714
     75    0.0535713    0.714286
     80    0.0744045    0.78869
     85    0.0744049    0.863095
     90    0.0357143    0.898809
     95    0.0238095    0.922619
    100    0.0773811    1

Other RNGs, including other Xorshift types, appear to generate correct proportions.
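
For readers who want to repeat this kind of check without the linked repository, here is a minimal sketch of the binning step. It is not the code from std.random.test: the function name binFractions, the fixed seed, and the choice of Mt19937 as the generator are assumptions made for illustration. It draws from [0, 1) to match the output header, uses 20 bins of width 0.05, and prints the same three columns.

```d
import std.random : Mt19937, uniform;
import std.stdio : writefln;

/// Fraction of `samples` draws from uniform(0.0, 1.0, rng) that land in
/// each of `bins` equal-width bins. (Hypothetical helper, not from Phobos.)
double[] binFractions(Rng)(ref Rng rng, size_t samples, size_t bins = 20)
{
    auto counts = new size_t[bins];
    foreach (i; 0 .. samples)
    {
        immutable u = uniform(0.0, 1.0, rng);
        auto idx = cast(size_t)(u * bins);
        if (idx >= bins)
            idx = bins - 1; // guard against rounding at the upper edge
        ++counts[idx];
    }

    auto fractions = new double[bins];
    foreach (i, n; counts)
        fractions[i] = cast(double) n / samples;
    return fractions;
}

void main()
{
    auto rng = Mt19937(12345); // arbitrary fixed seed so runs are repeatable
    double cumulative = 0.0;
    foreach (i, f; binFractions(rng, 1_000_000))
    {
        cumulative += f;
        // bin upper edge in percent, fraction in bin, cumulative fraction
        writefln("%3d\t%.6f\t%.6f", (i + 1) * 5, f, cumulative);
    }
}
```

Any generator type accepted by uniform can be substituted for Mt19937. A uniform generator should give a fraction near 0.05 in every row.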

Comment #1 by joseph.wakeling — 2013-07-05T07:35:07Z

The Xorshift32 non-uniformity can be fixed by correcting the update rule in popFront(), from:

    static if (bits == 32)
    {
        temp      = seeds_[0] ^ (seeds_[0] << a);
        temp      = temp >> b;
        seeds_[0] = temp ^ (temp << c);
    }

to:

    static if (bits == 32)
    {
        temp      = seeds_[0] ^ (seeds_[0] << a);
        temp      = temp ^ (temp >> b);
        seeds_[0] = temp ^ (temp << c);
    }

See p.3 of http://www.jstatsoft.org/v08/i14/paper -- the current implementation appears to be a typo when copying the first from the list of possible update rules.

However, if this change is made, the Xorshift unittests fail for the checks against the reference edition:

    auto checking = [
        [2463534242UL, 267649, 551450, 53765, 108832, 215250, 435468, 860211, 660133, 263375],
        [362436069UL, 2113136921, 19051112, 3010520417, 951284840, 1213972223, 3173832558, 2611145638, 2515869689, 2245824891],
        [521288629UL, 1950277231, 185954712, 1582725458, 3580567609, 2303633688, 2394948066, 4108622809, 1116800180, 3357585673],
        [88675123UL, 3701687786, 458299110, 2500872618, 3633119408, 516391518, 2377269574, 2599949379, 717229868, 137866584],
        [5783321UL, 93724048, 491642011, 136638118, 246438988, 238186808, 140181925, 533680092, 285770921, 462053907],
        [0UL, 246875399, 3690007200, 1264581005, 3906711041, 1866187943, 2481925219, 2464530826, 1604040631, 3653403911]
    ];

    alias TypeTuple!(Xorshift32, Xorshift64, Xorshift96, Xorshift128, Xorshift160, Xorshift192) XorshiftTypes;

    foreach (I, Type; XorshiftTypes)
    {
        Type rnd;

        foreach (e; checking[I])
        {
            assert(rnd.front == e);
            rnd.popFront();
        }
    }
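
The effect of the missing XOR can be seen by running the two rules side by side on plain uint values. The sketch below is standalone and does not use std.random; the function names stepBuggy and stepFixed are made up for illustration. In the buggy rule, temp >> 17 leaves at most 15 significant bits, so the result of temp ^ (temp << 5) is always below 2^20, a tiny sliver of the 32-bit range. That would explain why every sample falls into the first histogram bin.

```d
import std.stdio : writefln;

/// Update rule as quoted from popFront() before the fix: the middle step
/// overwrites temp with (temp >> b) instead of XOR-ing the shift in.
uint stepBuggy(uint x, uint a = 13, uint b = 17, uint c = 5)
{
    uint temp = x ^ (x << a);
    temp = temp >> b;
    return temp ^ (temp << c);
}

/// Corrected rule: each step XORs the value with a shifted copy of itself.
uint stepFixed(uint x, uint a = 13, uint b = 17, uint c = 5)
{
    uint temp = x ^ (x << a);
    temp = temp ^ (temp >> b);
    return temp ^ (temp << c);
}

void main()
{
    // Seed taken from the first entry of the Xorshift32 checking row.
    uint buggy = 2463534242u;
    uint fixed = 2463534242u;
    foreach (i; 0 .. 5)
    {
        buggy = stepBuggy(buggy);
        fixed = stepFixed(fixed);
        writefln("%10d  %10d", buggy, fixed);
    }
}
```

The first buggy output from this seed is 267649, the second entry of the Xorshift32 row in the checking array, which is consistent with those reference values having been generated from the uncorrected rule.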

Comment #2 by joseph.wakeling — 2013-07-05T07:35:38Z

I also think the choice of a, b, c values may be in error: currently we have

    alias XorshiftEngine!(uint, 32, 13, 17, 5) Xorshift32;

... but I think this is most likely a typo for

    alias XorshiftEngine!(uint, 32, 13, 17, 15) Xorshift32;

as the paper states that there should be a < c and the triple 13, 17, 5 is not found among the list of valid triples (while 13, 17, 15 is).

However, correcting this does not prevent the unittest fail described in the previous comment.

Comment #3 by joseph.wakeling — 2013-07-05T07:38:56Z

(In reply to comment #2)
> I also think the choice of a, b, c values may be in error: currently we have
>
>     alias XorshiftEngine!(uint, 32, 13, 17, 5) Xorshift32;

This seems to stem from a descriptive passage at the top of p.4 of the paper, where this choice of a, b, c values is described as one of the author's favourites. However, I suspect that this is a typo as it violates the a < c rule described elsewhere in the paper and (as already stated) the 13, 17, 5 triple is not found in the table of appropriate triples for 32-bit Xorshift.

Comment #4 by joseph.wakeling — 2013-07-05T08:12:24Z

Uniformity in Xorshift160 can be restored by tweaking the update rules. The current rules are:

    else static if (bits == 160)
    {
        temp      = seeds_[0] ^ (seeds_[0] >> a);
        seeds_[0] = seeds_[1];
        seeds_[1] = seeds_[2];
        seeds_[2] = seeds_[3];
        seeds_[3] = seeds_[4];
        seeds_[4] = seeds_[4] ^ (seeds_[4] >> c) ^ temp ^ (temp >> b);
    }

... which faithfully reproduce what is given at the bottom of p.4 of the paper. Changing the first line to:

    temp = seeds_[0] ^ (seeds_[0] << a);

... restores uniformity.

Note that this change was pure guesswork on the grounds that other bit-values of the algorithm had this alternative formulation. It also results in a failure of the unittest on line 1032 of std.random.

Unfortunately George Marsaglia has died, so we can't ask him about typos in his papers. :-( I was not able to find any erratum to the published article, but the discrepancies already identified make me suspect that it must be in error in several places.
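
To make the two variants easy to compare, the sketch below writes the 160-bit update as a standalone function over a five-word state. The function name step160, the leftShiftA flag, and the starting state are assumptions for illustration, not Phobos code. The triple (2, 1, 4) is the one shown in the XorshiftEngine!(uint, 160, 2, 1, 4) output header.

```d
import std.stdio : writeln;

/// One update of the five-word state. With leftShiftA == false the first
/// line uses s[0] >> a (the rule quoted from std.random); with true it
/// uses s[0] << a (the proposed change).
void step160(ref uint[5] s, bool leftShiftA, uint a = 2, uint b = 1, uint c = 4)
{
    immutable uint temp = leftShiftA ? s[0] ^ (s[0] << a)
                                     : s[0] ^ (s[0] >> a);
    s[0] = s[1];
    s[1] = s[2];
    s[2] = s[3];
    s[3] = s[4];
    // s[4] still holds its old value on the right-hand side here.
    s[4] = s[4] ^ (s[4] >> c) ^ temp ^ (temp >> b);
}

void main()
{
    uint[5] asQuoted = [1, 2, 3, 4, 5]; // arbitrary non-zero start state
    uint[5] proposed = asQuoted;
    foreach (i; 0 .. 5)
    {
        step160(asQuoted, false);
        step160(proposed, true);
        writeln(asQuoted[4], "\t", proposed[4]);
    }
}
```

Feeding the last state word from each variant into a histogram test is one way to check which direction gives uniform output.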

Comment #5 by bearophile_hugs — 2013-07-08T01:46:42Z

Maybe this bug should have a "major" importance. And maybe a warning note in site ddocs should be added in the meantime.

These tests can help:
http://en.wikipedia.org/wiki/Diehard_tests

> Unfortunately George Marsaglia has died, so we can't ask him about typos in his papers.

Some people should live longer. Probably some of his collaborators or people that have used his generators have some errata list or some suggestions to help.

Comment #6 by joseph.wakeling — 2013-07-08T02:46:40Z

(In reply to comment #5)
> Maybe this bug should have a "major" importance. And maybe a warning note in
> site ddocs should be added in the meantime.

Agree with the "major" importance, tweaked accordingly.

> These tests can help:
> http://en.wikipedia.org/wiki/Diehard_tests

These are now a little out of date, but there is the "dieharder" suite which is actually available as a utility in many Linux distros. I plan on incorporating that into my test suite -- I think it should be as simple as just getting D to generate random variates using whatever method and then piping them through to dieharder.

> Some people should live longer. Probably some of his collaborators or people
> that have used his generators have some errata list or some suggestions to
> help.

I did not find anything yet, but I'll keep looking. I assumed that his homepage would still exist with remarks like this on it, but didn't track down anything useful so far. There are other people who've written follow-up papers on Xorshift who could be worth contacting. I may do that if I can't find obvious documentary material.
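
A minimal sketch of the piping idea follows: a D program that writes raw 32-bit words to standard output for an external test suite to consume. The helper name fillWords, the block size, and the seed are assumptions for illustration. The loop is intentionally endless and is expected to stop only when the consumer closes the pipe.

```d
import std.random : Xorshift32;
import std.stdio : stdout;

/// Fill `words` with successive outputs of `rng`. (Hypothetical helper.)
void fillWords(Rng)(ref Rng rng, uint[] words)
{
    foreach (ref w; words)
    {
        w = rng.front;
        rng.popFront();
    }
}

void main()
{
    auto rng = Xorshift32(2463534242u); // arbitrary fixed seed
    auto block = new uint[4096];
    for (;;)
    {
        fillWords(rng, block);
        stdout.rawWrite(block); // raw binary, native byte order
    }
}
```

Assuming the program is built as ./rawstream (a hypothetical name) and that the installed dieharder selects its raw-stdin input with generator number 200, an invocation along the lines of `./rawstream | dieharder -g 200 -a` should run the full battery. Check `dieharder -g -1` on the local install to confirm the generator number.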

Comment #7 by joseph.wakeling — 2013-07-08T05:58:49Z

What I'd really like is to have a source for the checking values for Xorshift used in the unittests. Masahiro, do you recall how you obtained these values? They're not in Marsaglia's paper, and Google searches to track down their source are proving fruitless. :-(

Comment #8 by repeatedly — 2013-07-08T07:28:26Z

(In reply to comment #7)
> What I'd really like is to have a source for the checking values for Xorshift
> used in the unittests. Masahiro, do you recall how you obtained these values?
> They're not in Marsaglia's paper, and Google searches to track down their
> source are proving fruitless. :-(

If I remember correctly, I generated the test cases from a C implementation based on the paper. I implemented the C version first and used it to generate the test cases for the D unittest. After that, I implemented XorshiftEngine as D-ish code.

Comment #9 by joseph.wakeling — 2013-07-08T07:51:04Z

(In reply to comment #8)
> I remember correctly, I generated test cases from paper based C implementation.
> I implemented C version first and generated test cases for D unittest.
> After that, I implemented XorshiftEngine for D-ish code.

Ahhh, OK. Then I think we can reasonably assume that these test sequences, like the code in std.random, reflect typos in Marsaglia's paper.

For Xorshift32 we can see that Marsaglia's C implementation at the very top of p.32 contains at least one typo -- the second bitshift y=(y>>17) should be y^=(y>>17). I am strongly suspicious that there is a second typo for the value of c, which should be 15 rather than 5. It's definitely not the only such typo in the paper -- Panneton and L'Ecuyer (2005) note that the (a, b, c) triple (9, 5, 1) should be (9, 5, 14). In any case, we lose nothing by using the triple (13, 17, 15) since it's acknowledged in the paper as a valid choice.

For Xorshift160 I think that the code given has a typo for the bitshift with respect to a, as in all other such code examples the bitshift is in the opposite direction to that for b and c. In general, the bitshifts seem to follow a rule of two in one direction, one in the other.

In summary, I think we can proceed as follows:

  * confirm with experts in the field the typos in the paper

  * generate new checking sequences with corrected versions of the C code

  * correct the D code and unittests accordingly

I am actually inclined to jump ahead with getting the patches to Phobos done, because I'm pretty confident my analysis here is correct :-)
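
The second step of that plan, regenerating a checking row from a corrected rule, can be sketched for the 32-bit case as follows. This is an illustration written in D, not the reference C code; the names next and checkingRow are hypothetical. The row layout (seed first, then successive outputs) mirrors the checking array in the unittest, and the triple is passed as template arguments so other candidates can be tried.

```d
import std.stdio : writeln;

/// Corrected 32-bit update: three XOR-with-shift steps (left, right, left).
uint next(uint a, uint b, uint c)(uint x)
{
    x ^= x << a;
    x ^= x >> b;
    x ^= x << c;
    return x;
}

/// The seed followed by the next count - 1 outputs, in the same layout
/// as one row of the unittest's checking array.
uint[] checkingRow(uint a, uint b, uint c)(uint seed, size_t count)
{
    auto row = new uint[count];
    uint x = seed;
    foreach (ref value; row)
    {
        value = x;
        x = next!(a, b, c)(x);
    }
    return row;
}

void main()
{
    // Triple (13, 17, 15) and the seed from the existing Xorshift32 row.
    writeln(checkingRow!(13, 17, 15)(2463534242u, 10));
}
```

Swapping in `checkingRow!(13, 17, 5)` gives the row for the paper's original triple with only the XOR typo corrected.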

Comment #10 by repeatedly — 2013-07-08T09:26:03Z

(In reply to comment #9)
> ...
> In summary, I think we can proceed as follows:
>
>   * confirm with experts in the field the typos in the paper
>
>   * generate new checking sequences with corrected versions of the C code
>
>   * correct the D code and unittests accordingly
>
> I am actually inclined to jump ahead with getting the patches to Phobos done,
> because I'm pretty confident my analysis here is correct :-)

I rechecked the paper and I agree with you (y^=, a < c and 160's >>), so I think we can fix these paper-derived bugs. Could you send the pull request? Once it has passed the auto tester, I will merge it.

On the other hand, confirming the typos would be good for future generations.

Comment #11 by joseph.wakeling — 2013-07-08T09:47:04Z

(In reply to comment #10)
> I rechecked the paper and I agree with you (y^=, a < c and 160's >>)
> So I think we can fix this paper derived bugs.
> Could you send the pull request?
> After passed auto tester, I will merge it.

Yes, I'll get that submitted either later this evening or tomorrow. :-)

> On the other hand, confirm the typos is good for future generations.

Agree. I also think it could be a good way to begin outreach to the PRNG academic community -- see:
http://d.puremagic.com/issues/show_bug.cgi?id=10572

Comment #12 by joseph.wakeling — 2013-07-08T12:49:09Z

(In reply to comment #11)
> (In reply to comment #10)
> > I rechecked the paper and I agree with you (y^=, a < c and 160's >>)
> > So I think we can fix this paper derived bugs.
> > Could you send the pull request?
> > After passed auto tester, I will merge it.
>
> Yes, I'll get that submitted either later this evening or tomorrow. :-)

First step -- corrected reference code:
https://github.com/WebDrake/xorshift

I first implemented versions that match _exactly_ Marsaglia's sample code, and that reproduce the checking values from std.random:
https://github.com/WebDrake/xorshift/blob/4305bd5a0ac4d94f59505713a085b34d2e4d482e/xorshift.c

Subsequent commits correct the typos we've observed, and so the current version can be used to generate updated checking values.

Comment #13 by joseph.wakeling — 2013-07-08T14:31:53Z

Pull request submitted: https://github.com/D-Programming-Language/phobos/pull/1403

Comment #14 by github-bugzilla — 2013-07-08T22:48:03Z

Commit pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/e1504f70b8cfdf4cd1082408fe2452a71e299ab3
Merge pull request #1403 from WebDrake/xorshift

Fix Issue 10550 - Xorshift32 and Xorshift160 do not generate uniformly-distributed random numbers

Comment #15 by joseph.wakeling — 2013-07-09T01:04:56Z

Masahiro -- thanks so much for your fast attention to this. :-)

Comment #16 by joseph.wakeling — 2013-07-09T11:16:30Z

I think we can now close the bug, but I will try and follow up with some people expert in RNG design to see if we can confirm the fixes really are correct.

Comment #17 by zbbjornson — 2018-03-27T22:46:45Z

I know this is an old bug, but for the sake of tying up loose ends:

The 13,17,5 triple is valid: On the bottom of page 2 [1], Marsaglia says "Of those 81 triples with a < c, the triple (c, b, a) also provides a full period T...". The triple 5,17,13 appears in the 32-bit table, so 13,17,5 is valid.

The triple used in the commit in this issue (13,17,15) is also valid, but was an unnecessary change. The fix to add the xor was necessary.

[1] http://www.jstatsoft.org/v08/i14/paper