According to my document on pseudorandom number generator (PRNG) recommendations, a high-quality PRNG, among other requirements—
generates bits that behave like independent uniform random bits (at least for nearly all practical purposes outside of information security)[,]
a requirement called the “independence requirement” in this short document.
Note: This document is intended to apply only to PRNGs not used for information security (including cryptography).
To determine whether a PRNG meets the independence requirement, send its output to the PractRand program by Chris Doty-Humphrey; the output should show no failures (“FAILs”) in the PractRand tests at 1 TiB (2^40 bytes) or greater. For more information, see “How to Test with PractRand” by M. E. O’Neill.
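As a rough illustration of the workflow above, the following Python sketch packs a generator's 64-bit outputs into raw bytes suitable for piping into PractRand. SplitMix64 is used only as a stand-in for the PRNG under test, and the names `splitmix64`, `raw_bytes`, and `feed_practrand` are hypothetical helpers, not part of PractRand. The command line in the comments assumes PractRand's `RNG_test` tool has been built and reads 64-bit words from standard input through its `stdin64` mode, as described in O’Neill's guide.

```python
import sys
from itertools import islice

MASK64 = (1 << 64) - 1

def splitmix64(seed):
    """Stand-in PRNG (SplitMix64); substitute the generator under test."""
    state = seed & MASK64
    while True:
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        yield z ^ (z >> 31)

def raw_bytes(outputs, n_words):
    """Pack the next n_words 64-bit outputs as little-endian bytes."""
    return b"".join(x.to_bytes(8, "little") for x in islice(outputs, n_words))

def feed_practrand(outputs, out=None, chunk_words=8192):
    """Write raw output until the reader (such as PractRand) closes the pipe."""
    out = out or sys.stdout.buffer
    try:
        while True:
            chunk = raw_bytes(outputs, chunk_words)
            if not chunk:  # the output iterator ran dry
                break
            out.write(chunk)
    except BrokenPipeError:
        pass

# Hypothetical usage, from a script named prng_out.py that ends with
#   feed_practrand(splitmix64(12345))
# and is run as (assumed command line):
#   python prng_out.py | ./RNG_test stdin64 -tlmax 1TB
```

Python is likely too slow to reach 1 TiB in reasonable time, so in practice the same loop would probably be written in a compiled language; the sketch is meant only to show the shape of the pipeline.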
Random number streams. Many PRNGs use different strategies to produce nearby sequences (or streams) of pseudorandom numbers. But not every strategy produces independent streams. To determine whether nearby sequences of the PRNG meet the independence requirement, the output sent to PractRand should be formed by interleaving the outputs of those sequences (for example, one output from the first sequence, one output from the second, another output from the first, another from the second, and so on).
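The interleaving described above can be sketched as follows. This is a minimal illustration in which Python's built-in generator stands in for the PRNG under test, and two streams are created with seeds 1 and 2 purely as an example of "nearby" sequences; `stream` and `interleave` are hypothetical helper names.

```python
import random
from itertools import islice

def stream(seed):
    """Stand-in PRNG stream: 64-bit outputs from Python's built-in generator."""
    rng = random.Random(seed)
    while True:
        yield rng.getrandbits(64)

def interleave(*streams):
    """Yield one output from each stream in turn (round-robin)."""
    for group in zip(*streams):
        yield from group

# One output from the first stream, one from the second, and so on.
a = list(islice(stream(1), 3))
b = list(islice(stream(2), 3))
mixed = list(islice(interleave(stream(1), stream(2)), 6))
```

The interleaved values in `mixed`, written out as raw bytes, would form the input to PractRand.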
There are several kinds of nearby sequences to test for this purpose:
The leapfrogging technique (Bauke and Mertens 2007)^{1} assigns each of N processes its own PRNG, where each PRNG starts one step ahead of the previous one, and has every such PRNG advance N steps each time it generates a pseudorandom number. However, note that testing nearby sequences produced by leapfrogging is redundant with testing the regular PRNG sequence without streams, because interleaving the N leapfrogged sequences reproduces that regular sequence.
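A small sketch can make the redundancy concrete. It assumes a 64-bit linear congruential generator (with Knuth's MMIX constants) as a stand-in base PRNG; `base_sequence` and `leapfrog_stream` are hypothetical names. Stream `index` of `n` starts `index` steps into the base sequence and then takes every `n`-th output.

```python
from itertools import islice

def base_sequence(seed):
    """Stand-in PRNG: a 64-bit LCG (Knuth's MMIX constants)."""
    state = seed % 2**64
    while True:
        state = (state * 6364136223846793005 + 1442695040888963407) % 2**64
        yield state

def leapfrog_stream(seed, index, n):
    """Stream number `index` out of `n` leapfrogged streams."""
    # Skip `index` outputs, then keep every n-th output after that.
    return islice(base_sequence(seed), index, None, n)

def interleave(*streams):
    """Yield one output from each stream in turn (round-robin)."""
    for group in zip(*streams):
        yield from group

n = 3
streams = [leapfrog_stream(42, i, n) for i in range(n)]
interleaved = list(islice(interleave(*streams), 12))
original = list(islice(base_sequence(42), 12))
```

Here `interleaved` and `original` hold the same twelve values, which is why feeding interleaved leapfrog streams to PractRand tests nothing beyond the base sequence.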
Hash functions and counter-based PRNGs. In general, a counter-based PRNG produces pseudorandom numbers by transforming a seed and a counter; with each number, it increments the counter and leaves the seed unchanged (Salmon et al. 2011)^{2}. The seed and counter can be transformed using block ciphers, other permutation functions, or hash functions. In general, counter-based PRNGs that use hash functions (such as MD5, SHA-1, MurmurHash, CityHash, xxHash) will meet the independence requirement if the following hash stream (for that hash function) doesn’t fail the PractRand tests at 1 TiB or greater:
1. Write out the hash code of seed || 0x5F || counter (the || symbol means concatenation).
2. Write out the hash code of (seed+1) || 0x5F || counter.
3. Add 1 to counter and go to the first step.
In general, a hash function without PractRand failures is worthy of mention if it’s noncryptographic and faster than hash functions designed for cryptography, such as MD5 and the SHA family.
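The hash stream described above might be generated along these lines. The sketch makes assumptions the text does not specify: the seed and counter are each encoded as 8 little-endian bytes, both wrap at 64 bits, and MD5 (via Python's `hashlib`) is used only as an example hash function. `hash_stream` and `md5_digest` are hypothetical names.

```python
import hashlib
from itertools import islice

MASK64 = (1 << 64) - 1

def md5_digest(message):
    """Example hash function; any bytes-to-bytes hash can be substituted."""
    return hashlib.md5(message).digest()

def hash_stream(hash_func, seed, counter=0):
    """Yield hash(seed || 0x5F || counter), then hash((seed+1) || 0x5F || counter),
    then add 1 to the counter and repeat."""
    while True:
        for s in (seed, seed + 1):
            # Assumed encoding: 8-byte little-endian seed and counter.
            message = ((s & MASK64).to_bytes(8, "little") + b"\x5f"
                       + (counter & MASK64).to_bytes(8, "little"))
            yield hash_func(message)
        counter += 1

first_three = list(islice(hash_stream(md5_digest, 7), 3))
```

Concatenating the yielded hash codes as raw bytes gives the stream to send to PractRand.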
Combined PRNGs. As G. Marsaglia (in KISS), D. Jones (in JKISS), and A. Fog (2015)^{3} have recognized, combining two or more PRNGs of weaker quality often leads to a higher-quality PRNG. It might be possible to convert a PRNG that isn’t high-quality to a high-quality PRNG in one of the following ways:
P(x) to transform that state, have the PRNG generate each number as follows instead: P(state) or S(P(state)), where S(x) is one of the four scramblers defined in (Blackman and Vigna 2019)^{6} (+, ++, *, **).
Other combinations and transformations. There are other ways to combine two PRNGs, or to transform a single PRNG, but they are not preferred ways to build a high-quality PRNG. They include:
shuffle_block_engine).
Splittable PRNGs. A splittable PRNG consists of two operations: a split operation to create multiple new internal states from one, and a generate operation to produce a pseudorandom number from a state (Schaathun 2015; Claessen et al. 2013)^{9}. The Schaathun paper surveys several known constructions of splittable PRNGs. Some of the constructions can be used by any PRNG, but do not necessarily lead to high-quality splittable PRNGs.
The Schaathun paper suggests the following four pseudorandom number sequences for testing purposes:
- Repeat the following steps:
  1. Set seed and g to split(seed)[0] and split(seed)[1], respectively.
  2. Set t to split(split(g)[0]), write out generate(t[0]), and write out generate(t[1]).
  3. Set t to split(split(g)[1]), write out generate(t[0]), and write out generate(t[1]).
- Set seed and g to split(seed)[1] and split(seed)[0], respectively, and write out generate(g). Go to the first step.
- Set seed and g to split(seed)[0] and split(seed)[1], respectively, and write out generate(g). Go to the first step.
- Repeat the following steps:
  1. Set seed and g to split(seed)[1] and split(seed)[0], respectively, and write out generate(g).
  2. Set seed and g to split(seed)[0] and split(seed)[1], respectively, and write out generate(g).
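The four sequences can be sketched as Python generators. The `split` and `generate` functions below are a toy hash-based stand-in built on SHA-256, chosen only so the example runs; they are not a construction taken from the cited papers. The sketch also assumes that every sequence repeats its steps indefinitely and that the four groups of steps are divided as listed above.

```python
import hashlib
from itertools import islice

def split(state):
    """Toy split: derive two new states from one (illustrative only)."""
    return (hashlib.sha256(state + b"\x00").digest(),
            hashlib.sha256(state + b"\x01").digest())

def generate(state):
    """Toy generate: derive a 64-bit number from a state (illustrative only)."""
    return int.from_bytes(hashlib.sha256(state + b"\x02").digest()[:8], "little")

def sequence_1(seed):
    """Four outputs from the subtree under g, for each new g."""
    while True:
        seed, g = split(seed)
        for half in split(g):  # split(g)[0], then split(g)[1]
            t = split(half)
            yield generate(t[0])
            yield generate(t[1])

def sequence_2(seed):
    """seed follows split(seed)[1]; g is split(seed)[0]."""
    while True:
        g, seed = split(seed)
        yield generate(g)

def sequence_3(seed):
    """seed follows split(seed)[0]; g is split(seed)[1]."""
    while True:
        seed, g = split(seed)
        yield generate(g)

def sequence_4(seed):
    """Alternates the two assignments used by sequence_2 and sequence_3."""
    while True:
        g, seed = split(seed)
        yield generate(g)
        seed, g = split(seed)
        yield generate(g)

sample = list(islice(sequence_4(b"example seed"), 4))
```

Each generator's outputs, written as raw 64-bit words, would then be sent to PractRand separately.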
1. Bauke and Mertens, “Random numbers for large-scale distributed Monte Carlo simulations”, 2007.
2. Salmon, John K., Mark A. Moraes, Ron O. Dror, and David E. Shaw. “Parallel random numbers: as easy as 1, 2, 3.” In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-12. 2011.
3. Agner Fog, “Pseudo-Random Number Generators for Vector Processors and Multicore Processors”, Journal of Modern Applied Statistical Methods 14(1), article 23 (2015).
4. A permutation function (or bijection) is a reversible mapping from N-bit integers to N-bit integers. Examples include: JSF64 by B. Jenkins; MIX and MIX-i (part of Tyche and Tyche-i); the Romu family by Mark Overton; block ciphers with a fixed key; mixing functions with reversible operations as described in “Hash functions” by B. Mulvey.
5. As P. Evensen shows, the choice of constant can matter for a given permutation function.
6. Blackman, D., Vigna, S., “Scrambled Linear Pseudorandom Number Generators”, 2019.
7. J. D. Cook, “Using one RNG to sample another”, June 4, 2019.
8. von Neumann, J., “Various techniques used in connection with random digits”, 1951.
9. Schaathun, H.G. “Evaluation of Splittable Pseudo-Random Generators”, 2015; Claessen, K., et al. “Splittable Pseudorandom Number Generators using Cryptographic Hashing”, Proceedings of Haskell Symposium 2013, pp. 47-58.