
Random Number Generator Recommendations for Applications

Peter Occil

Begun on Mar. 5, 2016; last updated on July 23, 2017.

Most apps that use random numbers care about either unpredictability or speed/high quality.

Introduction and Summary

As I see it, there are two kinds of random number generators (RNGs) needed by most applications, namely unpredictable-random generators and statistical-random generators.

This page will discuss these two kinds of RNG, and make recommendations on their use and properties.

In addition, other applications require numbers that "seem" random but are based on an initial state, or "seed". This page will discuss when applications should specify their own seeds.

Then, this page will explain what programming language APIs implement statistical-random and unpredictable-random generators and give advice on implementing them in programming languages.

Finally, this page will discuss issues on shuffling with an RNG.

Summary

The following table summarizes the kinds of RNGs covered in this document.

| Kind of RNG | When to Use This RNG | Examples |
| Unpredictable-Random | In computer/information security cases, or when speed is not a concern. | /dev/urandom, CryptGenRandom |
| Statistical-Random | When computer/information security is not a concern, but speed is. See also "Shuffling". | xoroshiro128+, xorshift128+ |
| Seeded PRNG | When generating reproducible results in a way not practical otherwise. | Statistical-random quality PRNG with custom seed |

Definitions

The following definitions are helpful in better understanding this document.

Unpredictable-Random Generators

Unpredictable-random implementations (also known as "cryptographically strong" or "cryptographically secure" RNGs) seek to generate random numbers that are cost-prohibitive to predict. Such implementations are indispensable in computer security and information security contexts, such as—

They are also useful in cases where the application generates random numbers so infrequently that the RNG's speed is not a concern.
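As a minimal sketch of drawing unpredictable-random values, assuming Python 3.6 or later, the standard `secrets` module can be used as follows. The variable names and the particular sizes are illustrative assumptions only, not recommendations from this page.

```python
import secrets

# 128 bits rendered as 32 hexadecimal characters, e.g. for an identifier.
session_id = secrets.token_hex(16)

# 256 bits of raw bytes, e.g. as input to a key-derivation step.
key_material = secrets.token_bytes(32)

# A uniform integer in the range [0, 6).
roll = secrets.randbelow(6)
```

Because calls like these are typically made rarely, the extra cost compared with a fast PRNG is usually not noticeable.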

An unpredictable-random implementation ultimately relies on one or more nondeterministic sources (sources that don't always return the same output for the same input) for random number generation. Sources that are reasonably fast for most applications (for instance, by producing very many random bits per second), especially sources implemented in hardware, are highly advantageous here, since an implementation for which such sources are available can rely less on PRNGs, which are deterministic and benefit from reseeding as explained later.

Quality

An unpredictable-random implementation generates uniformly distributed random bits such that it would be cost-prohibitive for an outside party to guess either prior or future unseen bits of the random sequence correctly with more than a 50% chance per bit, even with knowledge of the randomness-generating procedure, the implementation's internal state at the given point in time, and/or extremely many outputs of the RNG. (If the sequence was generated directly by a PRNG, ensuring future bits are unguessable this way should be done wherever the implementation finds it feasible; see "Seeding and Reseeding".)

Seeding and Reseeding

If an unpredictable-random implementation uses a PRNG, the following requirements apply.

The PRNG's state length must be at least 128 bits and should be at least 256 bits.

Before an instance of the RNG generates a random number, it must have been initialized ("seeded") with an unpredictable seed, defined as follows. The seed—

The RNG should be reseeded from time to time (using a newly generated unpredictable seed) to help ensure the unguessability of the output. If the implementation reseeds, it must do so before it generates more than 2^67 bits without reseeding and should do so before it generates more than 2^32 bits without reseeding.
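One way to keep track of the reseeding requirement is sketched below, taking the two thresholds to be 2^32 bits ("should") and 2^67 bits ("must"). The `ReseedTracker` class and its method names are hypothetical; the sketch only does the bookkeeping and leaves the actual seeding to the implementation.

```python
RESEED_SHOULD_BITS = 2 ** 32  # reseeding "should" happen before this many bits
RESEED_MUST_BITS = 2 ** 67    # reseeding "must" happen before this many bits


class ReseedTracker:
    """Hypothetical helper that counts bits produced since the last reseed."""

    def __init__(self, limit_bits=RESEED_SHOULD_BITS):
        if limit_bits > RESEED_MUST_BITS:
            raise ValueError("limit exceeds the hard ceiling of 2**67 bits")
        self.limit_bits = limit_bits
        self.bits_since_reseed = 0

    def needs_reseed(self, bits_requested):
        """Return True if producing this many more bits would pass the limit."""
        return self.bits_since_reseed + bits_requested > self.limit_bits

    def record_output(self, bits_produced):
        """Call after the RNG has produced the given number of bits."""
        self.bits_since_reseed += bits_produced

    def record_reseed(self):
        """Call after the RNG has been given a fresh unpredictable seed."""
        self.bits_since_reseed = 0
```

A generator would call `needs_reseed` before each request and obtain a new unpredictable seed whenever it returns true.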

Examples

Examples of unpredictable-random implementations include the following:

Statistical-Random Generators

Statistical-random generators are used, for example, in simulations, numerical integration, and many games to bring an element of chance and variation to the application, with the goal that each possible outcome is equally likely. However, statistical-random generators are generally suitable only if—

If more than 20 items are being shuffled, a concerned application would be well advised to use alternatives to this kind of implementation (see "Shuffling").

A statistical-random implementation is usually implemented with a PRNG, but can also be implemented in a way similar to an unpredictable-random implementation, provided it remains reasonably fast.

Quality

A statistical-random implementation generates random bits, each of which is uniformly randomly distributed independently of the other bits, at least for nearly all practical purposes. The implementation must be highly likely to pass all the tests used in TestU01's Crush, SmallCrush, and BigCrush test batteries [L'Ecuyer and Simard 2007], and ought to be highly likely to pass other known statistical randomness tests. The RNG need not be equidistributed. (Mentioning specific test batteries here is in the interest of precision and makes it clearer whether a particular RNG meets these quality requirements.)

Seeding and Reseeding

If a statistical-random implementation uses a PRNG, the following requirements apply.

The PRNG's state length must be at least 64 bits, should be at least 128 bits, and is encouraged to be as high as the implementation can go to remain reasonably fast for most applications.

Before an instance of the RNG generates a random number, it must have been initialized ("seeded") with a seed described as follows. The seed—

The implementation is encouraged to reseed itself from time to time (using a newly generated seed as described earlier), especially if the PRNG has a state length less than 238 bits. If the implementation reseeds, it should do so before it generates more values than the square root of the PRNG's period without reseeding.
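The square-root guideline can be worked out with exact integer arithmetic. The sketch below assumes the PRNG's period is known; the helper name is hypothetical, and the period 2^128 - 1 used in the example is the commonly cited period of xorshift128+.

```python
import math


def max_outputs_before_reseed(period):
    """Largest whole number of outputs not exceeding sqrt(period)."""
    return math.isqrt(period)


# Commonly cited period of xorshift128+ (all states except all-zeros).
XORSHIFT128PLUS_PERIOD = 2 ** 128 - 1
budget = max_outputs_before_reseed(XORSHIFT128PLUS_PERIOD)
```

For that period the budget comes to 2^64 - 1 outputs, which few applications will ever reach.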

Examples and Non-Examples

Examples of statistical-random generators include the following:

Non-examples include the following:

Seeded Random Generators

In addition, some applications use pseudorandom number generators (PRNGs) to generate results based on apparently-random principles, starting from a known initial state, or "seed". Such applications usually care about reproducible results. (Note that in the definitions for unpredictable-random and statistical-random generators given earlier, the PRNGs involved are automatically seeded before use.)
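To illustrate reproducibility from a known seed, here is a small sketch of a seeded PRNG written directly in Python so that its behavior does not depend on any library's defaults. It follows xorshift128+ as I understand the published reference code (shift constants 23, 18, and 5); treat the constants and the class name `XorShift128Plus` as assumptions to check against the reference before relying on them.

```python
MASK64 = (1 << 64) - 1  # keep arithmetic within 64 bits


class XorShift128Plus:
    """Sketch of xorshift128+ with an application-specified 128-bit seed."""

    def __init__(self, seed0, seed1):
        s0, s1 = seed0 & MASK64, seed1 & MASK64
        if s0 == 0 and s1 == 0:
            raise ValueError("seed must not be all zeros")
        self.s0, self.s1 = s0, s1

    def next_u64(self):
        """Return the next 64-bit output and advance the state."""
        s1, s0 = self.s0, self.s1
        result = (s0 + s1) & MASK64
        self.s0 = s0
        s1 ^= (s1 << 23) & MASK64
        self.s1 = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)
        return result


# Two generators given the same seed produce the same sequence.
a = XorShift128Plus(1, 2)
b = XorShift128Plus(1, 2)
same = [a.next_u64() for _ in range(5)] == [b.next_u64() for _ in range(5)]
```

Because every step is fixed integer arithmetic, the same seed yields the same sequence on any platform.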

Seeding Recommendations

An application should use a PRNG with a seed it specifies (rather than an automatically-initialized PRNG or another kind of RNG) only if—

  1. the initial state (the seed) which the "random" result will be generated from—
    • is hard-coded,
    • was entered by the user,
    • is known to the application and was generated using a statistical-random or unpredictable-random implementation (as defined earlier),
    • is a verifiable random number (as defined later), or
    • is based on a timestamp (but only if the reproducible result is not intended to vary during the time specified on the timestamp and within the timestamp's granularity; for example, a year/month/day timestamp for a result that varies only daily),
  2. the application might need to generate the same "random" result multiple times,
  3. the application either—
    • makes the seed (or a "code" or "password" based on the seed) accessible to the user, or
    • finds it impractical to store or distribute the "random" results or the random numbers (rather than the seed) for later use, such as—
      • by saving the result to a file,
      • by storing the random numbers for the feature generating the result to "replay" later, or
      • by distributing the results or the random numbers to networked users as they are generated,
  4. the random number generation method will remain stable for as long as the relevant feature is still in use by the application, and
  5. any feature using that random number generation method to generate that "random" result will remain backward compatible with respect to the "random" results it generates, for as long as that feature is still in use by the application.
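The timestamp case in item 1 can be sketched as follows for a result that varies only daily. The function name `daily_seed`, the `label` parameter, and the choice of SHA-256 to spread the date over 128 bits are all assumptions made for illustration.

```python
import datetime
import hashlib


def daily_seed(day, label="daily-puzzle"):
    """Derive a 128-bit seed from a year/month/day timestamp.

    The label (hypothetical) keeps different features from sharing a seed.
    """
    text = f"{label}:{day.isoformat()}"
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


seed_today = daily_seed(datetime.date(2017, 7, 23))
```

Every call made with the same date returns the same seed, so the result stays fixed within the timestamp's granularity.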

As used here, a random number generation method is stable if it uses a deterministic algorithm, outputs the same random sequence given the same seed, and has no random-number generation behavior that is unspecified, that is implementation-dependent, or that may change in the future. For example—

Seedable PRNG Recommendations

Which PRNG to use for generating reproducible results depends on the application. But here are some recommendations:

Examples

Custom seeds can come into play in the following situations, among others.

Games

Many kinds of games generate game content using apparently-random principles, such as—

where the game might need to generate the same content of that kind multiple times.

In general, such a game should use a PRNG with a custom seed for such purposes only if—

  1. generating the random content uses relatively many random numbers (say, more than a few thousand), and the application finds it impractical to store or distribute the content or the numbers for later use (see recommendations 2 and 3), or
  2. the game makes the seed (or a "code" or "password" based on the seed, such as a barcode or a string of letters and digits) accessible to the player, to allow the player to generate the level or state repeatedly (see recommendations 2 and 3).

Option 1 often applies to games that generate procedural terrain for game levels, since the terrain often exhibits random variations over an extended space. Option 1 is less suitable for puzzle game boards or card shuffling, since much less data needs to be stored.

Unit Testing

A custom seed is appropriate when unit testing a method that uses a seeded PRNG in place of another kind of RNG for the purpose of the test (provided the method meets recommendation 5).
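A minimal sketch of this idea follows. The function `pick_winners` is hypothetical; it accepts the RNG as a parameter so that a test can pass in a seeded PRNG while production code passes another kind of RNG. Python's `random.Random` is used here only as the test double.

```python
import random
import unittest


def pick_winners(entrants, count, rng):
    """Hypothetical method under test: choose `count` distinct entrants."""
    return rng.sample(entrants, count)


class PickWinnersTest(unittest.TestCase):
    def test_same_seed_gives_same_winners(self):
        entrants = list(range(100))
        first = pick_winners(entrants, 3, random.Random(12345))
        second = pick_winners(entrants, 3, random.Random(12345))
        self.assertEqual(first, second)
        self.assertEqual(len(set(first)), 3)
```

In production the same function could be called with `random.SystemRandom()` or another generator that offers a `sample` method.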

Verifiable Random Numbers

Verifiable random numbers are random numbers (such as seeds for PRNGs) that are disclosed along with all the information required to verify their generation. Usually, of the information used to derive such numbers, at least some of it is not known by anyone until some time after the announcement is made that those numbers will be generated, but all of it will eventually be publicly available. In some cases, some of the information required to verify the numbers' generation is disclosed in the announcement that those numbers will be generated.

One process to generate verifiable random numbers is described in RFC 3797 (to the extent its advice is not specific to the Internet Engineering Task Force or its Nominations Committee). Although the source code given in that RFC uses the MD5 algorithm, the process does not preclude the use of hash algorithms stronger than MD5 (see the last paragraph of section 3.3 of that RFC).
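The general idea can be sketched as follows. This is a simplified illustration and not the exact procedure of RFC 3797: the function name, the separator, and the use of SHA-256 are assumptions. The inputs would be values announced in advance as sources and published later, so that anyone can recompute the result.

```python
import hashlib


def verifiable_seed(public_values):
    """Hash a list of publicly verifiable values into a hex seed.

    Simplified sketch; not conformant to RFC 3797.
    """
    canonical = "/".join(str(v).strip() for v in public_values) + "/"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder inputs standing in for values published after the announcement.
seed_hex = verifiable_seed(["example-source-a:12345", "example-source-b:67890"])
```

Anyone holding the same published inputs and the same canonical form obtains the same seed, which is what makes the number verifiable.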

Noise

Randomly generated numbers can serve as noise, that is, a randomized variation in images and sound.

If the noise implementation implements colored noise, such as white noise or pink noise(2), then the same RNG recommendations apply to the implementation as they do to most other cases.

If the noise implementation implements cellular noise or gradient noise (such as Perlin noise), then different considerations apply depending on the implementation:

The fractional Brownian motion technique combines several layers of cellular or gradient noise by calling the underlying noise function several times. The same considerations apply to fractional Brownian motion as they do to the underlying noise implementation.

Programming Language APIs

The following table lists techniques, methods, and functions that implement unpredictable-random and statistical-random RNGs for popular programming languages. Note the following:

| Language | Unpredictable-random | Statistical-random | Other |
| C/C++ (G) | (C) | xoroshiro128plus.c (128-bit nonzero seed); xorshift128plus.c (128-bit nonzero seed) | |
| Python | secrets.SystemRandom (since Python 3.6); os.urandom() | ihaque/xorshift library (128-bit nonzero seed; default seed uses os.urandom()) | random.getrandbits() (A); random.seed() (19,936-bit seed) (A) |
| Java (D) | (C); java.security.SecureRandom (F) | grunka/xorshift (XORShift1024Star or XORShift128Plus) | |
| JavaScript | crypto.randomBytes(byteCount) (node.js only) | xorshift library | Math.random() (floating-point) (B) |
| Ruby | (C); SecureRandom class (require 'securerandom') | | Random#rand() (floating-point) (A) (E); Random#rand(N) (integer) (A) (E); Random.new(seed) (default seed uses entropy) |

(A) Default general RNG implements the Mersenne Twister, which doesn't meet the statistical-random requirements, strictly speaking, but may be adequate for many applications due to its extremely long period.

(B) JavaScript's Math.random is implemented using xorshift128+ in the latest V8 engine, Firefox, and certain other modern browsers at the time of writing; the exact algorithm to be used by JavaScript's Math.random is "implementation-dependent", though, according to the ECMAScript specification.

(C) See "Advice for New Programming Language APIs" for implementation notes for unpredictable-random implementations.

(D) Java's java.util.Random class uses a 48-bit seed, so doesn't meet the statistical-random requirements. However, a subclass of java.util.Random might be implemented to meet those requirements.

(E) In my opinion, Ruby's Random#rand method presents a beautiful and simple API for random number generation.

(F) At least in Unix-based systems, calling the SecureRandom constructor that takes a byte array is recommended. The byte array should be data described in note (C).

(G) std::random_device, introduced in C++11, is not recommended because its specification leaves much to be desired. For example, std::random_device can fall back to a pseudorandom number generator of unspecified quality without much warning.

Advice for New Programming Language APIs

Wherever possible, existing libraries or techniques that already meet the requirements for unpredictable-random and statistical-random RNGs should be used. For example—

If existing solutions are inadequate, a programming language API could implement unpredictable-random and statistical-random RNGs by filling an output byte buffer with random bytes, where each bit in each byte will be randomly set to 0 or 1. For instance, a C language API for unpredictable-random generators could look like the following: int random(uint8_t bytes[], size_t size);, where "bytes" is a pointer to a byte array, "size" is the number of random bytes to generate, and 0 is returned if the method succeeds and nonzero otherwise. Any programming language API that implements such RNGs by filling a byte buffer must run in amortized linear time in the number of random bytes the API will generate.
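For comparison, a buffer-filling interface of this shape might look as follows in Python. This is only a sketch: the function name `fill_random` is hypothetical, and it leans on `os.urandom` as the underlying source, mirroring the return convention described above (0 on success, nonzero otherwise).

```python
import os


def fill_random(buffer):
    """Fill a bytearray in place with random bytes; return 0 on success."""
    try:
        # One call sized to the buffer keeps the work linear in its length.
        buffer[:] = os.urandom(len(buffer))
    except (NotImplementedError, OSError):
        return 1
    return 0


buf = bytearray(32)
status = fill_random(buf)
```

Returning a status code rather than raising is a choice made here only to match the C-style signature; idiomatic Python would more likely let the exception propagate.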

Unpredictable-random and statistical-random implementations—

In my opinion, a new programming language's standard library should include—

and should include those two methods separately for unpredictable-random generators and for statistical RNGs. However, a detailed discussion of how to implement those two methods or other methods to generate random numbers or integers that follow a given distribution (such as a normal, geometric, binomial, or discrete weighted distribution) or fall within a given range is outside the scope of this page; I have written about this in another document.

Shuffling

There are special considerations in play when applications use RNGs to shuffle a list of items.

Shuffling Method

The first consideration touches on the shuffling method. The Fisher–Yates shuffle method shuffles a list such that all permutations of that list are equally likely to occur, assuming the RNG it uses produces uniformly random numbers and can choose from among all permutations of that list. However, that method is also easy to mess up (see also Jeff Atwood, "The danger of naïveté"); I give a correct implementation in another document.
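A minimal sketch of the Fisher–Yates method is given below. The function name and the `randbelow` parameter are assumptions for illustration; `randbelow(n)` is expected to return a uniform integer in [0, n), and the default draws from Python's `secrets` module.

```python
import secrets


def fisher_yates_shuffle(items, randbelow=secrets.randbelow):
    """Shuffle a list in place and return it."""
    for i in range(len(items) - 1, 0, -1):
        # j ranges over 0..i inclusive; excluding i is a common mistake
        # that makes some permutations impossible.
        j = randbelow(i + 1)
        items[i], items[j] = items[j], items[i]
    return items


deck = fisher_yates_shuffle(list(range(52)))
```

The two usual mistakes are choosing `j` from the whole list on every step and choosing it from a range that excludes `i`; both skew the distribution of permutations.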

Choosing from Among All Permutations

The second consideration is present if the application uses PRNGs for shuffling. If the PRNG's period is less than the number of distinct permutations (arrangements) of a list, then there are some permutations that PRNG can't choose when it shuffles that list. (This is not the same as generating all permutations of a list, which, for a sufficiently large list size, can't be done by any computer in a reasonable time.)

The number of distinct permutations is the multinomial coefficient m! / (w_1! × w_2! × ... × w_n!), where m is the list's size, n is the number of different items in the list, x! means "x factorial", and w_i is the number of times the item identified by i appears in the list. Special cases of this are—

In general, a PRNG with state length k bits, as shown in the table below, can't choose from among all the distinct permutations of a list with more items than the given maximum list size n (k is the base-2 logarithm of n!, rounded up to an integer). (Note that a PRNG with state length k bits can't have a period greater than 2^k, so can't choose from among more than 2^k permutations.)

| State length (k) | Maximum list size (n) |
| 64 | 20 |
| 128 | 34 |
| 226 | 52 |
| 256 | 57 |
| 512 | 98 |
| 525 | 100 |
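The figures in this table can be rechecked with exact integer arithmetic. The sketch below, with a hypothetical function name, finds the largest list size n for which n! does not exceed 2^k.

```python
def max_list_size(state_bits):
    """Largest n such that n! <= 2**state_bits."""
    limit = 1 << state_bits
    n, permutations = 1, 1  # invariant: permutations == n!
    while permutations * (n + 1) <= limit:
        n += 1
        permutations *= n
    return n


sizes = {k: max_list_size(k) for k in (64, 128, 226, 256, 512, 525)}
```

Python integers are arbitrary precision, so no rounding error enters the comparison even for factorials with hundreds of bits.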

A PRNG with state length less than the number of bits given below (k) can't choose from among all the distinct permutations of a list formed from m identical lists each with n different items, as shown in this table (k is the base-2 logarithm of (nm)! / (m!)^n, rounded up to an integer).

| Number of lists (m) | Items per list (n) | Minimum state length (k) |
| 1 | 20 | 62 |
| 2 | 20 | 140 |
| 4 | 20 | 304 |
| 1 | 52 | 226 |
| 2 | 52 | 500 |
| 1 | 60 | 273 |
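These minimum state lengths follow from the multinomial coefficient and can be recomputed exactly. In the sketch below the function names are hypothetical; `distinct_permutations` takes the number of times each different item appears, and `min_state_bits` applies it to m identical lists of n different items.

```python
import math


def distinct_permutations(counts):
    """Multinomial coefficient: (sum of counts)! / product of count!."""
    total = math.factorial(sum(counts))
    for c in counts:
        total //= math.factorial(c)  # each division is exact
    return total


def min_state_bits(num_lists, items_per_list):
    """Base-2 logarithm of the permutation count, rounded up."""
    count = distinct_permutations([num_lists] * items_per_list)
    return (count - 1).bit_length()


two_decks = min_state_bits(2, 52)
```

For example, two combined 52-card decks need a state length of at least 500 bits, more than double the 226 bits needed for a single deck.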

Whenever a statistical-random implementation or seeded RNG is otherwise called for, if an application is expected—

The PRNG in question should—

Motivation

In this document, I made the distinction between statistical-random and unpredictable-random generators because that is how programming languages often present random number generators — they usually offer a general-purpose RNG (such as C's rand or Java's java.util.Random) and sometimes an RNG intended for security purposes (such as java.security.SecureRandom).

What has motivated me to write a more rigorous definition of random number generators is the fact that many applications still use weak RNGs. In my opinion, this is largely because most popular programming languages today—

Conclusion

In conclusion, most applications that require random numbers usually want either unpredictability (cryptographic security), or speed and high quality. I believe that RNGs that meet the descriptions specified in the Unpredictable-Random Generators and Statistical-Random Generators sections will meet the needs of those applications.

In addition, this document recommends using unpredictable-random implementations in many cases, especially in computer and information security contexts, and recommends easier programming interfaces for both unpredictable-random and statistical-random implementations in new programming languages.

I acknowledge—

Request for Comments

Feel free to send comments. They may help improve this page.

Comments on any aspect of the document are welcome, but answers to the following would be particularly appreciated.

Notes

(1) This statement appears because multiple instances of a PRNG automatically seeded with a timestamp, when they are created at about the same time, run the risk of starting with the same seed and therefore generating the same sequence of random numbers.

(2) This is because usual implementations of colored noise don't sample each point of the sample space more than once; rather, all the samples are generated, then, for some kinds of colored noise, a filter is applied to the samples.

License

This page is licensed under a public domain dedication.