Menu - Top - Home - Donate to Me

Random Number Generation Methods

Peter Occil

Begun on June 4, 2017; last updated on July 23, 2017.

Discusses many ways in which applications can extract random numbers from RNGs and includes pseudocode for most of them.

Introduction

This page discusses many ways applications can extract random numbers from random number generators (RNGs) and includes pseudocode for most of them.

As used in this document, a random number generator—

The methods presented on this page apply to all those kinds of RNGs unless otherwise noted. Moreover, recommendations on which RNGs are suitable for which applications are generally outside the scope of this page; I have written about this in another document.

This methods described in this document can be categorized as follows:

Sample Python code that implements many of the methods in this document is available.

Contents

Notes and Definitions

In this document:

Uniform Random Numbers

This section describes how RNGs can be used to generate uniformly-distributed random numbers. Here is an overview of the methods described in this section.

One method, RNDINT, described next, can serve as the basis for the remaining methods.

RNDINT: Core Random Integer Method

The core method for generating random numbers using an RNG is called RNDINT(maxInclusive) in this document. It generates a random integer 0 or greater maxInclusive or less, where maxInclusive is an integer 0 or greater and the number generated is approximately uniformly distributed. This core method can serve as the basis for all other methods described in later sections that extract random numbers from RNGs.

The implementation of RNDINT(maxInclusive) depends heavily on what kind of values the underlying RNG returns. This section explains how RNDINT(maxInclusive) can be implemented for four kinds of RNGs.

In this section:

If the RNG outputs integers 0 or greater and less than a positive integer (for example, less than 1,000,000 or less than 6), then RNDINT(maxInclusive) can be implemented as follows. In the pseudocode below, MODULUS is the RNG's modulus. Note that all the variables in this method are unsigned integers. (For an exercise solved by this method, see A. Koenig and B. E. Moo, Accelerated C++, 2000; see also a blog post by Johnny Chan.)

METHOD RNDINT(maxInclusive)
  // maxInclusive must be 0 or greater
  if maxInclusive < 0: return error
  if maxInclusive == 0: return 0
  // N equals modulus
  if maxInclusive == MODULUS - 1: return RNG()
  if maxInclusive >= MODULUS:
    cx = floor(maxInclusive / MODULUS) + 1
    while true
       ret = cx * RNG
       // NOTE: If this method is implemented using a fixed-
       // precision type, the addition operation below should
       // check for overflow and should reject the number
       // if overflow would result.
       ret = ret + RNDINT(cx - 1)
       if ret <= maxInclusive: return ret
    end
  else
    // NOTE: If the programming language implements
    // division with two integers by truncating to an
    // integer, the division can be used as is without
    // calling a "floor" function.
          nPlusOne = maxInclusive + 1
          maxexc = floor((MODULUS - 1) / nPlusOne) * nPlusOne
          while true
                    ret = RNG()
                    if ret < nPlusOne: return ret
                    if ret < maxexc: return mod(ret, nPlusOne)
          end
  end
END METHOD

If the RNG outputs integers 0 or greater and less than a positive integer that's a power of two, such as random bits, random bytes, or random values of a given number of bits, then RNDINT(maxInclusive) can be implemented as follows. In the pseudocode below, MODULUS is the RNG's modulus, and MODBITS is the number of bits, minus 1, used to represent the modulus. For example:

Note that all the variables in this method are unsigned integers.

METHOD RNDINT(maxInclusive)
   // maxInclusive must be 0 or greater
    if maxInclusive < 0: return error
    if maxInclusive == 0: return 0
    // maxInclusive equals maximum
    if maxInclusive == MODULUS - 1: return RNG()
    // Special cases
    if maxInclusive == 1: return RNG() & 1
    if maxInclusive == 3 and MODBITS >= 2: return RNG() & 3
    if maxInclusive == 255 and MODBITS >= 8: return RNG() & 255
    if maxInclusive == 65535 and MODBITS >=16: return RNG() & 65535
    if maxInclusive > MODULUS - 1:
        // Calculate the bit count of maxInclusive
        bitCount = 0
        tempnumber = maxInclusive
        while tempnumber > 0
               // NOTE: If the programming language implements
               // division with two integers by truncating to an
               // integer, the division can be used as is without
               // calling a "floor" function.
               tempnumber = floor(tempnumber / 2)
               bitCount = bitCount + 1
        end
        while true
               // Build a number with `bitCount` bits
                tempnumber = 0
                while bitCount > 0
                     wordBits = MODBITS
                     rngNumber = RNG()
                     if wordBits > bitCount
                        wordBits = bitCount
                        // Truncate number to 'wordBits' bits
                        rngNumber = rngNumber & (
                           (1 << wordBits) - 1)
                     end
                     tempnumber = tempnumber << wordBits
                     // NOTE: In programming languages that
                     // support the OR operator between two
                     // integers, that operator can replace the
                     // plus operator below.
                     tempnumber = tempnumber + rngNumber
                     bitCount = bitCount - wordBits
                end
                // Accept the number if allowed
                if tempnumber <= maxInclusive: return tempnumber
         end
   else
          nPlusOne = maxInclusive + 1
          maxexc = floor((MODULUS - 1) / nPlusOne) * nPlusOne
          while true
                    ret = RNG()
                    if ret < nPlusOne: return ret
                    if ret < maxexc: return mod(ret, nPlusOne)
          end
   end
END METHOD

Note that this implementation of RNDINT(maxInclusive) may result in unused bits (for example, when truncating a random number to wordBits bits or in the special cases at the start of the method). It would be outside the scope of this document to describe how a more sophisticated implementation may save those bits for later reuse.

If the RNG outputs fixed-point numbers 0 or greater and less than a positive integer, that is, numbers with a fixed number of fractional parts, then find A and B, where A is the greatest integer that is less than the highest number the RNG can output, and B is the number of fractional parts the fixed-point number format can have, and use one of the two methods given above depending on whether A * B is a power of two (A * B is treated as the modulus for that purpose). Here, though, RNG() in the methods above is floor(RNG() * B) instead.

If the RNG outputs floating-point numbers 0 or greater and less than 1, then find s, where s is the number of significand permutations for the floating-point format, and use one of the two methods given above depending on whether s is a power of two (s is treated as the modulus for that purpose). Here, though, RNG() in the methods above is floor(RNG() * s) instead. (If the RNG outputs arbitrary-precision floating-point numbers, s should be set to the number of different values that are possible by calling the underlying RNG.)


The underlying RNG can be other than already described in this section; however, a detailed RNDINT(maxInclusive) implementation for other kinds of RNGs is not given here, since they seem to be lesser seen in practice. Readers who know of a RNG that is in wide use, returns uniformly distributed numbers, and is other than already described in this section should send me a comment.

RNDINTRANGE: Random Integers Within a Range, Maximum Inclusive

The naïve way of generating a random integer minInclusive or greater and maxInclusive or less is as follows. This approach works well for unsigned integers and arbitrary-precision integers.

 METHOD RNDINTRANGE(minInclusive, maxInclusive)
   // minInclusive must not be greater than maxInclusive
   if minInclusive > maxInclusive: return error
   return minInclusive + RNDINT(maxInclusive - minInclusive)
 END METHOD

The naïve approach won't work as well, though, for signed integer formats if the difference between maxInclusive and minInclusive exceeds the highest possible integer for the format. For fixed-length signed integer formats (1), such random integers can be generated using the following pseudocode. In the pseudocode below, INT_MAX is the highest possible integer in the integer format.

METHOD RNDINTRANGE(minInclusive, maxInclusive)
   // minInclusive must not be greater than maxInclusive
   if minInclusive > maxInclusive: return error
   if minInclusive == maxInclusive: return minInclusive
   if minInclusive==0: return RNDINT(maxInclusive)
   // Difference does not exceed maxInclusive
   if minInclusive > 0 or minInclusive + INT_MAX >= maxInclusive
       return minInclusive + RNDINT(maxInclusive - minInclusive)
   end
   while true
     ret = RNDINT(INT_MAX)
     // NOTE: If the signed integer format uses two's-complement
     // form, use the following line:
     if RNDINT(1) == 0: ret = -1 - ret
     // NOTE: If the signed integer format uses sign-magnitude
     // form (such as .NET's `System.Decimal`) or one's-complement
     // form,  use the following three lines instead of the preceding line;
     // here, zero will be rejected at a 50% chance because zero occurs
     // twice in both forms.
     // negative = RNDINT(1) == 0
     // if negative: ret = 0 - ret
     // if negative and ret == 0: continue
     if ret >= minInclusive and ret <= maxInclusive: return ret
   end
END METHOD

A common use case of RNDINTRANGE is to simulate die rolls. For example, to simulate rolling a six-sided die, generate a random number from 1 through 6 by calling RNDINTRANGE(1, 6).

RNDBITS: Random N-Bit Integers

The following naïve way of generating a uniformly distributed random N-bit integer is as follows:

 METHOD RNDBITS(bits)
    // NOTE: The maximum number that could be returned
    // here is 2^bits - 1, in which all `bits` bits are set to 1.
    return RNDINT((1 << bits) - 1)
 END METHOD

Although this works well for arbitrary-precision integers, it won't work well for the much more popular integer types called fixed-length two's-complement signed integers (1). For such signed integers as well as fixed-length unsigned integers, RNDBITS(bits) can be implemented using the pseudocode below. In the pseudocode below, BITCOUNT is the number of bits used in the format. Note that for such signed integers, RNDBITS(bits) can return a sequence of bits that resolves to a negative number.

METHOD RNDBITS(bits)
     if bits<0 or bits > BITCOUNT: return error
     if bits==0: return 0
     if bits==1: return RNDINT(1)
     if bits==2 and BITCOUNT == 2
         return (RNDINT(1) << 1) | RNDINT(1)
     end
     if bits==2: return RNDINT(3)
     bitsMinus2 = bits - 2
     // NOTE: The "|" below is the OR operator between
     // two integers.  The following line is implemented this
     // way to accommodate implementations that use
     // fixed-length two's-complement signed integers.
     ret = (RNDINT((1<<bitsMinus2) - 1) << bitsMinus2) | RNDINT(3)
     // NOTE: If the implementation uses fixed-length
     // unsigned integers, the following line can replace
     // the preceding line.  Note that the implementation
     // avoids shifting an integer by BITCOUNT bits or more,
     // because such behavior is undefined in C and C++.
     // ret = RNDINT( (((1 << (bits - 1)) - 1) << 1) | 1 )
     // NOTE: Alternatively, a list containing powers-of-two
     // minus 1 can be generated (calculating `floor(pow(2,i)) - 1`
     // for each relevant index `i` of the list, assuming unlimited
     // precision) and the following line used instead of the
     // preceding (assuming `list` is
     // the list generated this way):
     // ret = RNDINT( list[bits] )
     return ret
END METHOD

RNDU01: Random Numbers 0 or Greater and 1 or Less

The following method, RNDU01(), generates a random number 0 or greater and 1 or less:

 METHOD RNDU01()
    return RNDINT(X) / X
 END METHOD

In the method above, X is the number of fractional parts between 0 and 1. (For fixed-precision floating-point number formats, X should equal the number of significand permutations for that format. See "Generating uniform doubles in the unit interval" in the xoroshiro+ remarks page for further discussion.)

For fixed-precision binary floating-point numbers with fixed exponent range (such as Java's double and float), the following pseudocode for RNDU01() can be used instead, which returns a uniformly-distributed random number 0 or greater and 1 or less. It's based on a technique devised by Allen Downey, who found that dividing a random number by a constant usually does not yield all representable binary floating-point numbers in the desired range. In the pseudocode below, SIGBITS is the binary floating-point format's precision (for examples, see the note for "significand permutations").

METHOD RNDU01()
    e=-SIGBITS
    while true
        if RNDINT(1)==0: e = e - 1
      else: break
    end
    sig = RNDINT((1 << (SIGBITS - 1)) - 1)
    if sig==0 and RNDINT(1)==0: e = e + 1
    sig = sig + (1 << (SIGBITS - 1))
    // NOTE: This multiplication should result in
    // a floating-point number; if `e` is sufficiently
    // small, the number will underflow to 0
    return sig * pow(2, e)
END METHOD

RNDU01OneExc, RNDU01ZeroExc, and RNDU01ZeroOneExc: Related Methods

Three related methods also generate a random number in an interval bounded at 0 and 1. They can be implemented as follows.

RNDNUMRANGE: Random Numbers Within a Range, Maximum Inclusive

The naïve way of generating random number minInclusive or greater and maxInclusive or less, is shown in the following pseudocode, which generally works well only if the number format can't be negative or that format uses arbitrary precision.

METHOD RNDNUMRANGE(minInclusive, maxInclusive)
    if minInclusive > maxInclusive: return error
    return minInclusive + (maxInclusive - minInclusive) * RNDU01()
END

For fixed-point or floating-point number formats with fixed precision (such as Java's double and float), the pseudocode above can overflow if the difference between maxInclusive and minInclusive exceeds the maximum possible value for the format. For such formats, the following pseudocode for RNDU01() can be used instead. In the pseudocode below, NUM_MAX is the highest possible value for the number format. The pseudocode assumes that the highest possible value is positive and the lowest possible value is negative.

METHOD RNDNUMRANGE(minInclusive, maxInclusive)
   if minInclusive > maxInclusive: return error
   if minInclusive == maxInclusive: return minInclusive
   // Difference does not exceed maxInclusive
   if minInclusive >= 0 or minInclusive + NUM_MAX >= maxInclusive
       return minInclusive + (maxInclusive - minInclusive) * RNDU01()
   end
   while true
     ret = RNDU01() * NUM_MAX
     // NOTE: If the number format uses sign-magnitude
     // representation, as is the case for Java `float` and
     // `double` and .NET's implementation of `System.Decimal`,
     // for example, use the following:
     negative = RNDINT(1) == 0
     if negative: ret = 0 - ret
     if negative and ret == 0: continue
     // NOTE: For fixed-precision fixed-point numbers implemented
     // using two's complement numbers (note 1), use the following line
     // instead of the preceding three lines, where `QUANTUM` is the
     // smallest representable positive number in the fixed-point format:
     // if RNDINT(1) == 0: ret = (0 - QUANTUM) - ret
     if ret >= minInclusive and ret <= maxInclusive: return ret
   end
END

RNDINTEXC: Modified Core Method, Maximum Exclusive

A method based on RNDINT(maxInclusive) is called RNDINTEXC(maxExclusive) in this document; it generates a random integer 0 or greater and less than maxExclusive, where maxExclusive is an integer greater than 0. This variant is not given as the core random generation method because it's harder to fill integers in popular integer formats with random bits with this method. The method can be implemented as follows:

 METHOD RNDINTEXC(maxExclusive)
    if maxExclusive <= 0: return error
    return RNDINT(maxExclusive - 1)
 END METHOD

Note: An alternative way of generating a random integer 0 or greater and less than maxExclusive is the following idiom: floor(RNDU01OneExc()*(maxExclusive)). This approach, though, is recommended only if the programming language supports only floating-point numbers (an example is JavaScript) or doesn't support an integer type that is big enough to fit the number maxExclusive - 1.

RNDINTEXCRANGE: Random Integers Within a Range, Maximum Exclusive

A version of RNDINTRANGE, called RNDINTEXCRANGE here, returns a random integer minInclusive or greater and less than maxExclusive. It can be implemented using RNDINTRANGE, as the following pseudocode demonstrates.

METHOD RNDINTEXCRANGE(minInclusive, maxExclusive)
   if minInclusive >= maxExclusive: return error
   // NOTE: For signed integer formats, replace the following line
   // with "if minInclusive >=0 or minInclusive + INT_MAX >=
   // maxExclusive", where `INT_MAX` has the same meaning
   // as the pseudocode for `RNDINTRANGE`.
   if minInclusive >=0
     return RNDINTRANGE(minInclusive, maxExclusive - 1)
   end
   while true
     ret = RNDINTRANGE(minInclusive, maxExclusive)
     if ret < maxExclusive: return ret
   end
END METHOD

RNDNUMEXCRANGE: Random Numbers Within a Range, Maximum Exclusive

A version of RNDNUMRANGE, called RNDNUMEXCRANGE here, returns a random number minInclusive or greater and less than maxExclusive. It can be implemented using RNDNUMRANGE, as the following pseudocode demonstrates.

METHOD RNDNUMEXCRANGE(minInclusive, maxExclusive)
   if minInclusive >= maxExclusive: return error
   while true
     ret = RNDNUMRANGE(minInclusive, maxExclusive)
     if ret < maxExclusive: return ret
   end
END METHOD

Randomization Techniques

This section describes commonly used randomization techniques, such as shuffling, selection of several unique items, and creating random strings of text.

Boolean Conditions

To generate a condition that is true at the specified probabilities, use the following idioms in an if condition:

Shuffling

The Fisher–Yates shuffle method shuffles a list such that all permutations of that list are equally likely to occur, assuming the RNG it uses produces uniformly random numbers and can choose from among all permutations of that list. However, that method is also easy to write incorrectly (see also Jeff Atwood, "The danger of naïveté"). The following pseudocode is designed to shuffle a list's contents.

METHOD Shuffle(list)
   // NOTE: Check size of the list early to prevent
   // `i` from being less than 0 if the list's size is 0 and
   // `i` is implemented using an unsigned type available
   // in certain programming languages.
   if size(list) >= 2
      // Set i to the last item's index
      i = size(list) - 1
      while i > 0
         // Choose an item ranging from the first item
         // up to the item given in `i`. Note that the item
         // at i+1 is excluded.
         k = RNDINTEXC(i + 1)
         // The following is wrong since it introduces biases:
         // k = RNDINTEXC(size(list))
         // The following is wrong since the algorithm won't
         // choose from among all possible permutations:
         // k = RNDINTEXC(i)
         // Swap item at index i with item at index k;
         // in this case, i and k may be the same
         tmp = list[i]
         list[i] = list[k]
         list[k] = tmp
         // Move i so it points to the previous item
         i = i - 1
      end
   end
   // NOTE: An implementation can return the
   // shuffled list, as is done here, but this is not required.
   return list
END METHOD

An important consideration with respect to shuffling is the nature of the underlying RNG, as I discuss in further detail in my RNG recommendation document on shuffling. It suffices to say here that in general, whenever a deterministic, uniformly-distributed RNG is otherwise called for, such an RNG is good enough for shuffling a 52-item list if its period is 2226 or greater. (The period is the maximum number of values in a generated sequence for an RNG before that sequence repeats).

Note: In simulation testing, shuffling is used to relabel items from a dataset at random, where each item in the dataset is assigned one of several labels. In such testing:

Choosing a Random Item from a List

To choose a random item from a list—

Note: Bootstrapping is a method of creating a simulated dataset by choosing random items with replacement from an existing dataset until both datasets have the same size. (The simulated dataset can contain duplicates this way.) Usually, multiple simulated datasets are generated this way, one or more statistics, such as the mean, are calculated for each simulated dataset as well as the original dataset, and the statistics for the simulated datasets are compared with those of the original.

Creating a Random Character String

A commonly asked question involves how to generate a random string of characters (usually a random alphanumeric string, or string of letters and digits).

The first step is to generate a list of the letters, digits, and/or other characters the string can have. Often, those characters will be—

as found in the Basic Latin block of the Unicode Standard. Note that:

The second step is to build a new string whose characters are chosen from that character list. The pseudocode below demonstrates this by creating a list, rather than a string, where the random characters will be held. It also takes the number of characters as a parameter named size. (Converting this list to a text string is programming-language-dependent, and the details of the conversion are outside the scope of this page.)

  METHOD RandomString(characterList, stringSize)
       i = 0
       newString = NewList()
       while i < stringSize
           // Choose a character from the list
           randomChar = characterList[RNDINTEXC(size(characterList))]
           // Add the character to the string
           AddItem(newString, randomChar)
           i = i + 1
        end
        return newString
  END METHOD

Note: Often applications need to generate a string of characters that's not only random, but also unique. The best way to ensure uniqueness in this case is to store a list (such as a hash table) of strings already generated and to check newly generated strings against the list (or table). Random number generators alone should not be relied on to deliver unique results.

Choosing Several Unique Items

Often, the need arises to choose k unique items or values from among n available items or values. (This is also known as sampling without replacement.) The following assumes that each item has an equal chance of being chosen. There are several techniques for doing this depending on whether n is known and how big it is:

Almost-Random Sampling

Some applications (particularly some games) may find it important to control which random numbers appear, to make the random outcomes appear fairer to users. Without this control, a user may experience long streaks of good outcomes or long streaks of bad outcomes, both of which are theoretically possible with a random number generator. To implement this kind of "almost-random" sampling, an application can do one of the following:

However, "almost-random" sampling techniques are not recommended—

Note: Monte Carlo integration uses randomization to estimate a multidimensional integral. It involves evaluating a function at N random points in the domain, adding them up, then dividing the sum by N. The "Variance" MathWorld article gives methods for calculating the estimate's variance. (After calculating the error, or square root of variance, and the estimated integral, both can be multiplied by the volume of the domain.) Often quasirandom sequences (also known as low-discrepancy sequences, such as Sobel and Halton sequences), often together with a uniformly-distributed RNG, provide the "random" numbers to sample the function more efficiently. Unfortunately, the methods to produce such sequences are too complicated to show here.

Non-Uniform Distributions

Some applications need to choose random items or numbers such that some of them are more likely to be chosen than others. This section describes how to use the uniform random number methods to generate random numbers that follow a non-uniform statistical distribution.

Discrete Weighted Choice

The discrete weighted choice method is used to choose a random item from among a set of them with different probabilities of being chosen.

The following pseudocode takes a single list weights, and returns the index of a weight from that list. The greater the weight, the more likely its index will be chosen. (Note that there are two possible ways to generate the random number depending on whether the weights are all integers or can be fractional numbers.) Each weight should be 0 or greater.

METHOD DiscreteWeightedChoice(weights)
    if size(weights) == 0: return error
    sum = 0
    // Get the sum of all weights
    i = 0
    while i < size(weights)
        sum = sum + weights[i]
        i = i + 1
    end
    // Choose a random integer/number from 0 to less than
    // the sum of weights.
    value = RNDINTEXC(sum)
    // NOTE: If the weights can be fractional numbers,
    // use this instead:
    // value = RNDNUMEXCRANGE(0, sum)
    // Choose the object according to the given value
    i = 0
    lastItem = size(weights) - 1
    runningValue = 0
    while i < size(weights)
       if weights[i] > 0
          newValue = runningValue + weights[i]
          if value < newValue: return i
          runningValue = newValue
          lastItem = i
       end
       i = i + 1
    end
    // Last resort (might happen because rounding
    // error happened somehow)
    return lastItem
END METHOD

Example

Assume we have the following list: ["apples", "oranges", "bananas", "grapes"], and weights is the following: [3, 15, 1, 2]. The weight for "apples" is 3, and the weight for "oranges" is 15. Since "oranges" has a higher weight than "apples", the index for "oranges" (1) is more likely to be chosen than the index for "apples" (0) with the DiscreteWeightedChoice method. The following pseudocode implements how to get a randomly chosen item from the list with that method.

index = DiscreteWeightedChoice(weights)
// Get the actual item
item = list[index]

Weighted Choice Without Replacement

In the example above, the weights sum to 21. However, the weights do not mean that when 21 items are selected, the index for "apples" will be chosen exactly 3 times, or the index for "oranges" exactly 15 times, for example. Each call to DiscreteWeightedChoice is independent from the others, and each weight indicates only a likelihood that the corresponding index will be chosen rather than the other indices. And this likelihood doesn't change no matter how many times DiscreteWeightedChoice is called with the same weights. This is called a weighted choice with replacement, which can be thought of as drawing a ball, then putting it back.

To implement weighted choice without replacement (which can be thought of as drawing a ball without putting it back), simply call DiscreteWeightedChoice, and then decrease the weight for the chosen index by 1. In this way, when items are selected repeatedly, each weight behaves like the number of "copies" of each item. This technique, though, will only work properly if all the weights are integers 0 or greater. The pseudocode below is an example of this.

// Get the sum of weights
// (NOTE: This code assumes that `weights` is
// a list that can be modified.  If the original weights
// are needed for something else, a copy of that
// list should be made first, but the copying process
// is not shown here.  This code also assumes that `list`,
// a list of items, was already declared earlier and
// has at least as many items as `weights`.)
totalWeight = 0
i = 0
while i < size(weights)
    totalWeight = totalWeight + weights[i]
    i = i + 1
end
// Choose as many items as the sum of weights
i = 0
items = NewList()
while i < totalWeight
    index = DiscreteWeightedChoice(weights)
    // Decrease weight by 1 to implement selection
    // without replacement.
    weights[index] = weights[index] - 1
    AddItem(items, list[index])
    i = i + 1
end

Alternatively, if all the weights are integers 0 or greater and their sum is relatively small, create a list with as many copies of each item as its weight, then shuffle that list. The resulting list will be ordered in a way that corresponds to a weighted random choice without replacement.

Choosing Multiple Items

The discrete weighted choice method can also be used for choosing multiple items from a list, whether or not the items have the same probability of being chosen. In this case, after choosing a random index, set the weight for that index to 0 to keep it from being chosen again. The pseudocode below is an example of this.

// (NOTE: This code assumes that `weights` is
// a list that can be modified.  If the original weights
// are needed for something else, a copy of that
// list should be made first, but the copying process
// is not shown here.  This code also assumes that `list`,
// a list of items, was already declared earlier and
// has at least as many items as `weights`.)
chosenItems = NewList()
i = 0
// Choose k items from the list
while i < k or i < size(weights)
    index = DiscreteWeightedChoice(weights)
    // Set the weight for the chosen index to 0
    // so it won't be chosen again
    weights[index] = 0
    // Add the item at the chosen index
    AddItem(chosenItems, list[index])
end
// `chosenItems` now contains the items chosen

Piecewise Constant Distribution

The discrete weighted choice method can also be used to implement a piecewise constant distribution, as in the following example. Assume we have the following list: [0, 5, 10, 11, 15], and weights is the following: [3, 15, 1, 2]. Note that the weight list has one fewer item than the number list. The weight for "0 to 5" (0 or greater, less than 5) is 3, and the weight for "5 to 10" is 15. Since "5 to 10" has a higher weight than "0 to 5", this distribution will choose a number from 5 to 10 more often than a number from 0 to 5. The following pseudocode implements the piecewise constant distribution.

// Choose a random index
index = DiscreteWeightedChoice(weights)
// Choose a random number in the chosen interval on the list
number = list[index] + (list[index + 1] - list[index]) * RNDU01OneExc()

The code above implements the distribution with replacement. Implementing the distribution without replacement is similar to implementing discrete weighted choice without replacement; the only change is to replace AddItem(items, list[index]) with AddItem(items, list[index] + (list[index + 1] - list[index]) * RNDU01OneExc()) in the pseudocode.

Multinomial Distribution

The discrete weighted choice method can also be used to implement a multinomial distribution. This distribution models the number of times each of several mutually exclusive events happens among a given number of trials, where each event can have a different probability of happening. The pseudocode below is of a method that takes two parameters: trials, which is the number of trials, and weights, which are the relative probabilities of each event. The method tallies the events as they happen and returns a list (with the same size as weights) containing the number of successes for each event.

METHOD Multinomial(trials, weights)
    if trials < 0: return error
    // create a list of successes
    list = NewList()
    i = 0
    while i < size(weights)
       AddItem(list, 0)
       i = i + 1
    end
    i = 0
    while i < trials
        // Choose an index
        index = DiscreteWeightedChoice(weights)
        // Tally the event at the chosen index
        list[index] = list[index] + 1
        i = i + 1
    end
    return list
END METHOD

Continuous Weighted Choice

The continuous weighted choice method is used to choose a random number that follows a continuous statistical distribution (here, a piecewise linear distribution).

The following pseudocode takes two lists, list and weights, and returns a random number that follows the distribution. list is a list of numbers (which can be fractional numbers) that should be arranged in ascending order, and weights is a list of probability densities for the given numbers (where each number and its density have the same index in both lists). (A number's probability density is the relative probability that a randomly chosen value will be infinitesimally close to that number, assuming no precision limits.) Each probability density should be 0 or greater. Both lists should be the same size. In the pseudocode below, the first number in list can be returned exactly, but not the last item in list, assuming the numbers in list are arranged in ascending order.

In many cases, the probability densities are sampled (usually at regularly spaced points) from a so-called probability density function, a function that specifies each number's probability density. A list of common probability density functions is outside the scope of this page.

METHOD ContinuousWeightedChoice(list, weights)
    if size(list) <= 0 or size(weights) < size(list): return error
    if size(list) == 1: return list[0]
    // Get the sum of all areas between weights
    sum = 0
    areas = NewList()
    i = 0
    while i < size(list) - 1
      weightArea = abs((weights[i] + weights[i + 1]) * 0.5 *
            (list[i + 1] - list[i]))
      AddItem(areas, weightArea)
      sum = sum + weightArea
       i = i + 1
    end
    // Choose a random number
    value = RNDNUMEXCRANGE(0, sum)
    // Interpolate a number according to the given value
    i=0
    // Get the number corresponding to the random number
    runningValue = 0
    while i < size(list) - 1
     area = areas[i]
     if area > 0
      newValue = runningValue + area
      if value <= newValue
       // NOTE: The following line can also read
       // "interp = RNDU01OneExc()", that is, a new number is generated
       // within the chosen area rather than using the point
       // already generated.
       interp = (value - runningValue) / (newValue - runningValue)
       retValue = list[i] + (list[i + 1] - list[i]) * interp
       return retValue
      end
      runningValue = newValue
     end
     i = i + 1
    end
    // Last resort (might happen because rounding
    // error happened somehow)
    return list[size(list) - 1]
END METHOD

Example

Assume list is the following: [0, 1, 2, 2.5, 3], and weights is the following: [0.2, 0.8, 0.5, 0.3, 0.1]. The probability density for 2 is 0.5, and that for 2.5 is 0.3. Since 2 has a higher probability density than 2.5, numbers near 2 are more likely to be chosen than numbers near 2.5 with the ContinuousWeightedChoice method.

Normal (Gaussian) Distribution

The normal distribution (also called the Gaussian distribution) can model many kinds of measurements or scores whose values are most likely around a given average and are less likely the farther away from that average on either side.

The following method generates two normally-distributed random numbers with mean (average) mu (μ) and standard deviation sigma (σ), using the polar method (2). (In a standard normal distribution, μ = 0 and σ = 1.) The standard deviation sigma affects how wide the normal distribution's "bell curve" appears; the probability that a normally-distributed random number will be within one standard deviation from the mean is about 68.3%; within two standard deviations (2 times sigma), about 95.4%, and within three standard deviations, about 99.7%.

METHOD Normal2(mu, sigma)
  while true
    a = RNDU01()
    b = RNDU01()
    if a != 0 and RNDINT(1) == 0: a = 0 - a
    if b != 0 and RNDINT(1) == 0: b = 0 - b
    c = a * a + b * b
    if c != 0 and c <= 1
       c = sqrt(-2 * ln(c) / c)
       return [c * a, c * b]
    end
  end
END METHOD

Since Normal2 returns two numbers instead of one, but many applications require only one number at a time, a problem arises on how to return one number while storing the other for later retrieval. Ways to solve this problem are outside the scope of this page, however. The name Normal will be used in this document to represent a method that returns only one normally-distributed random number rather than two.

Also note that a normally-distributed random number can theoretically fall anywhere on the number line, even if it's extremely far from the mean. Depending on the use case, an application may need to reject normally-distributed numbers lower than a minimum threshold and/or higher than a maximum threshold and generate new normally-distributed numbers, or to clamp outlying numbers to those thresholds. But then the resulting distribution would no longer be a real normal distribution, but rather a truncated or censored normal distribution, respectively. (Rejecting or clamping outlying numbers this way can be done for any statistical distribution, not just a normal distribution.)

Alternatively, or in addition, the following method (implementing a ratio-of-uniforms technique) can be used to generate normally distributed random numbers.

METHOD Normal(mu, sigma)
    bmp = 0.8577638849607068 // sqrt(2/exp(1))
    while true
        a=RNDU01ZeroExc()
        b=RNDNUMRANGE(-bmp,bmp)
        if b*b <= -a * a * 4 * ln(a)
            return (b * sigma / a) + mu
        end
    end
END METHOD

Generating Random Points on the Surface of a Hypersphere

Generating N Normal(0, 1) random numbers, then dividing them by their norm (the square root of the sum of squares of the numbers generated this way, that is, sqrt(num1 * num1 + num2 * num2 + ... + numN * numN)) will result in an N-dimensional point lying on the surface of an N-dimensional hypersphere of radius 1 (that is, the surface formed by all points lying 1 unit away from a common point in N-dimensional space). Reference. (In the exceptional case that all numbers are 0, the process should repeat.)

Binomial Distribution

The following method generates a random integer that follows a binomial distribution. This number expresses the number of successes that have happened after a given number of independently performed trials (expressed as trials below), where the probability of a success in each trial is p (which ranges from 0, never, to 1, always, and which can be 0.5, meaning an equal chance of success or failure).

Example: If p is 0.5, the binomial distribution models the task "Flip N coins, then count the number of heads."

METHOD Binomial(trials, p)
    if trials < 0: return error
    if trials == 0: return 0
    // Always succeeds
    if p >= 1.0: return trials
    // Always fails
    if p <= 0.0: return 0
    i = 0
    count = 0
    // Suggested by Saucier, R. in "Computer
    // generation of statistical distributions", 2000, p. 49
    tp = trials * p
    if tp > 25 or (tp > 5 and p > 0.1 and p < 0.9)
         countval = -1
         while countval < 0: countval = Normal(tp, tp)
         return floor(countval + 0.5)
    end
    if p == 0.5
        while i < trials
            if RNDINT(1) == 0
                // Success
                count = count + 1
            end
            i = i + 1
        end
    else
        while i < trials
            if RNDU01OneExc() < p
                // Success
                count = count + 1
            end
            i = i + 1
        end
    end
    return count
END METHOD

Hypergeometric Distribution

The following method generates a random integer that follows a hypergeometric distribution. When a given number of items are drawn at random without replacement from a set of items each labeled either 1 or 0, the random integer expresses the number of items drawn this way that are labeled 1. In the method below, trials is the number of items drawn at random, ones is the number of items labeled 1 in the set, and count is the number of items labeled 1 or 0 in that set.

Example: In a 52-card deck of Anglo-American playing cards, 12 of the cards are face cards (jacks, queens, or kings). After the deck is shuffled and seven cards are drawn, the number of face cards drawn this way follows a hypergeometric distribution where trials is 7, ones is 12, and count is 52.

METHOD Hypergeometric(trials, ones, count)
    if ones < 0 or count < 0 or trials < 0 or ones > count or trials > count
            return error
    end
    if ones == 0: return 0
   successes = 0
    i = 0
    currentCount = count
    currentOnes = ones
    while i < trials and currentOnes > 0
            if RNDINTEXC(currentCount) < currentOnes
                    currentOnes = currentOnes - 1
                    successes = successes + 1
            end
            currentCount = currentCount - 1
            i = i + 1
    end
    return successes
END METHOD

Poisson Distribution

The following method generates a random integer that follows a Poisson distribution. In the method below, the mean is the average number of independent events of a certain kind per fixed unit of time or space (for example, per day, hour, or square kilometer), and the method's return value gives a random number of such events during one such unit.

The random integer from the method below is such that the average of the random integers approaches the given mean number when this method is called repeatedly with the same mean. Note that the mean can be an integer or a non-integer. The method given here is based on Knuth's method from 1969.

METHOD Poisson(mean)
    if mean < 0: return error
    if mean == 0: return 0
    p = 1.0
    // Suggested by Saucier, R. in "Computer
    // generation of statistical distributions", 2000, p. 49
    if mean > 9
        p = -1.0
        while p < 0: p = Normal(mean, mean)
        return floor(p + 0.5)
    end
    pn = exp(-mean)
    count = 0
    while true
        count = count + 1
        p = p * RNDU01OneExc()
        if p <= pn
                return count - 1
        end
    end
END METHOD

Gamma Distribution

The following method generates a random number that follows a gamma distribution. The gamma distribution models expected lifetimes. The method given here is based on Marsaglia and Tsang's method from 2000.

METHOD GammaDist(meanLifetime)
    // Must be greater than 0
    if meanLifetime <= 0: return error
    d = meanLifetime
    v = 0
    if meanLifetime < 1: d = d + 1
    d = d - (1.0 / 3) // NOTE: 1.0 / 3 must be a fractional number
    c = 1.0 / sqrt(9 * d)
    while true
        x = 0
        while true
           x = Normal(0, 1)
           v = c * x + 1;
           v = v * v * v
           if v > 0: break
        end
        u = 1.0 - RNDU01OneExc()
        x2 = x * x
        if u < 1 - (0.0331 * x2 * x2): break
        if ln(u) < (0.5 * x2) + (d * (1 - v + ln(v))): break
    end
    ret = d * v
    if meanLifetime < 1
       ret = ret * exp(ln(RNDU01ZeroExc()) / meanLifetime)
    end
    return ret
end

Extended versions of the gamma distribution:

Generating Random Numbers with a Given Positive Sum

The Dirichlet distribution models a distribution of N numbers that sum to a given positive number, total. Generating N GammaDist(total) calls and dividing them by their sum will result in N random numbers that (approximately) sum to total (see the Wikipedia article). For example, if total is 1, the numbers will (approximately) sum to 1. (In the exceptional case that all numbers are 0, the process should repeat. If total is 1, the exponential distribution, described later, can be used instead of GammaDist(1) to generate the random numbers; see also Devroye 1986, p. 405.)

Negative Binomial Distribution

The following method generates a random integer that follows a negative binomial distribution. This number expresses the number of failures that have happened after seeing a given number of successes (expressed as successes below), where the probability of a success in each case is p (which can be greater than 0, never, and equal to or less than 1, always, and which can be 0.5, meaning an equal chance of success or failure).

Example: If p is 0.5 and successes is 1, the negative binomial distribution models the task "Flip a coin until you get tails, then count the number of heads."

The following implementation of the negative binomial distribution allows successes to be an integer or a non-integer (and implements a distribution also known as the Pólya distribution).

METHOD NegativeBinomial(successes, p)
    // Must be 0 or greater
    if successes < 0: return error
    // No failures if no successes or if always succeeds
    if successes == 0 or p >= 1.0: return 0
    // Always fails (NOTE: infinity can be the maximum possible
    // integer value if NegativeBinomial is implemented to return
    // an integer)
    if p <= 0.0: return infinity
    if successes == 1.0
        // Geometric distribution special case (see Saucier 2000)
        return floor(ln(RNDU01ZeroExc()) / ln(1.0 - p))
    else
        return Poisson(GammaDist(successes) * (1 - p) / p)
    end
END METHOD

The following implementation of the negative binomial distribution allows successes to be an integer only.

METHOD NegativeBinomialInt(successes, p)
    // Must be 0 or greater
    if successes < 0: return error
    // No failures if no successes or if always succeeds
    if successes == 0 or p >= 1.0: return 0
    // Always fails (NOTE: infinity can be the maximum possible
    // integer value if NegativeBinomialInt is implemented to return
    // an integer)
    if p <= 0.0: return infinity
    count = 0
    total = 0
    if successes == 1
        if p == 0.5
          while RNDINT(1) == 0: count = count + 1
           return count
        end
        // Geometric distribution special case (see Saucier 2000)
        return floor(ln(RNDU01ZeroExc()) / ln(1.0 - p))
    end
    while true
        if RNDU01OneExc() < p
            // Success
            total = total + 1
            if total >= successes
                    return count
            end
        else
            // Failure
            count = count + 1
        end
    end
END METHOD

von Mises Distribution

The von Mises distribution describes a distribution of circular angles. In the pseudocode below, the mean is the mean angle, kappa is a shape parameter, and the method can return a number within π of that mean. The algorithm below is the Best–Fisher algorithm from 1979 (as described in Devroye 1986 with errata incorporated).

METHOD VonMises(mean, kappa)
    if kappa < 0: return error
    if kappa == 0
        return RNDNUMEXCRANGE(mean-pi, mean+pi)
    end
    r = 1.0 + sqrt(4 * kappa * kappa + 1)
    rho = (r - sqrt(2 * r)) / (kappa * 2)
    s = (1 + rho * rho) / (2 * rho)
    while true
        u = RNDNUMEXCRANGE(-1, 1)
        v = RNDU01ZeroOneExc()
        z = cos(pi * u)
        w = (1 + s*z) / (s + z)
        y = kappa * (s - w)
        if y*(2 - y) - v >=0 or ln(y / v) + 1 - y >= 0
           if angle<-1: angle=-1
           if angle>1: angle=1
           angle = acos(w)
           if u < 0: angle = -angle
           return mean + angle
        end
    end
END METHOD

Stable Distribution

A stable distribution is a limiting distribution of the sum of arbitrarily many independent and identically distributed random variables with infinite variance; the distribution resembles a curve with a single peak, but with generally "fatter" tails than the normal distribution. The pseudocode below uses the Chambers–Mallows–Stuck algorithm. The two shape parameters are alpha and beta; if beta is 0, the curve is symmetric.

METHOD Stable(alpha, beta)
     if alpha <=0 or alpha > 2: return error
     if beta < -1 or beta > 1: return error
    halfpi = pi * 0.5
    unif=RNDNUMEXCRANGE(-halfpi, halfpi)
    if unif==-halfpi: unif=RNDNUMEXCRANGE(-halfpi, halfpi)
    // Cauchy special case
    if alpha == 1 and beta == 0: return tan(unif)
    expo=-ln(RNDU01ZeroExc())
    c=cos(unif)
    if alpha == 1
            s=sin(unif)
            return 2.0*((unif*beta+pi*0.5)*s/c -
                beta * ln(pi*0.5*expo*c/(pi*0.5+beta*unif)))/pi
    end
    z=-tan(pi*alpha*0.5)*beta
    y=atan(-z)/alpha
    ug=unif+y
    cpow=c**(-1.0/alpha)
    return ((1.0+z*z)**(1.0/(2*alpha)))*
        (sin(alpha*ug)*cpow)*
        ((cos(unif-alpha*ug)/expo)**((1.0-alpha)/alpha))
END METHOD

Extended versions of the stable distribution:

Other Non-Uniform Distributions

Conclusion

This page discussed many ways applications can extract random numbers from random number generators.

Feel free to send comments. They may help improve this page. In particular, corrections to any method given on this page are welcome.

I acknowledge the commenters to the CodeProject version of this page, including George Swan, who referred me to the reservoir sampling method.

Notes

(1) This number format describes B-bit signed integers with minimum value -2B-1 and maximum value 2B-1 - 1, where B is a positive even number of bits; examples include Java's short, int, and long, with 16, 32, and 64 bits, respectively. A signed integer is an integer that can be positive, zero, or negative. In two' s-complement form, nonnegative numbers have the highest (most significant) bit set to zero, and negative numbers have that bit (and all bits beyond) set to one, and a negative number is stored in such form by decreasing its absolute value by 1 and swapping the bits of the resulting number. (2) The method that formerly appeared here is the Box-Muller-transformation: mu + radius * cos(angle) and mu + radius * sin(angle), where angle = 2 * pi * RNDU01OneExc() and radius = sqrt(-2 * ln(RNDU01ZeroExc())) * sigma, are two independent normally-distributed random numbers. A method of generating approximate standard normal random numbers, summing twelve RNDU01OneExc() calls and subtracting by 6, results in values not less than -6 or greater than 6, but results outside that range will occur only with a generally negligible probability.

License

This page is licensed under A Public Domain dedication.