# More Algorithms for Arbitrary-Precision Sampling

Peter Occil

Abstract: This page contains additional algorithms for arbitrary-precision sampling of distributions, Bernoulli factory algorithms (biased-coin to biased-coin algorithms), and algorithms to produce heads with an irrational probability. They supplement my pages on Bernoulli factory algorithms and partially-sampled random numbers.

2020 Mathematics Subject Classification: 68W20, 60-08, 60-04.

## Introduction

This page contains additional algorithms for arbitrary-precision sampling of distributions, Bernoulli factory algorithms (biased-coin to biased-coin algorithms), and algorithms to produce heads with an irrational probability. These samplers are designed to not rely on floating-point arithmetic. They may depend on algorithms given in the following pages:

This is an open-source document; for an updated version, see the source code or its rendering on GitHub. You can send comments on this document on the GitHub issues page.

My audience for this article is computer programmers with mathematics knowledge, but little or no familiarity with calculus.

I encourage readers to implement any of the algorithms given in this page, and report their implementation experiences. In particular, I seek comments on the following aspects:

• Are the algorithms in this article easy to implement? Is each algorithm written so that someone could write code for that algorithm after reading the article?
• Does this article have errors that should be corrected?
• Are there ways to make this article more useful to the target audience?

Comments on other aspects of this document are welcome.

## Bernoulli Factories

In the methods below, λ is the unknown probability of heads of the coin involved in the Bernoulli Factory problem.

### Certain Power Series

A power series is a function written as— $$f(\lambda) = a_0 (g(\lambda))^0 + a_1 (g(\lambda))^1 + ... + a_i (g(\lambda))^i + ...,\tag{1}$$ where $a_i$ are coefficients and $g(\lambda)$ is a function in the variable $\lambda$. Not all power series sum to a definite value, but all power series that matter in this section do, and they must be factory functions. (In particular, $g(\lambda)$ must be a Bernoulli factory function.)

Depending on the coefficients, different algorithms can be built to simulate a power series function:

• The coefficients are arbitrary, but can be split into two parts.
• The coefficients alternate in sign, and their absolute values form a decreasing sequence.
• The coefficients are nonnegative and sum to 1 or less.
• The coefficients are nonnegative and may sum to 1 or greater.

Note: In theory, the power series can contain coefficients that are irrational numbers or sum to an irrational number, but the algorithms to simulate such series can be inexact in practice. Also, not all power series that admit a Bernoulli factory are covered by the algorithms in this section. They include:

• Series with coefficients that alternate in sign, but do not satisfy the general martingale algorithm or Algorithm 1 below. This includes nearly all such series that equal 0 at 0 and 1 at 1, or equal 0 at 1 and 1 at 0. (An example is $\sin(\lambda\pi/2)$.)
• Series with negative and positive coefficients that do not eventually alternate in sign (ignoring zeros).

#### Certain Alternating Series

Suppose the following holds true for a power series function $f(\lambda)$:

• $f$ is written as in equation $(1)$.
• Suppose $(a_i)$ is the sequence formed from the coefficients of the series.
• Let $(d_j)$ be the sequence formed from $(a_i)$ by deleting the zero coefficients. Then suppose that:
• $d_0$ is greater than 0, and the elements in $(d_j)$ alternate in sign (example: 1/2, -1/3, 1/4, -1/5, ...).
• The absolute values of $(d_j)$'s elements are 1 or less and form a nonincreasing sequence that is finite or converges to 0.

In addition, the coefficients should be rational numbers.

Example: Let $f(\lambda) = (1/2)\lambda^0 - (1/4)\lambda^2 + (1/8)\lambda^4 - ...$. Then $(a_i) = (1/2, 0, -1/4, 0, 1/8, ...)$ (for example, $a_0 = 1/2$) and deleting the zeros leads to $(d_i) = (1/2, -1/4, 1/8, ...)$ (for example, $d_0 = 1/2$), which meets the requirements above.

Then the algorithm below, based on an algorithm by Łatuszyński et al. (2009/2011, especially section 3.1)[^1], simulates $f(\lambda)$ given a coin that shows heads (returns 1) with probability $g(\lambda)$.

General martingale algorithm:

1. Set u to abs($d_0$) ($d_0$ is the value of the first nonzero coefficient in the sequence $(a_i)$), set w to 1, set to 0, and set n to 1.
2. Generate a uniform(0, 1) random variate ret.
3. Do the following process repeatedly, until this algorithm returns a value:
1. If w is not 0, run a Bernoulli factory algorithm for $g(\lambda)$ (if $g(\lambda) = \lambda$, this is done by flipping the input coin), then multiply w by the result of the run.
2. If $a_n$ is greater than 0: Set u to + w * $a_n$, then, if no further nonzero coefficients follow $a_n$, set to u.
3. If $a_n$ is less than 0: Set to uw * abs($a_n$), then, if no further nonzero coefficients follow $a_n$, set u to .
4. If ret is less than (or equal to) , return 1. Otherwise, if ret is less than u, add 1 to n. Otherwise, return 0. (If ret is a uniform partially-sampled random number [PSRN], these comparisons should be done via the URandLessThanReal algorithm, which is described in my article on PSRNs.)

Notes:

1. The general martingale algorithm, as it's called in this article, supports more functions than in section 3.1 of Łatuszyński et al. (2019/2011), which supports only power series whose coefficients alternate in sign and decrease in absolute value, with no zeros in between nonzero coefficients. However, the general martingale algorithm uses that paper's framework. A proof of its correctness is given in the appendix.
2. The general martingale algorithm allows the sequence $(a_i)$ to sum to 1, but this appears to be possible only if the sequence's nonzero values have the form $(1, -z_0, z_0, -z_1, z_1, ..., -z_i, z_i, ...)$, where the $z_i$ are positive, are no greater than 1, and form a nonincreasing sequence that is finite or converges to 0. Moreover, it appears that every power series with this sequence of coefficients is less than or equal to $\lambda$.

#### General Power Series

The algorithm that follows can be used to simulate a more general class of power series functions. Suppose the following for a power series function $f(\lambda)$:

• $f$ is written as in equation $(1)$.
• There is a rational number $Z$ defined as follows. For every $\lambda$ in $[0, 1]$, $f(\lambda) \le Z \lt 1$.
• There is an even integer $m$ defined as follows. The series in equation $(1)$ can be split into two parts: the first part ($A$) is the sum of the first $m$ terms, and the second part ($C$) is the sum of the remaining terms. Moreover, both parts admit a Bernoulli factory algorithm. Specifically: $$C(\lambda) = \sum_{i\ge m} a_i (g(\lambda))^i, A(\lambda) = f(\lambda) - C(\lambda).$$ One way to satisfy the condition on $C$ is if $C$ is an alternating series (starting at $m$, even-indexed $a$'s are positive and odd-indexed are negative) and if $0 \le |a_{i+1}| \le |a_i| \le 1$ for every $i\ge m$ (that is, the coefficients starting with coefficient $m$ have absolute values that are 1 or less and form a nonincreasing sequence); such functions $C$ admit the general martingale algorithm. ($C$ and $A$ admit a Bernoulli factory only if they map the interval [0, 1] to itself and meet other requirements.)

In addition, the algorithm will be simpler if each coefficient $a_i$ is a rational number.

Then rewrite the function as— $$f(\lambda) = A(\lambda) + (g(\lambda))^{m} B(\lambda),$$ where—

• $A(\lambda) = f(\lambda) - C(\lambda) = \sum_{i=0}^{m-1} a_i (g(\lambda))^i$ is a polynomial in $g(\lambda)$ of degree $m-1$, and
• $B(\lambda) = C(\lambda) / (g(\lambda))^{m} = \sum_{i\ge m} a_{m+i} (g(\lambda))^i$.

Rewrite $A$ as a polynomial in Bernstein form, in the variable $g(\lambda)$. (One way to transform a polynomial to Bernstein form, given the "power" coefficients $a_0, ..., a_{m-1}$, is the so-called "matrix method" from Ray and Nataraj (2012)[^2].) Let $b_0, ..., b_{m-1}$ be the Bernstein-form polynomial's coefficients. Then if those coefficients all lie in $[0, 1]$, then the following algorithm simulates $f(\lambda)$.

Algorithm 1: Run a linear Bernoulli factory, with parameters $x=2$, $y=1$, and $\epsilon=1-Z$. Whenever the linear Bernoulli factory "flips the input coin", it runs the sub-algorithm below.

• Sub-algorithm: Generate an unbiased random bit. If that bit is 1, sample the polynomial $A$ as follows (Goyal and Sigman 2012)[^9]:

1. Run a Bernoulli factory algorithm for $g(\lambda)$, $m-1$ times. Let $j$ be the number of runs that return 1.
2. With probability $b_j$, return 1. Otherwise, return 0.

If the bit is 0, do the following:

1. Run a Bernoulli factory algorithm for $g(\lambda)$, $m$ times. Return 0 if any of the runs returns 0.
2. Run a Bernoulli factory algorithm for $B(\lambda)$, and return the result.

#### Series with Non-Negative Coefficients Summing to 1 or Less

Now, suppose $f(\lambda)$ can be written as a power series in equation $(1)$, but this time, the coefficients $a_i$ are 0 or greater and their sum is 1 or less.

If $g(\lambda) = \lambda$, this kind of function—

• satisfies $0\le f(\lambda)\le 1$ whenever 0 ≤ λ ≤ 1,
• is either constant or strictly increasing, and
• is convex (its "slope" or "velocity" doesn't decrease as λ increases)[^3].

Suppose $f$ can be written as $f(\lambda)= f_0(g(\lambda))$, where— $$f_0(\lambda) = \sum_{n} a_n \lambda^n = \sum_{n} w(n) \frac{a_n}{w(n)}\lambda^n,$$ where each sum is taken over all nonnegative values of $n$ where $a_n > 0$.

Then the key to simulating $f(\lambda)$ is to "tuck" the values $a_n$ under a function $w(n)$ such that—

• $1 \ge w(n)\ge a_n\ge 0$ for every allowed n, and
• $w(0)+w(1)+...=1$ (required for a valid distribution of integers 0 or greater).

Notes:

1. Assuming $f_0(1)$ does not equal 0, an appropriate $w(n)$ is trivial to find — $w(n)=a_n/f_0(1)$ (because $a_n \le f_0(1)$ for every allowed $n$). But in general, this can make $w(n)$ an irrational number and thus harder to handle with arbitrary precision.
2. If the coefficients $a_n$ sum to 1, then $w(n)$ can equal $a_n$. In this case, $f_0(\lambda)$ is what's called the probability generating function for getting $X$ with probability $a_X$ (or $w(X)$), and the expected value ("long-run average") of $X$ equals the "slope" of $f_0(\lambda)$ at 1. See also (Dughmi et al. 2021)[^4].
3. Assuming $f_0(1)$ is an irrational number, $w(n)$ can equal $a_n + c_n/2^n$, where $c_n$ is the $n$-th base-2 digit after the point in the binary expansion of $1 - f_0(1)$ (or 0 if $n=0$). Here, a number's binary expansion is written as 0.bbbbb... in base 2, where each b is a base-2 digit (either 0 or 1). See my Stack Exchange question.

Once $a_n$ and $w(n)$ are found, the function $f(\lambda)$ can be simulated using the following algorithm, which takes advantage of the convex combination method.

Algorithm 2:

1. Choose at random a number n that equals i with probability $w(i)$.
2. (The next two steps succeed with probability $\frac{a_n}{w(n)} \lambda^n$.) Let P be $a_n/w(n)$. With probability P, go to the next step. Otherwise, return 0.
3. (At this point, n equals i with probability $a_i$.) Run a Bernoulli factory algorithm for $g(\lambda)$, n times or until a run returns 0, whichever happens first. (For example, if $g(\lambda)=\lambda$, flip the input coin each time.) Return 1 if all the runs, including the last, returned 1 (or if n is 0). Otherwise, return 0.

Step 1 is rather general, and doesn't fully describe how to generate the value $n$ at random. That depends on the function $w(n)$. See "Power Series Examples", later, for examples of power series functions $f(\lambda)$ that can be simulated using Algorithm 2.

Note: Part of Algorithm 2 involves choosing $X$ at random with probability $w(X)$, then doing $X$ coin flips. Thus, the algorithm uses, on average, at least the number of unbiased random bits needed to generate $X$ on average (Knuth and Yao 1976)[^5].

Algorithm 2 covers an algorithm that was given by Luis Mendo (2019)[^6] for simulating certain power series, but works only if the coefficients sum to 1 or less and only if coefficient 0 ($a_0$) is 0.

To get to an algorithm equivalent to Mendo's, first Algorithm 2 is modified to simulate $f_0(\lambda)$/CS as follows, where CS is the sum of all coefficients $a_i$, starting with $i=1$. This shows Mendo's algorithm, like Algorithm 2, is actually a special case of the convex combination algorithm.

• Step 1 of Algorithm 2 becomes: "(1a.) Set dsum to 0 and $n$ to 1; (1b.) With probability $a_n$/(CSdsum), go to step 2. Otherwise, add $a_n$ to dsum; (1c.) Add 1 to i and go to step 1b." (Choose at random $n$ with probability $w(n)=a_n$/CS.)
• Step 2 becomes "Go to step 3". (The P in Algorithm 2 is not used; it's effectively $w(n)/\frac{a_n}{CS}=\frac{a_n}{CS}/\frac{a_n}{CS} = 1$.)
• In step 3, $g(\lambda)$ is either $\lambda$ (flip the input coin) or $1-\lambda$ (flip the input coin and take 1 minus the flip).

Mendo's algorithm and extensions of it mentioned by him cover several variations of power series as follows:

Type Power Series Algorithm
1 $f(\lambda)=1-f_0(1-\lambda)$ With probability CS, run the modified algorithm with $g(\lambda)=1-\lambda$ and return 1 minus the result. Otherwise, return 1.
2 $f(\lambda)=f_0(1-\lambda)$ With probability CS, run the modified algorithm with $g(\lambda)=1-\lambda$ and return the result. Otherwise, return 0.
3 $f(\lambda)=f_0(\lambda)$ With probability CS, run the modified algorithm with $g(\lambda)=\lambda$ and return the result. Otherwise, return 0.
4 $f(\lambda)=1-f_0(\lambda)$ With probability CS, run the modified algorithm with $g(\lambda)=\lambda$ and return 1 minus the result. Otherwise, return 1.

The conditions on $f$ given above mean that—

• for series of type 1, f(0) = 1−CS and f(1) = 1 (series of type 1 with CS=1 is the main form in Mendo's paper),
• for series of type 2, f(0) = CS and f(1) = 0,
• for series of type 3, f(0) = 0 and f(1) = CS, and
• for series of type 4, f(0) = 1 and f(1) = 1−CS.

#### Series with General Non-Negative Coefficients

If $f$ is a power series written as equation (1), but—

• all of the coefficients are nonnegative, and
• the coefficients sum to greater than 1,

then Nacu and Peres (2005, proposition 16)[^1] gave an algorithm which takes the following parameters:

• t is a rational number such that B < t ≤ 1 and f(t) < 1.
• ϵ is a rational number such that 0 < ϵ ≤ (tB)/2.

B is not a parameter, but is the maximum allowed value for $g(\lambda)$ (probability of heads), and is greater than 0 and less than 1. The following algorithm is based on that algorithm, but runs a Bernoulli factory for $g(\lambda)$ instead of flipping the input coin with probability of heads $\lambda$.

1. Create a ν input coin that does the following: "(1) Set n to 0. (2) With probability ϵ/t, go to the next substep. Otherwise, add 1 to n and repeat this substep. (3) With probability 1 − $a_n\cdot t^n$, return 0. (4) Run a linear Bernoulli factory n times, x/y = 1/(tϵ), and ϵ = ϵ. If the linear Bernoulli factory would flip the input coin, the coin is 'flipped' by running a Bernoulli factory for $g(\lambda)$. If any run of the linear Bernoulli factory returns 0, return 0. Otherwise, return 1."
2. Run a linear Bernoulli factory once, using the ν input coin described earlier, x/y = t/ϵ, and ϵ = ϵ, and return the result.

#### Power Series Examples

Examples 1 to 4 show how Algorithm 1 leads to algorithms for simulating certain factory functions.

Note: In the SymPy computer algebra library, the series(func, n=20) method computes the first 20 terms of a function's power series expansion. An example is: series(sin(x), n=20).

Example 1: Take $f(\lambda) = \sin(3\lambda)/2$, which is a power series.

• $f$ is less than or equal to $Z=1/2 \lt 1$.
• $f$ satisfies $m=8$ since splitting the series at 8 leads to two functions that admit Bernoulli factories.
• Thus, $f$ can be written as— $$f(\lambda) = A(\lambda) + \lambda^8 \left(\sum_{i\ge 0} a_{8+i} \lambda^i\right),$$ where $a_i = \frac{3^i}{i! \times 2}(-1)^{(i-1)/2}$ if $i$ is odd and 0 otherwise.
• $A$ is rewritten from "power" form (with coefficients $a_0, ..., a_{m-1}$) to Bernstein form, with the following coefficients, in order: [0, 3/14, 3/7, 81/140, 3/5, 267/560, 81/280, 51/1120].
• Now, Algorithm 1 can be used to simulate $f$ given a coin that shows heads (returns 1) with probability $\lambda$, where:
• $g(\lambda) = \lambda$, so the Bernoulli factory algorithm for $g(\lambda)$ is simply to flip the coin for $\lambda$.
• The coefficients $b_0, ..., b_{m-1}$, in order, are the Bernstein-form coefficients found for $A$.
• The Bernoulli factory algorithm for $B(\lambda)$ is as follows: Let $h_i = a_i$. Then run the general martingale algorithm with $g(\lambda) = \lambda$ and $a_i = h_{m+i}$.

Example 2: Take $f(\lambda) = 1/2 + \sin(6\lambda)/4$, rewritable as another power series.

• $f$ is less than or equal to $Z=3/4 \lt 1$.
• $f$ satisfies $m=16$ since splitting the series at 16 leads to two functions that admit Bernoulli factories.
• Thus, $f$ can be written as— $$f(\lambda) = A(\lambda) + \lambda^{m} \left(\sum_{i\ge 0} a_{m+i} \lambda^i\right),$$ where $m=16$, and where $a_i$ is $1/2$ if $i = 0$; $\frac{6^i}{i! \times 4}(-1)^{(i-1)/2}$ if $i$ is odd; and 0 otherwise.
• $A$ is rewritten from "power" form (with coefficients $a_0, ..., a_{m-1}$) to Bernstein form, with the following coefficients, in order: [1/2, 3/5, 7/10, 71/91, 747/910, 4042/5005, 1475/2002, 15486/25025, 167/350, 11978/35035, 16869/70070, 167392/875875, 345223/1751750, 43767/175175, 83939/250250, 367343/875875].
• Now, Algorithm 1 can be used to simulate $f$ in the same manner as for Example 1.

Example 3: Take $f(\lambda) = 1/2 + \sin(\pi\lambda)/4$. To simulate this probability:

1. Create a μ coin that does the following: "With probability 1/3, return 0. Otherwise, run the algorithm for π/4 (in 'Bernoulli Factory Algorithms') and return the result." (Simulates π/6.)
2. Run the algorithm for $1/2 + \sin(6\lambda)/4$ in Example 2, using the μ coin.

Example 4: Take $f(\lambda) = 1/2 + \cos(6\lambda)/4$. This is as in Example 2, except—

• $Z=3/4$ and $m=16$;
• $a_i$ is $3/4$ if $i = 0$; $\frac{6^i}{i! \times 4}(-1)^{i/2}$ if $i$ is even and greater than 0; and 0 otherwise; and
• the Bernstein-form coefficients for $A$, in order, are [3/4, 3/4, 255/364, 219/364, 267/572, 1293/4004, 4107/20020, 417/2860, 22683/140140, 6927/28028, 263409/700700, 2523/4900, 442797/700700, 38481/53900, 497463/700700].

Example 5: Take $f(\lambda) = 1/2 + \cos(\pi\lambda)/4$. This is as in Example 3, except step 2 runs the algorithm for $1/2 + \cos(6\lambda)/4$ in Example 4.

Examples 6: The following functions can be written as power series that satisfy the general martingale algorithm. In the table, $B(i)$ is the $i$th Bernoulli number (see the note after the table), and ${n \choose m}$ = choose($n$, $m$) is a binomial coefficient.

Function $f(\lambda)$ Coefficients Value of $d_0$
$\lambda/(\exp(\lambda)-1)$ $a_i = -1/2$ if $i=1$, or $B(i)/(i!)$ otherwise. 1.
$\tanh(\lambda)$ $a_i = \frac{B(i+1) 2^{i+1} (2^{i+1}-1)}{(i+1)!}$ if $i$ is odd[^7], or 0 otherwise. 1.
$\cos(\sqrt \lambda)$ $a_i = \frac{(-1)^i}{(2i)!}$. 1.
$\sum_{i\ge 0} a_i x^i$ (source) $a_i = \frac{(-1)^i 4^i}{(2i+1)^2 {2i \choose i}}$. 1.

To simulate a function in the table, run the general martingale algorithm with $g(\lambda) = \lambda$ and with the given coefficients and value of $d_0$ ($d_0$ is the first nonzero coefficient).

Note: Bernoulli numbers can be computed with the following algorithm, namely Get the mth Bernoulli number:

1. If m is 0, 1, 2, 3, or 4, return 1, 1/2, 1/6, 0, or −1/30, respectively. Otherwise, if m is odd[^7], return 0.
2. Set i to 2 and v to 1 − (m+1)/2.
3. While i is less than m:
1. Get the ith Bernoulli number, call it b. Add b*choose(m+1, i) to v.[^8]
2. Add 2 to i.
4. Return −v/(m+1).

Examples 7 to 9 use Algorithm 2 to simulate power series functions where the coefficients $a_0$ are nonnegative.

Example 7: The hyperbolic cosine minus 1, denoted as cosh(λ)−1, can be written as follows: $$f(\lambda)=\cosh(\lambda)-1 = \sum_{n} a_n \lambda^n = \sum_{n} w(n) \frac{a_n \lambda^n}{w(n)},$$ where:

• Each sum given above is taken over all values of n that can occur after step 1 is complete (in this case, all values of n that are even and greater than 0).
• $a_n$ is $1/(n!)$.[^9]
• The coefficients $a_n$ are tucked under a function $w(n)$, which in this case is $\frac{1}{2^{n-2}}$ if n>0 and n is even[^10], or 0 otherwise.

For this particular function:

• Step 1 of Algorithm 2 can read: "(1a.) Generate unbiased random bits (each bit is 0 or 1 with equal probability) until a zero is generated this way, then set n to the number of ones generated this way; (1b.) Set n to 2*n + 2."
• In step 2, P is $a_n/w(n) = \frac{1}{n!} / \frac{1}{2^{n-2}} = \frac{2^{n/2}}{n!}$ for each allowed $n$.
• In step 3, $g(\lambda)$ is simply $\lambda$.

Examples 8: cosh(λ)−1 and additional target functions are shown in the following table. (In the table below, $w(n)=1/(2^{w^{-1}(n)+1})$ where $w^{-1}(n)$ is the inverse of the "Step 1b" column, and the $g(\lambda)$ in step 3 is simply $\lambda$.)

Target function f(λ) Step 1b reads "Set n to ..." $a_n$ $w(n)$ Value of P
cosh(λ)−1. 2*n + 2. 1/(n!). 1/(2(n−2)/2+1). 2n/2/(n!).
exp(λ/4)/2. n. 1/(n!*2*4n) 1/(2n+1). 1/(2n*(n!)).
exp(λ)/4. n. 1/(n!*4). 1/(2n+1). 2n−1/(n!).
exp(λ)/6. n. 1/(n!*6). 1/(2n+1). 2n/(3*(n!)).
exp(λ/2)/2. n. 1/(n!*2*2n) 1/(2n+1). 1/(n!).
(exp(λ)−1)/2. n + 1. 1/((n+1)!*4). 1/(2n). 2n−1/(n!).
sinh(λ)/2 2*n + 1. 1/(n!*2). 1/(2(n−1)/2+1). 2(n−1)/2/(n!).
cosh(λ)/2 2*n. 1/(n!*2). 1/(2n/2+1). 2n/2/(n!).

Examples 9: The table below shows power series functions shifted downward and shows the algorithm changes needed to simulate the modified function. In the table, D is a rational number such that 0 ≤ Dφ(0), where φ(.) is the original function.

Original function (φ(λ)) Target function f(λ) Step 1b reads "Set n to ..." Value of P
exp(λ)/4. φ(λ) − D. n. (1/4−D)*2 or (φ(0)−D)*2 if n = 0;
2n−1/(n!) otherwise.
exp(λ)/6. φ(λ) − D. n. (1/6−D)*2 if n = 0;
2n/(3*(n!)) otherwise.
exp(λ/2)/2. φ(λ) − D. n. (1/2−D)*2 if n = 0;
1/(n!) otherwise.
cosh(λ)/4. φ(λ) − D. 2*n. (1/4−D)*2 if n = 0;
2n/2/(2*(n!)) otherwise.

Example 10: Let $f(\lambda)=exp(\lambda)\cdot (1-\lambda)$. Run Mendo's algorithm for series of type 1, with $a_i = \frac{i-1}{i!}$ and $CS = 1$.

### Certain Piecewise Linear Functions

Let f(λ) be a function of the form min(λ*mult, 1−ε). This is a piecewise linear function, a function made up of two linear pieces (in this case, the pieces are a rising linear part and a constant part).

This section describes how to calculate the Bernstein coefficients for polynomials that converge from above and below to f, based on Thomas and Blanchet (2012)[^11]. These polynomials can then be used to generate heads with probability f(λ) using the algorithms given in "General Factory Functions".

In this section, fbelow(n, k) and fabove(n, k) are the kth coefficients (with k starting at 0) of the lower and upper polynomials, respectively, in Bernstein form of degree n.

The code in the appendix uses the computer algebra library SymPy to calculate a list of parameters for a sequence of polynomials converging from above. The method to do so is called calc_linear_func(eps, mult, count), where eps is ε, mult = mult, and count is the number of polynomials to generate. Each item returned by calc_linear_func is a list of two items: the degree of the polynomial, and a Y parameter. The procedure to calculate the required polynomials is then logically as follows (as written, it runs very slowly, though):

1. Set i to 1.
2. Run calc_linear_func(eps, mult, i) and get the degree and Y parameter for the last listed item, call them n and y, respectively.
3. Set x to −((y−(1−ε))/ε)5/mult + y/mult. (This exact formula doesn't appear in the Thomas and Blanchet paper; rather it comes from the supplemental source code uploaded by A. C. Thomas at my request.
4. For degree n, fbelow(n, k) is min((k/n)*mult, 1−ε), and fabove(n, k) is min((k/n)*y/x,y). (fbelow matches f because f is concave on the interval [0, 1], which roughly means that its rate of growth there never goes up.)
5. Add 1 to i and go to step 2.

It would be interesting to find general formulas to find the appropriate polynomials (degrees and Y parameters) given only the values for mult and ε, rather than find them "the hard way" via calc_linear_func. For this procedure, the degrees and Y parameters can be upper bounds, as long as the sequence of degrees is strictly increasing and the sequence of Y parameters is nonincreasing.

Note: In Nacu and Peres (2005)[^12], the following polynomial sequences were suggested to simulate $\min(2\lambda, 1-2\varepsilon)$, provided $\varepsilon \lt 1/8$, where n is a power of 2. However, with these sequences, an extraordinary number of input coin flips is required to simulate this function each time.

• fbelow(n, k) = $\min(2(k/n), 1-2\varepsilon)$.
• fabove(n, k) = $\min(2(k/n), 1-2\varepsilon)+$
$\frac{2\times\max(0, k/n+3\varepsilon - 1/2)}{\varepsilon(2-\sqrt{2})} \sqrt{2/n}+$
$\frac{72\times\max(0,k/n-1/9)}{1-\exp(-2\times\varepsilon^2)} \exp(-2n\times\varepsilon^2)$.

My own algorithm for min(λ, 1/2) is as follows. See the appendix for the derivation of this algorithm.

1. Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), flip the input coin and return the result.
2. Run the algorithm for min(λ, 1−λ) given later, and return the result of that run.

And the algorithm for min(λ, 1−λ) is as follows:

1. (Random walk.) Generate unbiased random bits until more zeros than ones are generated this way for the first time. Then set m to (n−1)/2+1, where n is the number of bits generated this way.
2. (Build a degree-m*2 polynomial equivalent to (4*λ*(1−λ))m/2.) Let z be (4m/2)/choose(m*2,m). Define a polynomial of degree m*2 whose (m*2)+1 Bernstein coefficients are all zero except the mth coefficient (starting at 0), whose value is z. Elevate the degree of this polynomial enough times so that all its coefficients are 1 or less (degree elevation increases the polynomial's degree without changing its shape or position; see the derivation in the appendix). Let d be the new polynomial's degree.
3. (Simulate the polynomial, whose degree is d (Goyal and Sigman 2012)[^13].) Flip the input coin d times and set h to the number of ones generated this way. Let a be the hth Bernstein coefficient (starting at 0) of the new polynomial. With probability a, return 1. Otherwise, return 0.

I suspected that the required degree d would be floor(m*2/3)+1, as described in the appendix. With help from the MathOverflow community, steps 2 and 3 of the algorithm above can be described more efficiently as follows:

• (2.) Let r be floor(m*2/3)+1, and let d be m*2+r.
• (3.) (Simulate the polynomial, whose degree is d.) Flip the input coin d times and set h to the number of ones generated this way. Let a be (1/2) * 2m*2*choose(r,hm)/choose(d, h) (the polynomial's hth Bernstein coefficient starting at 0; the first term is 1/2 because the polynomial being simulated has the value 1/2 at the point 1/2). With probability a, return 1. Otherwise, return 0.

The min(λ, 1−λ) algorithm can be used to simulate certain other piecewise linear functions with three breakpoints, and algorithms for those functions are shown in the following table. In the table, μ is the unknown probability of heads of a second input coin, and ν is the unknown probability of heads of a third input coin.

Breakpoints Algorithm
0 at 0; ν/2 at 1/2; and ν*μ at 1. Flip the ν input coin. If it returns 0, return 0. Otherwise, flip the μ input coin. If it returns 1, flip the λ input coin and return the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return the result of that run.
(1−μ)/2 at 0; 1/2 at 1/2; and μ/2 at 1. Generate an unbiased random bit. If that bit is 1, run the algorithm for min(λ, 1−λ) using the λ input coin, and return the result of that run. Otherwise, flip the μ input coin. If it returns 1, flip the λ input coin and return the result. Otherwise, flip the λ input coin and return 1 minus the result.
0 at 0; μ/2 at 1/2; and μ/2 at 1. Flip the μ input coin. If it returns 0, return 0. Otherwise, generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), flip the λ input coin and return the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return the result of that run.
μ at 0; 1/2 at 1/2; and 0 at 1. Flip the μ input coin. If it returns 1, flip the λ input coin and return 1 minus the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return the result of that run.
1 at 0; 1/2 at 1/2; and μ at 1. Flip the μ input coin. If it returns 0, flip the λ input coin and return 1 minus the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return 1 minus the result of that run.
μ at 0; 1/2 at 1/2; and 1 at 1. Flip the μ input coin. If it returns 0, flip the λ input coin and return the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return 1 minus the result of that run.
B at 0; B+(A/2) at 1/2; and B+(A/2) at 1. (A≤1 and B≤1−A are rational numbers.) With probability 1−A, return a number that is 1 with probability B/(1−A) and 0 otherwise. Otherwise, generate an unbiased random bit. If that bit is 1, flip the λ input coin and return the result. Otherwise, run the algorithm for min(λ, 1−λ) using the λ input coin, and return the result of that run.

Example: Let $f$ be $\lambda/2$ if $\lambda\le 1/2$, and $1/2-\lambda/2$ otherwise. Then use the algorithm for 0 at 0; ν/2 at 1/2; and ν*μ at 1, where ν is a coin that returns 1 with probability 1/2 and 0 otherwise, and μ is a coin that always returns 0.

### Sampling Distributions Using Incomplete Information

The Bernoulli factory is a special case of the problem of sampling a probability distribution with unknown parameters. This problem can be described as sampling from a new distribution using an oracle (black box) that produces numbers of an incompletely known distribution. In the Bernoulli factory problem, this oracle is a coin that shows heads or tails where the probability of heads is unknown. The rest of this section deals with oracles that go beyond coins.

Algorithm 1. Suppose there is an oracle that produces independent random variates on a closed interval [a, b], and these numbers have an unknown mean of μ. The goal is now to produce nonnegative random variates whose expected value ("long-run average") is f(μ). Unless f is constant, this is possible if and only if—

• f is continuous on the closed interval, and
• f(μ) is greater than or equal to ε*min((μa)n, (bμ)n) for some integer n and some ε greater than 0 (loosely speaking, f is nonnegative and neither touches 0 in the interior of the interval nor moves away from 0 more slowly than a polynomial)

(Jacob and Thiery 2015)[^14]. (Here, a and b are both rational numbers and may be less than 0.)

In the algorithm below, let K be a rational number greater than the maximum value of f on the closed interval [a, b], and let g(λ) = f(a + (ba)*λ)/K.

1. Create a λ input coin that does the following: "Take a number from the oracle, call it x. With probability (xa)/(ba) (see note below), return 1. Otherwise, return 0."
2. Run a Bernoulli factory algorithm for g(λ), using the λ input coin. Then return K times the result.

Note: The check "With probability (xa)/(ba)" is exact if the oracle produces only rational numbers. If the oracle can produce irrational numbers (such as numbers that follow a beta distribution or another non-discrete distribution), then the code for the oracle should use uniform partially-sampled random numbers (PSRNs). In that case, the check can be implemented as follows. Let x be a uniform PSRN representing a number generated by the oracle. Set y to RandUniformFromReal(ba), then the check succeeds if URandLess(y, UniformAddRational(x, −a)) returns 1, and fails otherwise.

Example: Suppose an oracle produces random variates in the interval [3, 13] with unknown mean μ, and the goal is to use the oracle to produce nonnegative random variates with mean f(μ) = −319/100 + μ*103/50 − μ2*11/100, which is a polynomial with Bernstein coefficients [2, 9, 5] in the given interval. Then since 8 is greater than the maximum of f in that interval, g(λ) is a degree-2 polynomial with Bernstein coefficients [2/8, 9/8, 5/8] in the interval [0, 1]. g can't be simulated as is, though, but increasing g's degree to 3 leads to the Bernstein coefficients [1/4, 5/6, 23/24, 5/8], which are all less than 1 so that the following algorithm can be used (see "Certain Polynomials"):

1. Set heads to 0.
2. Generate three random variates from the oracle (which must produce random variates in the interval [3, 13]). For each number x: With probability (x−3)/(10−3), add 1 to heads.
3. Depending on heads, return 8 (that is, 1 times the upper bound) with the given probability, or 0 otherwise: heads=0 → probability 1/4; 1 → 5/6; 2 → 23/24; 3 → 5/8.

Algorithm 2. This algorithm takes an oracle and produces nonnegative random variates whose expected value ("long-run average") is the mean of f(X), where X is a number produced by the oracle. The algorithm appears in the appendix, however, because it requires applying an arbitrary function (here, f) to a potentially irrational number.

Algorithm 3. For this algorithm, see the appendix.

Algorithm 4. Say there is an oracle in the form of a fair die. The number of faces of the die, n, is at least 2 but otherwise unknown. Each face shows a different integer 0 or greater and less than n. The question arises: Which probability distributions based on the number of faces can be sampled with this oracle? This question was studied in the French-language dissertation of R. Duvignau (2015, section 5.2)[^15], and the following are four of these distributions.

Bernoulli 1/n. It's trivial to generate a Bernoulli variate that is 1 with probability 1/n and 0 otherwise: just take a number from the oracle and return either 1 if that number is 0, or 0 otherwise. Alternatively, take two numbers from the oracle and return either 1 if both are the same, or 0 otherwise (Duvignau 2015, p. 153)[^15].

Random variate with mean n. Likewise, it's trivial to generate variates with a mean of n: Do "Bernoulli 1/n" trials as described above until a trial returns 0, then return the number of trials done this way. (This is often called 1 plus a "geometric" random variate, and has a mean of n.)

Binomial with parameters n and 1/n. Using the oracle, the following algorithm generates a binomial variate of this kind (Duvignau 2015, Algorithm 20)[^15]:

1. Take items from the oracle until the same item is taken twice.
2. Create a list consisting of the items taken in step 1, except for the last item taken, then shuffle that list.
3. In the shuffled list, count the number of items that didn't change position after being shuffled, then return that number.

Binomial with parameters n and k/n. Duvignau 2015 also includes an algorithm (Algorithm 25) to generate a binomial variate of this kind using the oracle (where k is a known integer such that 0 < k and kn):

1. Take items from the oracle until k different items were taken this way. Let U be a list of these k items, in the order in which they were first taken.
2. Create an empty list L.
3. For each integer i in [0, k):
1. Create an empty list M.
2. Take an item from the oracle. If the item is in U at a position less than i (positions start at 0), repeat this substep. Otherwise, if the item is not in M, add it to M and repeat this substep. Otherwise, go to the next substep.
3. Shuffle the list M, then add to L each item that didn't change position after being shuffled (if not already present in L).
4. For each integer i in [0, k):
1. Let P be the item at position i in U.
2. Take an item from the oracle. If the item is in U at position i or less (positions start at 0), repeat this substep.
3. If the last item taken in the previous substep is in U at a position greater than i, add P to L (if not already present).
5. Return the number of items in L.

Note: Duvignau proved a result (Theorem 5.2) that answers the question: Which probability distributions based on the unknown n can be sampled with the oracle?[^16] The result applies to a family of (discrete) distributions with the same unknown parameter n, starting with either 1 or a greater integer. Let Supp(m) be the set of values taken on by the distribution with parameter equal to m. Then that family can be sampled using the oracle if and only if:

• There is a computable function f(k) that outputs a positive number.
• For each n, Supp(n) is included in Supp(n+1).
• For every k and for every n ≥ 2 starting with the first n for which k is in Supp(n), the probability of seeing k given parameter n is at least (1/n)f(k) (roughly speaking, the probability doesn't decay at a faster than polynomial rate as n increases).

### Pushdown Automata for Square-Root-Like Functions

A pushdown automaton is a state machine that keeps a stack of symbols. In this document, the input for this automaton is a stream of flips of a coin that shows heads with probability λ, and the output is 0 or 1 depending on which state the automaton ends up in when it empties the stack (Mossel and Peres 2005)[^17]. That paper shows that a pushdown automaton, as defined here, can simulate only algebraic functions, that is, functions that can be a solution of a nonzero polynomial equation. The appendix defines these machines in more detail and has proofs on which algebraic functions are possible with pushdown automata.

In this section, ${n \choose m}$ = choose($n$, $m$) is a binomial coefficient.

The following algorithm extends the square-root construction of Flajolet et al. (2010)[^18], takes an input coin with probability of heads λ, and returns 1 with probability—

$$f(\lambda)=\frac{1-\lambda}{\sqrt{1+4\lambda(g(\lambda))^2-1}} = (1-\lambda)\sum_{n\ge 0} \lambda^n (g(\lambda))^n (1-g(\lambda))^n {2n \choose n}$$ $$= (1-\lambda)\sum_{n\ge 0} (\lambda g(\lambda) (1-g(\lambda)))^n {2n \choose n},$$

and 0 otherwise, where:

• g(λ) is a Bernoulli factory function with the property $0\le g(\lambda)\lt 1$ whenever $0\le\lambda\le 1$. If g is a rational function (a ratio of two polynomials) whose coefficients are rational numbers, then f is algebraic and can be simulated by a pushdown automaton, as in the algorithm below. But this algorithm will still work even if g is not a rational function.

Equivalently— $$f(\lambda)=(1-\lambda) OGF(\lambda g(\lambda) (1-g(\lambda))),$$ where $OGF(x) = \sum_{n\ge 0} x^n {2n \choose n}$ is the algorithm's ordinary generating function (also known as counting generating function).

1. Set d to 0.
2. Do the following process repeatedly until this run of the algorithm returns a value:
1. Flip the input coin. If it returns 1, go to the next substep. Otherwise, return either 1 if d is 0, or 0 otherwise.
2. Run a Bernoulli factory algorithm for g(λ). If the run returns 1, add 1 to d. Otherwise, subtract 1 from d. Do this substep again.

As a pushdown automaton, this algorithm (except the "Do this substep again" part) can be expressed as follows. Let the stack have the single symbol EMPTY, and start at the state POS-S1. Based on the current state, the last coin flip (HEADS or TAILS), and the symbol on the top of the stack, set the new state and replace the top stack symbol with zero, one, or two symbols. These transition rules can be written as follows:

• (POS-S1, HEADS, topsymbol) → (POS-S2, {topsymbol}) (set state to POS-S2, keep topsymbol on the stack).
• (NEG-S1, HEADS, topsymbol) → (NEG-S2, {topsymbol}).
• (POS-S1, TAILS, EMPTY) → (ONE, {}) (set state to ONE, pop the top symbol from the stack).
• (NEG-S1, TAILS, EMPTY) → (ONE, {}).
• (POS-S1, TAILS, X) → (ZERO, {}).
• (NEG-S1, TAILS, X) → (ZERO, {}).
• (ZERO, flip, topsymbol) → (ZERO, {}).
• (POS-S2, flip, topsymbol) → Add enough transition rules to the automaton to simulate g(λ) by a finite-state machine (only possible if g is rational with rational coefficients (Mossel and Peres 2005)[^17]). Transition to POS-S2-ZERO if the machine outputs 0, or POS-S2-ONE if the machine outputs 1.
• (NEG-S2, flip, topsymbol) → Same as before, but the transitioning states are NEG-S2-ZERO and NEG-S2-ONE, respectively.
• (POS-S2-ONE, flip, topsymbol) → (POS-S1, {topsymbol, X}) (replace top stack symbol with topsymbol, then push X to the stack).
• (POS-S2-ZERO, flip, EMPTY) → (NEG-S1, {EMPTY, X}).
• (POS-S2-ZERO, flip, X) → (POS-S1, {}).
• (NEG-S2-ZERO, flip, topsymbol) → (NEG-S1, {topsymbol, X}).
• (NEG-S2-ONE, flip, EMPTY) → (POS-S1, {EMPTY, X}).
• (NEG-S2-ONE, flip, X) → (NEG-S1, {}).

The machine stops when it removes EMPTY from the stack, and the result is either ZERO (0) or ONE (1).

For the following algorithm, which extends the end of Note 1 of the Flajolet paper, the probability is— $$f(\lambda)=(1-\lambda) \sum_{n\ge 0} \lambda^{Hn} g(\lambda)^n (1-g(\lambda))^{Hn-n} {Hn \choose n},$$ where H ≥ 2 is an integer; and g has the same meaning as earlier.

1. Set d to 0.
2. Do the following process repeatedly until this run of the algorithm returns a value:
1. Flip the input coin. If it returns 1, go to the next substep. Otherwise, return either 1 if d is 0, or 0 otherwise.
2. Run a Bernoulli factory algorithm for g(λ). If the run returns 1, add (H−1) to d. Otherwise, subtract 1 from d. (Note: This substep is not done again.)

The following algorithm simulates the probability— $$f(\lambda) = (1-\lambda) \sum_{n\ge 0} \lambda^n \left( \sum_{m\ge 0} W(n,m) g(\lambda)^m (1-g(\lambda))^{n-m} {n \choose m}\right)$$ $$= (1-\lambda) \sum_{n\ge 0} \lambda^n \left( \sum_{m\ge 0} V(n,m) g(\lambda)^m (1-g(\lambda))^{n-m}\right),$$ where g has the same meaning as earlier; W(n, m) is 1 if m*H equals (nm)*T, or 0 otherwise; and H≥1 and T≥1 are integers. (In the first formula, the sum in parentheses is a polynomial in Bernstein form, in the variable g(λ) and with only zeros and ones as coefficients. Because of the λn, the polynomial gets smaller as n gets larger. V(n, m) is the number of n-letter words that have m heads and describe a walk that ends at the beginning.)

1. Set d to 0.
2. Do the following process repeatedly until this run of the algorithm returns a value:
1. Flip the input coin. If it returns 1, go to the next substep. Otherwise, return either 1 if d is 0, or 0 otherwise.
2. Run a Bernoulli factory algorithm for g(λ). If the run returns 1 ("heads"), add H to d. Otherwise ("tails"), subtract T from d. (Note: This substep is not done again.)

### ln(c+λ)/(c+λ)

In this algorithm, c must be an integer 1 or greater, and λ is the unknown heads probability of a coin.

1. Run the algorithm for d / (c + λ), with d=1 and c=c, repeatedly, until the run returns 1, then set g to the number of runs that returned 0 this way.
2. If g is 0, return 0. Otherwise, return a number that is 1 with probability 1/g or 0 otherwise. (Here, returning 0 means that the von Neumann schema would require another iteration; see the note.)

Note: This algorithm is based on the von Neumann schema with the single-cycle permutation class. In this case, given a coin that shows heads with probability z, the schema will terminate in one iteration with probability (1−z)*ln(1/(1−z)). Thus, if the coin shows heads with probability 1 − z, the one-iteration probability is z*ln(1/z), so if the coin shows heads with probability 1 − 1/(m+z), the one-iteration probability is (1/(m+z))*ln(1/(1/(m+z))) = ln(m+z)/(m+z).

### Examples for the von Neumann schema

Examples contained in Theorem 2.3 of Flajolet et al. (2010)[^18]. In the table:

• λ is the unknown heads probability of a coin.
• μ is another coin that flips the λ coin and returns 1 minus the result (thus simulating 1 − λ).
• "Domain" is the set of values λ can take on.
Function Domain Algorithm
exp(−λ) [0, 1) Uses von Neumann schema algorithm (VNS) with sorted permutations, and the λ coin. Return 1 if VNS returns 0, and 0 otherwise.
exp(λ − 1) = exp(−(1 − λ)) (0, 1] Uses VNS with sorted permutations, and the μ coin. Return 1 if VNS returns 0, and 0 otherwise.
(1−λ)*exp(λ) [0, 1) Uses VNS with sorted permutations, and the λ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.
λ*exp(1−λ) (0, 1] Uses VNS with sorted permutations, and the μ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.
λ/ln(1/(1−λ)) [0, 1) Uses VNS with single-cycle permutations, and the λ coin. Return 1 if VNS returns 1, and 0 otherwise.
(1−λ)/ln(1/λ) (0, 1] Uses VNS with single-cycle permutations, and the μ coin. Return 1 if VNS returns 1, and 0 otherwise.
(1−λ)*ln(1/(1−λ)) [0, 1) Uses VNS with single-cycle permutations, and the λ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.
λ*ln(1/λ) (0, 1] Uses VNS with single-cycle permutations, and the μ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.
cos(λ) [0, 1) Uses VNS with alternating even-sized permutations, and the λ coin. Return 1 if VNS returns 0, and 0 otherwise.
(1−λ)/cos(λ) = (1−λ)*sec(λ) [0, 1) Uses VNS with alternating even-sized permutations, and the λ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.
λ/tan(λ) [0, 1) Uses VNS with alternating odd-sized permutations, and the λ coin. Return 1 if VNS returns 1, and 0 otherwise.
(1−λ)*tan(λ) [0, 1) Uses VNS with alternating odd-sized permutations, and the λ coin. Return 1 if VNS finishes in one iteration, and 0 otherwise.

### 1/(1+(m+λ)2)

This is a rational function (ratio of two polynomials) with variable λ, and this rational function admits the following algorithm. In this algorithm, m must be an integer 0 or greater, and λ is the unknown heads probability of a coin.

1. Let d be the three-item list [1, 2, 1] (for numerator 1). Let e be the three-item list [1+m2, 2*(1+m2+m), 1+m2+2*m+1] (for denominator). Find the highest number in e, then divide each item in d and in e by that number (using rational arithmetic).
2. Run the first algorithm for rational functions in "Bernoulli Factory Algorithms", with n = 2, and with d and e given above.

## Irrational Probabilities

### Certain Numbers Based on the Golden Ratio

The following algorithm given by Fishman and Miller (2013)[^19] finds the continued fraction expansion of certain numbers described as—

• G(m, ) = (m + sqrt(m2 + 4 * ))/2
or (m − sqrt(m2 + 4 * ))/2,

whichever results in a real number greater than 1, where m is a positive integer and is either 1 or −1. In this case, G(1, 1) is the golden ratio.

First, define the following operations:

• Get the previous and next Fibonacci-based number given k, m, and :
1. If k is 0 or less, return an error.
2. Set g0 to 0, g1 to 1, x to 0, and y to 0.
3. Do the following k times: Set y to m * g1 + * g0, then set x to g0, then set g0 to g1, then set g1 to y.
4. Return x and y, in that order.
• Get the partial denominator given pos, k, m, and (this partial denominator is part of the continued fraction expansion found by Fishman and Miller):
1. Get the previous and next Fibonacci-based number given k, m, and , call them p and n, respectively.
2. If is 1 and k is odd[^7], return p + n.
3. If is −1 and pos is 0, return np − 1.
4. If is 1 and pos is 0, return (n + p) − 1.
5. If is −1 and pos is even[^10], return np − 2. (The paper had an error here; the correction given here was verified by Miller via personal communication.)
6. If is 1 and pos is even[^10], return (n + p) − 2.
7. Return 1.

An application of the continued fraction algorithm is the following algorithm that generates 1 with probability G(m, )k and 0 otherwise, where k is an integer that is 1 or greater (see "Continued Fractions" in my page on Bernoulli factory algorithms). The algorithm starts with pos = 0, then the following steps are taken:

1. Get the partial denominator given pos, k, m, and , call it kp.
2. Do the following process repeatedly, until this run of the algorithm returns a value:
1. With probability kp/(1 + kp), return a number that is 1 with probability 1/kp and 0 otherwise.
2. Do a separate run of the currently running algorithm, but with pos = pos + 1. If the separate run returns 1, return 0.

### Ratio of Lower Gamma Functions (γ(m, x)/γ(m, 1)).

1. Set ret to the result of kthsmallest with the two parameters m and m. (Thus, ret is distributed as u1/m where u is a uniform random variate greater than 0 and less than 1; although kthsmallest accepts only integers, this formula works for every m greater than 0.)
2. Set k to 1, then set u to point to the same value as ret.
3. Generate a uniform(0, 1) random variate v.
4. If v is less than u: Set u to v, then add 1 to k, then go to step 3.
5. If k is odd[^7], return a number that is 1 if ret is less than x and 0 otherwise. (If ret is implemented as a uniform partially-sampled random number (PSRN), this comparison should be done via URandLessThanReal.) If k is even[^10], go to step 1.

Derivation: See Formula 1 in the section "Probabilities Arising from Certain Permutations", where:

• ECDF(x) is the probability that a uniform random variate greater than 0 and less than 1 is x or less, namely x if x is in [0, 1], 0 if x is less than 0, and 1 otherwise.
• DPDF(x) is the probability density function for the maximum of m uniform random variates in [0, 1], namely m*xm−1 if x is in [0, 1], and 0 otherwise.

### 1/(exp(1) + c − 2)

Involves the continued fraction expansion and Bernoulli Factory algorithm 3 for continued fractions. In this algorithm, c≥1 is a rational number.

The algorithm begins with pos equal to 1. Then the following steps are taken.

• Do the following process repeatedly until this run of the algorithm returns a value:
1. If pos is divisible by 3 (that is, if rem(pos, 3) equals 0): Let k be (pos/3)*2. With probability k/(1+k), return a number that is 1 with probability 1/k and 0 otherwise.
2. If pos is 1: With probability c/(1+c), return a number that is 1 with probability 1/c and 0 otherwise.
3. If pos is greater than 1 and not divisible by 3: Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), return 1.
4. Do a separate run of the currently running algorithm, but with pos = pos + 1. If the separate run returns 1, return 0.

### exp(1) − 2

Involves the continued fraction expansion and Bernoulli Factory algorithm 3 for continued fractions. Run the algorithm for 1/(exp(1)+c−2) above with c = 1, except the algorithm begins with pos equal to 2 rather than 1 (because the continued fractions are almost the same).

### Euler–Mascheroni Constant γ

As I learned, the fractional part of 1/U, where U is a uniform random variate in (0, 1), has a mean equal to 1 minus the Euler–Mascheroni constant γ, about 0.5772.[^20] This leads to the following algorithm to sample a probability equal to γ:

1. Generate a PSRN for the reciprocal of a uniform random variate, as described in another page of mine.
2. Set the PSRN's integer part to 0, then run SampleGeometricBag on that PSRN. Return 0 if the run returns 1, or 1 otherwise.

### π/4

The following algorithm to sample the probability π/4 is based on the section "Uniform Distribution Inside N-Dimensional Shapes", especially its Note 5.

1. Set S to 2. Then set c1 and c2 to 0.
2. Do the following process repeatedly, until the algorithm returns a value:
1. Set c1 to 2*c1 plus an unbiased random bit (either 0 or 1 with equal probability). Then, set c2 to 2*c2 plus an unbiased random bit.
2. If ((c1+1)2 + (c2+1)2) < S2, return 1. (Point is inside the quarter disk, whose area is π/4.)
3. If ((c1)2 + (c2)2) > S2, return 0. (Point is outside the quarter disk.)
4. Multiply S by 2.

### π/4 − 1/2 or (π − 2)/4

Follows the π/4 algorithm, except it samples from a quarter disk with an area equal to 1/2 removed.

1. Set S to 2. Then set c1 and c2 to 0.
2. Do the following process repeatedly, until the algorithm returns a value:
1. Set c1 to 2*c1 plus an unbiased random bit (either 0 or 1 with equal probability). Then, set c2 to 2*c2 plus an unbiased random bit.
2. Set diamond to MAYBE and disk to MAYBE.
3. If ((c1+1) + (c2+1)) < S, set diamond to YES.
4. If ((c1) + (c2)) > S, set diamond to NO.
5. If ((c1+1)2 + (c2+1)2) < S2, set disk to YES.
6. If ((c1)2 + (c2)2) > S2, set disk to NO.
7. If disk is YES and diamond is NO, return 1. Otherwise, if diamond is YES or disk is NO, return 0.
8. Multiply S by 2.

### (π − 3)/4

Follows the π/4 algorithm, except it samples from a quarter disk with enough boxes removed from it to total an area equal to 3/4.

1. Set S to 32. Then set c1 to a uniform random integer in the half-open interval [0, S) and c2 to another uniform random integer in that interval.
2. (Retained boxes.) If c1 is 0 and c2 is 0, or if c1 is 0 and c2 is 1, return 1.
3. (Removed boxes.) If ((c1+1)2 + (c2+1)2) < 1024, return 0.
4. Multiply S by 2.
5. (Sample the modified quarter disk.) Do the following process repeatedly, until the algorithm returns a value:
1. Set c1 to 2*c1 plus an unbiased random bit (either 0 or 1 with equal probability). Then, set c2 to 2*c2 plus an unbiased random bit.
2. If ((c1+1)2 + (c2+1)2) < S2, return 1. (Point is inside the quarter disk, whose area is π/4.)
3. If ((c1)2 + (c2)2) > S2, return 0. (Point is outside the quarter disk.)
4. Multiply S by 2.

### π − 3

Similar to the π/4 algorithm. First it samples a point inside an area covering 1/4 of the unit square, then inside that area, it determines whether that point is inside another area covering (π − 3)/4 of the unit square. Thus, the algorithm acts as though it samples ((π − 3)/4) / (1/4) = π − 3.

1. Set S to 2. Then set c1 and c2 to 0.
2. Do the following process repeatedly, until the algorithm aborts it or returns a value:
1. Set S to 32. Then set c1 to a uniform random integer in the half-open interval [0, S) and c2 to another uniform random integer in [0, S).
2. (Return 1 if in retained boxes.) If c1 is 0 and c2 is 0, or if c1 is 0 and c2 is 1, return 1.
3. (Check if outside removed boxes.) If ((c1+1)2 + (c2+1)2) >= 1024, abort this process and go to step 3. (Otherwise, c1 and c2 are rejected and this process continues.)
3. Set S to 64.
4. (Sample the modified quarter disk.) Do the following process repeatedly, until the algorithm returns a value:
1. Set c1 to 2*c1 plus an unbiased random bit (either 0 or 1 with equal probability). Then, set c2 to 2*c2 plus an unbiased random bit.
2. If ((c1+1)2 + (c2+1)2) < S2, return 1. (Point is inside the quarter disk, whose area is π/4.)
3. If ((c1)2 + (c2)2) > S2, return 0. (Point is outside the quarter disk.)
4. Multiply S by 2.

Note: Only a limited set of (c1, c2) pairs, including (0, 0) and (0, 1), will pass step 2 of this algorithm. Thus it may be more efficient to choose one of them uniformly at random, rather than do step 2 as shown. If (0, 0) or (0, 1) is chosen this way, the algorithm returns 1.

### 4/(3*π)

Given that the point (x, y) has positive coordinates and lies inside a disk of radius 1 centered at (0, 0), the mean value of x is 4/(3*π). This leads to the following algorithm to sample that probability:

1. Generate two PSRNs in the form of a uniformly chosen point inside a 2-dimensional quarter hypersphere (see "Uniform Distribution Inside N-Dimensional Shapes" below, as well as the examples).
2. Let x be one of those PSRNs. Run SampleGeometricBag on that PSRN and return the result (which will be either 0 or 1).

Note: The mean value 4/(3*π) can be derived as follows. The relative probability that x is "close" to z is p(z) = sqrt(1 − z*z), where z is in the interval [0, 1]. Now find the area under the graph of z*p(z)/c (where c=π/4 is the area under the graph of p(z)). The result is the mean value 4/(3*π). The following Python code prints this mean value using the SymPy computer algebra library: p=sqrt(1-z*z); c=integrate(p,(z,0,1)); print(integrate(z*p/c,(z,0,1)));.

### ln(π)/π

Special case of the algorithm for ln(c+λ)/(c+λ).

1. Run the algorithm for 1/π repeatedly, until the run returns 1, then set g to the number of runs that returned 0 this way.
2. If g is 0, return 0. Otherwise, return a number that is 1 with probability 1/g or 0 otherwise.

### (1 + exp(k)) / (1 + exp(k + 1))

This algorithm simulates this probability by computing lower and upper bounds of exp(1), which improve as more and more digits are calculated. These bounds are calculated through an algorithm by Citterio and Pavani (2016)[^21]. Note the use of the methodology in Łatuszyński et al. (2009/2011, algorithm 2)[^22] in this algorithm. In this algorithm, k must be an integer 0 or greater.

1. If k is 0, run the algorithm for 2 / (1 + exp(2)) and return the result. If k is 1, run the algorithm for (1 + exp(1)) / (1 + exp(2)) and return the result.
2. Generate a uniform(0, 1) random variate, call it ret.
3. If k is 3 or greater, return 0 if ret is greater than 38/100, or 1 if ret is less than 36/100. (This is an early return step. If ret is implemented as a uniform PSRN, these comparisons should be done via the URandLessThanReal algorithm, which is described in my article on PSRNs.)
4. Set d to 2.
5. Calculate a lower and upper bound of exp(1) (LB and UB, respectively) in the form of rational numbers whose numerator has at most d digits, using the Citterio and Pavani algorithm. For details, see later.
6. Set rl to (1+LBk) / (1+UBk + 1), and set ru to (1+UBk) / (1+LBk + 1); both these numbers should be calculated using rational arithmetic.
7. If ret is greater than ru, return 0. If ret is less than rl, return 1. (If ret is implemented as a uniform PSRN, these comparisons should be done via URandLessThanReal.)
8. Add 1 to d and go to step 5.

The following implements the parts of Citterio and Pavani's algorithm needed to calculate lower and upper bounds for exp(1) in the form of rational numbers.

Define the following operations:

• Setup: Set p to the list [0, 1], set q to the list [1, 0], set a to the list [0, 0, 2] (two zeros, followed by the integer part for exp(1)), set v to 0, and set av to 0.
• Ensure n: While v is less than or equal to n:
1. (Ensure partial denominator v, starting from 0, is available.) If v + 2 is greater than or equal to the size of a, append 1, av, and 1, in that order, to the list a, then add 2 to av.
2. (Calculate convergent v, starting from 0.) Append a[n+2] * p[n+1]+p[n] to the list p, and append a[n+2] * q[n+1]+q[n] to the list q. (Positions in lists start at 0. For example, p means the first item in p; p means the second; and so on.)
3. Add 1 to v.
• Get the numerator for convergent n: Ensure n, then return p[n+2].
• Get convergent n: Ensure n, then return p[n+2]/q[n+2].
• Get semiconvergent n given d:
1. Ensure n, then set m to floor(((10d)−1−p[n+1])/p[n+2]).
2. Return (p[n+2] * m +p[n+1]) / (q[n+2] * m +q[n+1]).

Then the algorithm to calculate lower and upper bounds for exp(1), given d, is as follows:

1. Set i to 0, then run the setup.
2. Get the numerator for convergent i, call it c. If c is less than 10d, add 1 to i and repeat this step. Otherwise, go to the next step.
3. Get convergent i − 1 and get semiconvergent i − 1 given d, call them conv and semi, respectively.
4. If (i − 1) is odd[^7], return semi as the lower bound and conv as the upper bound. Otherwise, return conv as the lower bound and semi as the upper bound.

### Other Probabilities

Algorithms in bold are given either in this page or in the "Bernoulli Factory Algorithms" page.

To simulate: Follow this algorithm:
3 − exp(1) Run the algorithm for exp(1) − 2, then return 1 minus the result.
1/(exp(1)−1) Run the algorithm for 1/(exp(1)+c−2) with c = 1.
1/(1+exp(1)) Run the algorithm for 1/(exp(1)+c−2) with c = 3.
n/π (n is 1, 2, or 3.)
Create λ coin for algorithm π − 3.
Run algorithm for d / (c + λ) with d=n and c=3.
r/π (r is a rational number greater than 0 and less than 3.)
Create λ coin for algorithm π − 3.
Create μ coin that does: "With probability r − floor(r), return 1; otherwise return 0."
Run algorithm for (d + μ) / (c + λ) with d=floor(r) and c=3.
exp(1)/π Create μ coin for algorithm exp(1) − 2.
Create λ coin for algorithm π − 3.
Run algorithm for (d + μ) / (c + λ) with d=2 and c=3.
exp(1)/4 Follow the algorithm for exp(λ/4)/2, except the probability in step 2 is 2n−1/(n!), c is 0, and step 3 is replaced with "Return 1."
r*λr + r*exp(−λ) (r is a rational number greater than 0, but not greater than 2. λ is the unknown heads probability of a coin.)
Run the general martingale algorithm with $g(\lambda) = \lambda$, and with $d_0 = r/2$ and coefficients $a_i = \frac{r}{(i)!} (-1)^i$ if $i\ge 2$ and $a_i=0$ otherwise.
n*exp(−1) = n/exp(1) (n is 1 or 2.)
Create λ coin for algorithm exp(1) − 2.
Run algorithm for d / (c + λ) with d=n and c=2.
r*exp(−1) = r/exp(1) (r is a rational number greater than 0, but not less than 2.)
Run algorithm for c*λc + c*exp(−λ) with r=r and λ being a coin that always returns 1.
λ/(2−λ) = (λ/2)/(1−(λ/2)) (λ is the unknown heads probability of a coin.)
(1) Flip λ coin; return 0 if it returns 0.
(2) Run algorithm for 1/(2−λ).
(1−λ)/(1+λ) (λ is the unknown heads probability of a coin.)
(1) Flip λ coin; return 0 if it returns 1.
(2) Run algorithm for d / (c + λ) with d=1 and c=1.

## General Arbitrary-Precision Samplers

### Uniform Distribution Inside N-Dimensional Shapes

The following is a general way to describe an arbitrary-precision sampler for generating a point uniformly at random inside a geometric shape located entirely in the hypercube [0, d1]×[0, d2]×...×[0,dN] in N-dimensional space, where d1, ..., dN are integers greater than 0. The algorithm will generally work if the shape is reasonably defined; the technical requirements are that the shape must have a zero-volume (Lebesgue measure zero) boundary and a nonzero finite volume, and must assign zero probability to any zero-volume subset of it (such as a set of individual points).

The sampler's description has the following skeleton.

1. Generate N empty uniform partially-sampled random numbers (PSRNs), with a positive sign, an integer part of 0, and an empty fractional part. Call the PSRNs p1, p2, ..., pN.
2. Set S to base, where base is the base of digits to be stored by the PSRNs (such as 2 for binary or 10 for decimal). Then set N coordinates to 0, call the coordinates c1, c2, ..., cN. Then set d to 1. Then, for each coordinate (c1, ..., cN), set that coordinate to an integer in [0, dX), chosen uniformly at random, where dX is the corresponding dimension's size.
3. For each coordinate (c1, ..., cN), multiply that coordinate by base and add a digit chosen uniformly at random to that coordinate.
4. This step uses a function known as InShape, which takes the coordinates of a box and returns one of three values: YES if the box is entirely inside the shape; NO if the box is entirely outside the shape; and MAYBE if the box is partly inside and partly outside the shape, or if the function is unsure. InShape, as well as the divisions of the coordinates by S, should be implemented using rational arithmetic. Instead of dividing those coordinates this way, an implementation can pass S as a separate parameter to InShape. See the appendix for further implementation notes. In this step, run InShape using the current box, whose coordinates in this case are ((c1/S, c2/S, ..., cN/S), ((c1+1)/S, (c2+1)/S, ..., (cN+1)/S)).
5. If the result of InShape is YES, then the current box was accepted. If the box is accepted this way, then at this point, c1, c2, etc., will each store the d digits of a coordinate in the shape, expressed as a number in the interval [0, 1], or more precisely, a range of numbers. (For example, if base is 10, d is 3, and c1 is 342, then the first coordinate is 0.342..., or more precisely, a number in the interval [0.342, 0.343].) In this case, do the following:
1. For each coordinate (c1, ..., cN), transfer that coordinate's least significant digits to the corresponding PSRN's fractional part. The variable d tells how many digits to transfer to each PSRN this way. Then, for each coordinate (c1, ..., cN), set the corresponding PSRN's integer part to floor(cX/based), where cX is that coordinate. (For example, if base is 10, d is 3, and c1 is 7342, set p1's fractional part to [3, 4, 2] and p1's integer part to 7.)
2. For each PSRN (p1, ..., pN), optionally fill that PSRN with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag).
3. For each PSRN, optionally do the following: Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), set that PSRN's sign to negative. (This will result in a symmetric shape in the corresponding dimension. This step can be done for some PSRNs and not others.)
4. Return the PSRNs p1, ..., pN, in that order.
6. If the result of InShape is NO, then the current box lies outside the shape and is rejected. In this case, go to step 2.
7. If the result of InShape is MAYBE, it is not known whether the current box lies fully inside or fully outside the shape, so multiply S by base, then add 1 to d, then go to step 3.

Notes:

1. See (Li and El Gamal 2016)[^23] and (Oberhoff 2018)[^24] for related work on encoding random points uniformly distributed in a shape.
2. Rejection sampling on a shape is subject to the "curse of dimensionality", since typical shapes of high dimension will tend to cover much less volume than their bounding boxes, so that it would take a lot of time on average to accept a high-dimensional box. Moreover, the more area the shape takes up in the bounding box, the higher the acceptance rate.
3. Devroye (1986, chapter 8, section 3)[^25] describes grid-based methods to optimize random point generation. In this case, the space is divided into a grid of boxes each with size 1/basek in all dimensions; the result of InShape is calculated for each such box and that box labeled with the result; all boxes labeled NO are discarded; and the algorithm is modified by adding the following after step 2: "2a. Choose a precalculated box uniformly at random, then set c1, ..., cN to that box's coordinates, then set d to k and set S to basek. If a box labeled YES was chosen, follow the substeps in step 5. If a box labeled MAYBE was chosen, multiply S by base and add 1 to d." (For example, if base is 10, k is 1, N is 2, and d1 = d2 = 1, the space could be divided into a 10×10 grid, made up of 100 boxes each of size (1/10)×(1/10). Then, InShape is precalculated for the box with coordinates ((0, 0), (1, 1)), the box ((0, 1), (1, 2)), and so on [the boxes' coordinates are stored as just given, but InShape instead uses those coordinates divided by basek, or 101 in this case], each such box is labeled with the result, and boxes labeled NO are discarded. Finally the algorithm above is modified as just given.)
4. Besides a grid, another useful data structure is a mapped regular paving (Harlow et al. 2012)[^26], which can be described as a binary tree with nodes each consisting of zero or two child nodes and a marking value. Start with a box that entirely covers the desired shape. Calculate InShape for the box. If it returns YES or NO then mark the box with YES or NO, respectively; otherwise it returns MAYBE, so divide the box along its first widest coordinate into two sub-boxes, set the parent box's children to those sub-boxes, then repeat this process for each sub-box (or if the nesting level is too deep, instead mark each sub-box with MAYBE). Then, to generate a random point (with a base-2 fractional part), start from the root, then: (1) If the box is marked YES, return a uniform random point between the given coordinates using the RandUniformInRange algorithm; or (2) if the box is marked NO, start over from the root; or (3) if the box is marked MAYBE, get the two child boxes bisected from the box, choose one of them with equal probability (for example, choose the left child if an unbiased random bit is 0, or the right child otherwise), mark the chosen child with the result of InShape for that child, and repeat this process with that child; or (4) the box has two child boxes, so choose one of them with equal probability and repeat this process with that child.
5. The algorithm can be adapted to return 1 with probability equal to its acceptance rate (which equals the shape's volume divided by the hyperrectangle's volume), and return 0 with the opposite probability. In this case, replace steps 5 and 6 with the following: "5. If the result of InShape is YES, return 1.; 6. If the result of InShape is NO, return 0." (I thank BruceET of the Cross Validated community for leading me to this insight.)

Examples:

• The following example generates a point inside a quarter diamond (centered at (0, ..., 0), "radius" k where k is an integer greater than 0): Let d1, ..., dN be k. Let InShape return YES if ((c1+1) + ... + (cN+1)) < S*k; NO if (c1 + ... + cN) > S*k; and MAYBE otherwise. For N=2, the acceptance rate (see note 5) is 1/2. For a full diamond, step 5.3 in the algorithm is done for each of the N dimensions.
• The following example generates a point inside a quarter hypersphere (centered at (0, ..., 0), radius k where k is an integer greater than 0): Let d1, ..., dN be k. Let InShape return YES if ((c1+1)2 + ... + (cN+1)2) < (S*k)2; NO if (c12 + ... + cN2) > (S*k)2; and MAYBE otherwise. For N=2, the acceptance rate (see note 5) is π/4. For a full hypersphere with radius 1, step 5.3 in the algorithm is done for each of the N dimensions. In the case of a 2-dimensional disk, this algorithm thus adapts the well-known rejection technique of generating X and Y coordinates until X2+Y2 < 1 (for example, (Devroye 1986, p. 230 et seq.)[^25]).
• The following example generates a point inside a quarter astroid (centered at (0, ..., 0), radius k where k is an integer greater than 0): Let d1, ..., dN be k. Let InShape return YES if ((skc1−1)2 + ... + (skcN−1)2) > sk2; NO if ((skc1)2 + ... + (skcN)2) < sk2; and MAYBE otherwise, where sk = S*k. For N=2, the acceptance rate (see note 5) is 1 − π/4. For a full astroid, step 5.3 in the algorithm is done for each of the N dimensions.

### Building an Arbitrary-Precision Sampler

If a probability distribution—

• has a probability density function (PDF), or a function proportional to the PDF, with a known symbolic form,
• has a cumulative distribution function (CDF) with a known symbolic form,
• takes on only values 0 or greater, and
• has a PDF that has an infinite tail to the right, is less than or equal to a finite number (so that PDF(0) is other than infinity), and is strictly decreasing,

it may be possible to describe an arbitrary-precision sampler for that distribution. Such a description has the following skeleton.

1. With probability A, set intval to 0, then set size to 1, then go to step 4.
• A is calculated as (CDF(1) − CDF(0)) / (1−CDF(0)), where CDF is the distribution's CDF. This should be found analytically using a computer algebra system such as SymPy.
• The symbolic form of A will help determine which Bernoulli factory algorithm, if any, will simulate the probability; if a Bernoulli factory exists, it should be used.
2. Set intval to 1 and set size to 1.
3. With probability B(size, intval), go to step 4. Otherwise, add size to intval, then multiply size by 2, then repeat this step.
• This step chooses an interval beyond 1, and grows this interval by geometric steps, so that an appropriate interval is chosen with the correct probability.
• The probability B(size, intval) is the probability that the interval is chosen given that the previous intervals weren't chosen, and is calculated as (CDF(size + intval) − CDF(intval)) / (1−CDF(intval)). This should be found analytically using a computer algebra system such as SymPy.
• The symbolic form of B will help determine which Bernoulli factory algorithm, if any, will simulate the probability; if a Bernoulli factory exists, it should be used.
4. Generate an integer in the interval [intval, intval + size) uniformly at random, call it i.
5. Create ret, a uniform PSRN with a positive sign and an integer part of 0.
6. Create an input coin that calls SampleGeometricBag on the PSRN ret. Run a Bernoulli factory algorithm that simulates the probability C(i, λ), using the input coin (here, λ is the probability built up in ret via SampleGeometricBag, and lies in the interval [0, 1]). If the call returns 0, go to step 4.
• The probability C(i, λ) is calculated as PDF(i + λ) / M, where PDF is the distribution's PDF or a function proportional to the PDF, and should be found analytically using a computer algebra system such as SymPy.
• In this formula, M is any convenient number in the interval [PDF(intval), max(1, PDF(intval))], and should be as low as feasible. M serves to ensure that C is as close as feasible to 1 (to improve acceptance rates), but no higher than 1. The choice of M can vary for each interval (each value of intval, which can only be 0, 1, or a power of 2). Any such choice for M preserves the algorithm's correctness because the PDF has to be strictly decreasing and a new interval isn't chosen when λ is rejected.
• The symbolic form of C will help determine which Bernoulli factory algorithm, if any, will simulate the probability; if a Bernoulli factory exists, it should be used.
7. The PSRN ret was accepted, so set ret's integer part to i, then optionally fill ret with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then return ret.

Examples of algorithms that use this skeleton are the algorithm for the ratio of two uniform random variates, as well as the algorithms for the Rayleigh distribution and for the reciprocal of power of uniform, both given later.

Perhaps the most difficult part of describing an arbitrary-precision sampler with this skeleton is finding the appropriate Bernoulli factory for the probabilities A, B, and C, especially when these probabilities have a non-trivial symbolic form.

Note: The algorithm skeleton uses ideas similar to the inversion-rejection method described in (Devroye 1986, ch. 7, sec. 4.6)[^25]; an exception is that instead of generating a uniform random variate and comparing it to calculations of a CDF, this algorithm uses conditional probabilities of choosing a given piece, probabilities labeled A and B. This approach was taken so that the CDF of the distribution in question is never directly calculated in the course of the algorithm, which furthers the goal of sampling with arbitrary precision and without using floating-point arithmetic.

### Mixtures

A mixture involves sampling one of several distributions, where each distribution has a separate probability of being sampled. In general, an arbitrary-precision sampler is possible if all of the following conditions are met:

• There is a finite number of distributions to choose from.
• The probability of sampling each distribution is a rational number, or it can be expressed as a function for which a Bernoulli factory algorithm exists.
• For each distribution, an arbitrary-precision sampler exists.

Example: One example of a mixture is two beta distributions, with separate parameters. One beta distribution is chosen with probability exp(−3) (a probability for which a Bernoulli factory algorithm exists) and the other is chosen with the opposite probability. For the two beta distributions, an arbitrary-precision sampling algorithm exists (see my article on partially-sampled random numbers (PSRNs) for details).

### Weighted Choice Involving PSRNs

Given n uniform PSRNs, called weights, with labels starting from 0 and ending at n−1, the following algorithm chooses an integer in [0, n) with probability proportional to its weight. Each weight's sign must be positive.

1. Create an empty list, then for each weight starting with weight 0, add the weight's integer part plus 1 to that list. For example, if the weights are [2.22...,0.001...,1.3...], in that order, the list will be [3, 1, 2], corresponding to integers 0, 1, and 2, in that order. Call the list just created the rounded weights list.
2. Choose an integer i with probability proportional to the weights in the rounded weights list. This can be done, for example, by taking the result of WeightedChoice(list), where list is the rounded weights list and WeightedChoice is given in "Randomization and Samping Methods".
3. Run URandLessThanReal(w, rw), where w is the original weight for integer i, and rw is the rounded weight for integer i in the rounded weights list. That algorithm returns 1 if w turns out to be less than rw. If the result is 1, return i. Otherwise, go to step 2.

### Cumulative Distribution Functions

Suppose a real number z is given (which might be a uniform PSRN or a rational number). If a probability distribution—

• has a probability density function (PDF) (as with the normal or exponential distribution), and
• has an arbitrary-precision sampler that returns a uniform PSRN X,

then it's possible to generate 1 with the same probability as the sampler returns an X that is less than or equal to z, as follows:

1. Run the arbitrary-precision sampler to generate X, a uniform PSRN.
2. Run URandLess (if z is a uniform PSRN) or URandLessThanReal (if z is a real number) with parameters X and z, in that order, and return the result.

Specifically, the probability of returning 1 is the cumulative distribution function (CDF) for the distribution of X.

Notes:

1. Although step 2 of the algorithm checks whether X is merely less than z, this is still correct; because the distribution of X has a PDF, X is less than z with the same probability as X is less than or equal to z.
2. All probability distributions have a CDF, not just those with a PDF, but also discrete ones such as Poisson or binomial.

## Specific Arbitrary-Precision Samplers

### Rayleigh Distribution

The following is an arbitrary-precision sampler for the Rayleigh distribution with parameter s, which is a rational number greater than 0.

1. Set k to 0, and set y to 2 * s * s.
2. With probability exp(−(k * 2 + 1)/y), go to step 3. Otherwise, add 1 to k and repeat this step. (The probability check should be done with the exp(−x/y) algorithm in "Bernoulli Factory Algorithms", with x/y = (k * 2 + 1)/y.)
3. (Now, the piece located at [k, k + 1) is sampled.) Create a positive-sign zero-integer-part uniform PSRN, and create an input coin that returns the result of SampleGeometricBag on that uniform PSRN.
4. Set ky to k * k / y.
5. (Now simulating exp(−U2/y), exp(−k2/y) , exp(−U*k*2/y), as well as a scaled-down version of U + k, where U is the number built up by the uniform PSRN.) Call the exp(−x/y) algorithm with x/y = ky, then call the exp(−(λk * x)) algorithm using the input coin from step 2, x = 1/y, and k = 2, then call the first or third algorithm for exp(−(λk * c)) using the same input coin, c = floor(k * 2 / y), and k = 1, then call the sub-algorithm given later with the uniform PSRN and k = k. If all of these calls return 1, the uniform PSRN was accepted. Otherwise, remove all digits from the uniform PSRN's fractional part and go to step 4.
6. If the uniform PSRN, call it ret, was accepted by step 5, set ret's integer part to k, then optionally fill ret with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), and return ret.

The sub-algorithm below simulates a probability equal to (U+k)/basez, where U is the number built by the uniform PSRN, base is the base (radix) of digits stored by that PSRN, k is an integer 0 or greater, and z is the number of significant digits in k (for this purpose, z is 0 if k is 0).

For base 2:

1. Set N to 0.
2. Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), go to the next step. Otherwise, add 1 to N and repeat this step.
3. If N is less than z, return rem(k / 2z − 1 − N, 2). (Alternatively, shift k to the right, by z − 1 − N bits, then return k AND 1, where "AND" is a bitwise AND-operation.)
4. Subtract z from N. Then, if the item at position N in the uniform PSRN's fractional part (positions start at 0) is not set to a digit (for example, 0 or 1 for base 2), set the item at that position to a digit chosen uniformly at random (for example, either 0 or 1 for base 2), increasing the capacity of the uniform PSRN's fractional part as necessary.
5. Return the item at position N.

For bases other than 2, such as 10 for decimal, this can be implemented as follows (based on URandLess):

1. Set i to 0.
2. If i is less than z:
1. Set da to rem(k / 2z − 1 − i, base), and set db to a digit chosen uniformly at random (that is, an integer in the interval [0, base)).
2. Return 1 if da is less than db, or 0 if da is greater than db.
3. If i is z or greater:
1. If the digit at position (iz) in the uniform PSRN's fractional part is not set, set the item at that position to a digit chosen uniformly at random (positions start at 0 where 0 is the most significant digit after the point, 1 is the second most significant, etc.).
2. Set da to the item at that position, and set db to a digit chosen uniformly at random (that is, an integer in the interval [0, base)).
3. Return 1 if da is less than db, or 0 if da is greater than db.
4. Add 1 to i and go to step 3.

### Hyperbolic Secant Distribution

The following algorithm adapts the rejection algorithm from p. 472 in (Devroye 1986)[^25] for arbitrary-precision sampling.

1. Generate ret, an exponential random variate with a rate of 1 via the ExpRand or ExpRand2 algorithm described in my article on PSRNs. This number will be a uniform PSRN.
2. Set ip to 1 plus ret's integer part.
3. (The rest of the algorithm accepts ret with probability 1/(1+ret).) With probability ip/(1+ip), generate a number that is 1 with probability 1/ip and 0 otherwise. If that number is 1, ret was accepted, in which case optionally fill it with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then set ret's sign to positive or negative with equal probability, then return ret.
4. Call SampleGeometricBag on ret's fractional part (ignore ret's integer part and sign). If the call returns 1, go to step 1. Otherwise, go to step 3.

### Reciprocal of Power of Uniform

The following algorithm generates a PSRN of the form 1/U1/x, where U is a uniform random variate greater than 0 and less than 1 and x is an integer greater than 0.

1. Set intval to 1 and set size to 1.
2. With probability (4x−2x)/4x, go to step 3. Otherwise, add size to intval, then multiply size by 2, then repeat this step.
3. Generate an integer in the interval [intval, intval + size) uniformly at random, call it i.
4. Create ret, a uniform PSRN with a positive sign and an integer part of 0.
5. Create an input coin that calls SampleGeometricBag on the PSRN ret. Call the algorithm for dk / (c + λ)k in "Bernoulli Factory Algorithms", using the input coin, where d = intval, c = i, and k = x + 1 (here, λ is the probability built up in ret via SampleGeometricBag, and lies in the interval [0, 1]). If the call returns 0, go to step 3.
6. The PSRN ret was accepted, so set ret's integer part to i, then optionally fill ret with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then return ret.

This algorithm uses the skeleton described earlier in "Building an Arbitrary-Precision Sampler". Here, the probabilities A, B, and C are as follows:

• A = 0, since the random variate can't lie in the interval [0, 1).
• B = (4x−2x)/4x.
• C = (x/(i + λ)x+1) / M. Ideally, M is either x if intval is 1, or x/intvalx+1 otherwise. Thus, the ideal form for C is intvalx+1/(i+λ)x+1.

### Distribution of U/(1−U)

The following algorithm generates a PSRN distributed as U/(1−U), where U is a uniform random variate greater than 0 and less than 1.

1. Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), set intval to 0, then set size to 1, then go to step 4.
2. Set intval to 1 and set size to 1.
3. With probability size/(size + intval + 1), go to step 4. Otherwise, add size to intval, then multiply size by 2, then repeat this step.
4. Generate an integer in the interval [intval, intval + size) uniformly at random, call it i.
5. Create ret, a uniform PSRN with a positive sign and an integer part of 0.
6. Create an input coin that calls SampleGeometricBag on the PSRN ret. Call the algorithm for dk / (c + λ)k in "Bernoulli Factory Algorithms", using the input coin, where d = intval + 1, c = i + 1, and k = 2 (here, λ is the probability built up in ret via SampleGeometricBag, and lies in the interval [0, 1]). If the call returns 0, go to step 4.
7. The PSRN ret was accepted, so set ret's integer part to i, then optionally fill ret with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then return ret.

This algorithm uses the skeleton described earlier in "Building an Arbitrary-Precision Sampler". Here, the probabilities A, B, and C are as follows:

• A = 1/2.
• B = size/(size + intval + 1).
• C = (1/(i+λ+1)2) / M. Ideally, M is 1/(intval+1)2. Thus, the ideal form for C is (intval+1)2/(i+λ+1)2.

### Arc-Cosine Distribution

The following is a reimplementation of an example from Devroye's book Non-Uniform Random Variate Generation (Devroye 1986, pp. 128–129)[^25]. The following arbitrary-precision sampler generates a random variate from a distribution with the following cumulative distribution function (CDF): 1 - cos(pi*x/2). The random variate will be in the interval [0, 1]. This algorithm's result is the same as applying acos(U)*2/π, where U is a uniform [0, 1] random variate, as pointed out by Devroye. The algorithm follows.

1. Call the kthsmallest algorithm with n = 2 and k = 2, but without filling it with digits at the last step. Let ret be the result.
2. Set m to 1.
3. Call the kthsmallest algorithm with n = 2 and k = 2, but without filling it with digits at the last step. Let u be the result.
4. With probability 4/(4*m*m + 2*m), call the URandLess algorithm with parameters u and ret in that order, and if that call returns 1, call the algorithm for π / 4, described in "Bernoulli Factory Algorithms", twice, and if both of these calls return 1, add 1 to m and go to step 3. (Here, an erratum in the algorithm on page 129 of the book is incorporated.)
5. If m is odd[^7], optionally fill ret with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), and return ret.
6. If m is even[^10], go to step 1.

And here is Python code that implements this algorithm. The code uses floating-point arithmetic only at the end, to convert the result to a convenient form, and it relies on methods from randomgen.py and bernoulli.py.

def example_4_2_1(rg, bern, precision=53):
while True:
ret=rg.kthsmallest_psrn(2,2)
k=1
while True:
u=rg.kthsmallest_psrn(2,2)
kden=4*k*k+2*k # erratum incorporated
if randomgen.urandless(rg,u, ret) and \
rg.zero_or_one(4, kden)==1 and \
bern.zero_or_one_pi_div_4()==1 and \
bern.zero_or_one_pi_div_4()==1:
k+=1
elif (k&1)==1:
return randomgen.urandfill(rg,ret,precision)/(1<<precision)
else: break


### Logistic Distribution

The following new algorithm generates a partially-sampled random number that follows the logistic distribution.

1. Set k to 0.
2. (Choose a 1-unit-wide piece of the logistic density.) Run the algorithm for (1+exp(k))/(1+exp(k+1)) described in "Bernoulli Factory Algorithms"). If the call returns 0, add 1 to k and repeat this step. Otherwise, go to step 3.
3. (The rest of the algorithm samples from the chosen piece.) Generate a uniform(0, 1) random variate, call it f.
4. (Steps 4 through 7 succeed with probability exp(−(f+k))/(1+exp(−(f+k)))2.) Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), go to step 3.
5. Run the algorithm for exp(−k/1) (described in "Bernoulli Factory Algorithms"), then sample from the number f (for example, call SampleGeometricBag on f if f is implemented as a uniform PSRN). If any of these calls returns 0, go to step 4.
6. Generate an unbiased random bit. If that bit is 1 (which happens with probability 1/2), accept f. If f is accepted this way, set f's integer part to k, then optionally fill f with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then set f's sign to positive or negative with equal probability, then return f.
7. Run the algorithm for exp(−k/1) and sample from the number f (for example, call SampleGeometricBag on f if f is implemented as a uniform PSRN). If both calls return 1, go to step 3. Otherwise, go to step 6.

### Cauchy Distribution

Uses the skeleton for the uniform distribution inside N-dimensional shapes.

1. Generate two empty PSRNs, with a positive sign, an integer part of 0, and an empty fractional part. Call the PSRNs p1 and p2.
2. Set S to base, where base is the base of digits to be stored by the PSRNs (such as 2 for binary or 10 for decimal). Then set c1 and c2 each to 0. Then set d to 1.
3. Multiply c1 and c2 each by base and add a digit chosen uniformly at random to that coordinate.
4. If ((c1+1)2 + (c2+1)2) < S2, then do the following:
1. Transfer c1's least significant digits to p1's fractional part, and transfer c2's least significant digits to p2's fractional part. The variable d tells how many digits to transfer to each PSRN this way. (For example, if base is 10, d is 3, and c1 is 342, set p1's fractional part to [3, 4, 2].)
2. Run the UniformDivide algorithm (described in the article on PSRNs) on p1 and p2, in that order, then set the resulting PSRN's sign to positive or negative with equal probability, then return that PSRN.
5. If (c12 + c22) > S2, then go to step 2.
6. Multiply S by base, then add 1 to d, then go to step 3.

### Exponential Distribution with Unknown Rate λ, Lying in (0, 1]

Exponential random variates can be generated using an input coin of unknown probability of heads of λ (which can be in the interval (0, 1]), by generating arrival times in a Poisson process of rate 1, then thinning the process using the coin. The arrival times that result will be exponentially distributed with rate λ. I found the basic idea in the answer to a Mathematics Stack Exchange question, and thinning of Poisson processes is discussed, for example, in Devroye (1986, chapter six)[^25]. The algorithm follows:

1. Generate an exponential(1) random variate using the ExpRand or ExpRand2 algorithm (with λ = 1), call it ex.
2. (Thinning step.) Flip the input coin. If it returns 1, return ex.
3. Generate another exponential(1) random variate using the ExpRand or ExpRand2 algorithm (with λ = 1), call it ex2. Then run UniformAdd on ex and ex2 and set ex to the result. Then go to step 2.

Notice that the algorithm's average running time increases as λ decreases.

### Exponential Distribution with Rate ln(x)

The following new algorithm generates a partially-sampled random number that follows the exponential distribution with rate ln(x). This is useful for generating a base-x logarithm of a uniform(0,1) random variate. This algorithm has two supported cases:

• x is a rational number that's greater than 1. In that case, let b be floor(ln(x)/ln(2)).
• x is a uniform PSRN with a positive sign and an integer part of 1 or greater. In that case, let b be floor(ln(i)/ln(2)), where i is x's integer part.

The algorithm follows.

1. (Samples the integer part of the random variate.) Generate a number that is 1 with probability 1/x and 0 otherwise, repeatedly until a zero is generated this way. Set k to the number of ones generated this way. (This is also known as a "geometric random variate", but this terminology is avoided because it has conflicting meanings in academic works.)
• If x is a rational number and a power of 2, this step can be implemented by generating blocks of b unbiased random bits until a non-zero block of bits is generated this way, then setting k to the number of all-zero blocks of bits generated this way.
• If x is a uniform PSRN, this step is implemented as follows: Run the first subalgorithm (later in this section) repeatedly until a run returns 0. Set k to the number of runs that returned 1 this way.
2. (The rest of the algorithm samples the fractional part.) Create f, a uniform PSRN with a positive sign, an empty fractional part, and an integer part of 0.
3. Create a μ input coin that does the following: "Sample from the number f (for example, call SampleGeometricBag on f if f is implemented as a uniform PSRN), then run the algorithm for ln(1+y/z) (given in "Bernoulli Factory Algorithms") with y/z = 1/1. If both calls return 1, return 1. Otherwise, return 0." (This simulates the probability λ = f*ln(2).) Then:
• If x is a rational number, but not a power of 2, also create a ν input coin that does the following: "Sample from the number f, then run the algorithm for ln(1 + y/z) with y/z = (x−2b)/2b. If both calls return 1, return 1. Otherwise, return 0."
• If x is a uniform PSRN, also create a ρ input coin that does the following: "Return the result of the second subalgorithm (later in this section), given x and b", and a ν input coin that does the following: "Sample from the number f, then run the algorithm for ln(1 + λ), using the ρ input coin. If both calls return 1, return 1. Otherwise, return 0."
4. Run the algorithm for exp(−λ) (described in "Bernoulli Factory Algorithms") b times, using the μ input coin. If a ν input coin was created in step 3, run the same algorithm once, using the ν input coin. If all these calls return 1, accept f. If f is accepted this way, set f's integer part to k, then optionally fill f with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then return f.
5. If f was not accepted by the previous step, go to step 2.

Note: A bounded exponential random variate with rate ln(x) and bounded by m has a similar algorithm to this one. Step 1 is changed to read as follows: "Do the following m times or until a zero is generated, whichever happens first: 'Generate a number that is 1 with probability 1/x and 0 otherwise'. Then set k to the number of ones generated this way. (k is a so-called bounded-geometric(1−1/x, m) random variate, which an algorithm of Bringmann and Friedrich (2013)[^27] can generate as well. If x is a power of 2, this can be implemented by generating blocks of b unbiased random bits until a non-zero block of bits or m blocks of bits are generated this way, whichever comes first, then setting k to the number of all-zero blocks of bits generated this way.) If k is m, return m (this m is a constant, not a uniform PSRN; if the algorithm would otherwise return a uniform PSRN, it can return something else in order to distinguish this constant from a uniform PSRN)." Additionally, instead of generating a uniform(0,1) random variate in step 2, a uniform(0,μ) random variate can be generated instead, such as a uniform PSRN generated via RandUniformFromReal, to implement an exponential distribution bounded by m+μ (where μ is a real number in the interval (0, 1)).

The following generator for the rate ln(2) is a special case of the previous algorithm and is useful for generating a base-2 logarithm of a uniform(0,1) random variate. Unlike the similar algorithm of Ahrens and Dieter (1972)[^28], this one doesn't require a table of probability values.

1. (Samples the integer part of the random variate. This will be geometrically distributed with parameter 1/2.) Generate unbiased random bits until a zero is generated this way. Set k to the number of ones generated this way.
2. (The rest of the algorithm samples the fractional part.) Generate a uniform (0, 1) random variate, call it f.
3. Create an input coin that does the following: "Sample from the number f (for example, call SampleGeometricBag on f if f is implemented as a uniform PSRN), then run the algorithm for ln(1+y/z) (given in "Bernoulli Factory Algorithms") with y/z = 1/1. If both calls return 1, return 1. Otherwise, return 0." (This simulates the probability λ = f*ln(2).)
4. Run the algorithm for exp(−λ) (described in "Bernoulli Factory Algorithms"), using the input coin from the previous step. If the call returns 1, accept f. If f is accepted this way, set f's integer part to k, then optionally fill f with uniform random digits as necessary to give its fractional part the desired number of digits (similarly to FillGeometricBag), then return f.
5. If f was not accepted by the previous step, go to step 2.

The first subalgorithm samples the probability 1/x, where x≥1 is a uniform PSRN:

1. Set c to x's integer part. With probability c / (1 + c), return a number that is 1 with probability 1/c and 0 otherwise.
2. Run SampleGeometricBag on x (which ignores x's integer part and sign). If the run returns 1, return 0. Otherwise, go to step 1.

The second subalgorithm samples the probability (x−2b)/2b, where x≥1 is a uniform PSRN and b≥0 is an integer:

1. Subtract 2b from x's integer part, then create y as RandUniformFromReal(2b), then run URandLessThanReal(x, y), then add 2b back to x's integer part.
2. Return the result of URandLessThanReal from step 1.

### Symmetric Geometric Distribution

Samples from the symmetric geometric distribution from (Ghosh et al. 2012)[^29], with parameter λ, in the form of an input coin with unknown probability of heads of λ.

1. Flip the input coin until it returns 0. Set n to the number of times the coin returned 1 this way.
2. Run a Bernoulli factory algorithm for 1/(2−λ), using the input coin. If the run returns 1, return n. Otherwise, return −1 − n.

This is similar to an algorithm mentioned in an appendix in Li (2021)[^30], in which the input coin—

• has λ = 1−exp(−ε), and
• can be built as follows using another input coin with probability of heads ε: "Run a Bernoulli factory algorithm for exp(−λ) using the ε input coin, then return 1 minus the result."

### Lindley Distribution and Lindley-Like Mixtures

A random variate that follows the Lindley distribution (Lindley 1958)[^31] with parameter θ (a real number greater than 0) can be generated as follows:

Let A be 1 and let B be 2. Then:

1. With probability w = θ/(1+θ), generate A exponential random variates with a rate of r = θ via ExpRand or ExpRand2 (described in my article on PSRNs), then generate their sum by applying the UniformAdd algorithm, then return that sum.
2. Otherwise, generate B exponential random variates with a rate of r via ExpRand or ExpRand2, then generate their sum by applying the UniformAdd algorithm, then return that sum.

The table below describes other Lindley-like mixtures:

Distribution w is: r is: A is: B is:
Garima (Shanker 2016)[^32]. (1+θ)/(2+θ). θ. 1. 2.
i-Garima (Singh and Das 2020)[^33]. (2+θ)/(3+θ). θ. 1. 2.
Mixture-of-weighted-exponential-and-weighted-gamma (Iqbal and Iqbal 2020)[^34]. θ/(1+θ). θ. 2. 3.
Xgamma (Sen et al. 2016)[^35]. θ/(1+θ). θ. 1. 3.
Mirra (Sen et al. 2021)[^36]. θ2/(α+θ2) where α>0. θ. 1. 3.

Notes:

1. If θ is a uniform PSRN, then the check "With probability w = θ/(1+θ)" can be implemented by running the Bernoulli factory algorithm for (d + μ) / ((d + μ) + (c + λ)), where c is 1; λ represents an input coin that always returns 0; d is θ's integer part, and μ is an input coin that runs SampleGeometricBag on θ's fractional part. The check succeeds if the Bernoulli factory algorithm returns 1.
2. A Sushila random variate is α times a Lindley random variate, where α > 0 (Shanker et al. 2013)[^37].
3. An X-Lindley random variate (Chouia and Zeghdoudi 2021)[^38] is, with probability θ/(1+θ), an exponential random variate with a rate of θ (see step 1), and otherwise, a Lindley random variate with parameter θ.

### Gamma Distribution

The following arbitrary-precision sampler generates the sum of n independent exponential random variates (also known as the Erlang(n) or gamma(n) distribution), implemented via partially-sampled uniform random variates. Obviously, this algorithm is inefficient for large values of n.

1. Generate n exponential random variates with a rate of 1 via the ExpRand or ExpRand2 algorithm described in my article on partially-sampled random numbers (PSRNs). These numbers will be uniform PSRNs; this algorithm won't work for exponential PSRNs (e-rands), described in the same article, because the sum of two e-rands may follow a subtly wrong distribution. By contrast, generating exponential random variates via rejection from the uniform distribution will allow unsampled digits to be sampled uniformly at random without deviating from the exponential distribution.
2. Generate the sum of the random variates generated in step 1 by applying the UniformAdd algorithm given in another document.

### One-Dimensional Epanechnikov Kernel

Adapted from Devroye and Györfi (1985, p. 236)[^39].

1. Generate three uniform PSRNs a, b, and c, with a positive sign, an integer part of 0, and an empty fractional part.
2. Run URandLess on a and c in that order, then run URandLess on b and c in that order. If both runs return 1, set c to point to b.
3. Generate an unbiased random bit. If that bit is 1 (which will happen with probability 1/2), set c's sign to negative.
4. Return c.

### Uniform Distribution Inside Rectellipse

The following example generates a point inside a quarter rectellipse centered at (0, 0) with—

• horizontal "radius" d1, which is an integer greater than 0,
• vertical "radius" d2, which is an integer greater than 0, and
• squareness parameter σ, which is a rational number in [0, 1].

Use the algorithm in "Uniform Distribution Inside N-Dimensional Shapes", except:

• Let InShape return—

• YES if (σ*x1*y1)2/(d1*d2)2 − (x2/d1)2 − (y2/d2)2 + 1 > 0,
• NO if (σ*x2*y2)2/(d1*d2)2 − (x1/d1)2 − (y1/d2)2 + 1 < 0, and
• MAYBE otherwise,

where x1 = C1/S, x2 = (C1+1)/S, y1 = C2/S, and y2 = (C2+1)/S (these four values define the bounds, along the X and Y dimensions, of the box passed to InShape).

For a full rectellipse, step 5.3 in the algorithm is done for each of the two dimensions.

### Tulap distribution

The algorithm below samples a variate from the Tulap(m, b, q) distribution ("truncated uniform Laplace"; Awan and Slavković (2019)[^40]) and returns it as a uniform PSRN.

• The parameter b lies in the interval (0, 1), and can be a uniform PSRN.
• The parameter m is a rational number or a uniform PSRN.
• The parameter q is not used in this algorithm, and is treated as 0.

1. Sample from the number b (for example, call SampleGeometricBag on b if b is implemented as a uniform PSRN) until the result is 0. Set g1 to the number of times the result is 1 this way.
2. Sample from the number b until the result is 0. Set g2 to the number of times the result is 1 this way.
3. Set n to g2g1. Create ret, a uniform PSRN as follows: The sign part is positive if n ≥ 0 and negative otherwise; the integer part is abs(n); the fractional part is empty.
4. Add a uniform random variate in (−1/2, 1/2) to ret. If ret stores base-2 fractional digits, this can be done as follows. Set u to an unbiased random bit, then:
• If n < 0 and u is 1, subtract 1 from the integer part and append 1 to the fractional part.
• If n > 0 and u is 1, add 1 to the integer part and append 1 to the fractional part.
• If n is 0 or u is 0, append 0 to the fractional part.
5. If m is given and is not 0, run the UniformAdd or UniformAddRational algorithm with parameters ret and m, respectively, and set ret to the result.
6. Return ret.

## Requests and Open Questions

1. There is interest in seeing new implementations of the following:
• Algorithms that implement InShape for specific closed curves, specific closed surfaces, and specific signed distance functions. Recall that InShape determines whether a box lies inside, outside, or partly inside or outside a given curve or surface.
• Descriptions of new arbitrary-precision algorithms that use the skeleton given in the section "Building an Arbitrary-Precision Sampler".
2. The appendix contains implementation notes for InShape, which determines whether a box is outside or partially or fully inside a shape. However, practical implementations of InShape will generally only be able to evaluate a shape pointwise. What are necessary and/or sufficient conditions that allow an implementation to correctly classify a box just by evaluating the shape pointwise?
3. Take a polynomial f(λ) of even degree n of the form choose(n,n/2)*λn/2*(1−λ)n/2*k, where k is greater than 1 (thus all f's Bernstein coefficients are 0 except for the middle one, which equals k). Suppose f(1/2) lies in the interval (0, 1). If the degree elevation, described in the appendix, is done enough times (at least r times), then f's Bernstein coefficients will all lie in [0, 1]. The question is: how many degree elevations are enough? A MathOverflow answer showed that r is at least m = (n/f(1/2)2)/(1−f(1/2)2), but is it true that floor(m)+1 elevations are enough?
4. A finite-state generator is a finite-state machine that generates a real number's base-2 expansion such as 0.110101100..., driven by flips of a coin. A pushdown generator is a finite-state generator with a stack of memory. Both generators produce real numbers with a given probability distribution. For example, a generator with a loop that outputs 0 or 1 at an equal chance produces a uniform distribution. The following questions ask what kinds of distributions are possible with these generators.

1. Of the probability distributions that a finite-state generator can generate, what is the exact class of:
• Discrete distributions (those that cover a finite or countably infinite set of values)?
• Absolutely continuous distributions (those with a probability density function such as the uniform or triangular distribution)?
• Singular distributions (covering an uncountable but zero-volume set)?
• Absolutely continuous distributions with continuous density functions?
2. Same question as 1, but for pushdown generators.
3. Of the probability distributions that a pushdown generator can generate, what is the exact class of distributions with piecewise density functions whose pieces have infinitely many "slope" functions? (The answer is known for finite-state generators.)