TL;DR: Get the implementation here.

In biostatistics, a common effect measure when considering dichotomous exposures and outcomes is the odds ratio. With two proportions $π_0$ and $π_1$, the odds ratio is $ψ = \frac{π_1 / (1 - π_1)}{π_0 / (1 - π_0)}$, as compared to the risk ratio, $ρ = \frac{π_1}{π_0}$. Compared to the risk ratio, the odds ratio has the advantage that it is symmetric with respect to the exposure and outcome, and so can be computed from a case-control study where the population incidence of the outcome is not measured.

In Bayesian inference, a beta-binomial model is usually applied, where $π_0$ and $π_1$ have independent beta priors. When updated under binomial likelihood, this yields independent beta posteriors for $π_0$ and $π_1$. The distribution of the risk ratio $ρ = \frac{π_1}{π_0}$, being the ratio of two independent beta distributions, has been studied. I have previously discussed this distribution and provided a SciPy implementation. The distribution of the odds ratio $ψ = \frac{π_1 / (1 - π_1)}{π_0 / (1 - π_0)}$, however, receives less attention.1

The probability density for $ψ$ has been derived by Hora & Kelley [1], who give the distribution in equation 3.4 as:

$\mathrm{pdf}(ψ) = k(a, b, c, d) \sum_{j=0}^∞ k_j(a, b, c, d) f_β(ψ; a, j+1)$

for $0 ≤ ψ ≤ 1$, where $a = α_1$, $b = α_1 + β_1$, $c = α_2$ and $d = α_2 + β_2$, $f_β(\cdot; α, β)$ is the density of $\mathrm{Beta}(α, β)$, and:

\begin{align*} k(a, b, c, d) &= \frac{\mathrm{B}(a+b, c+d-a-b)}{\mathrm{B}(a, c-a) \mathrm{B}(b, d-b)} \\ k_j(a, b, c, d) &= \frac{(c)_j (a+b)_j}{(c+d)_j} \frac{\mathrm{B}(a, j+1)}{j!} \end{align*}

where $\mathrm{B}$ is the beta function, and $(\cdot)_j$ is the rising factorial.

With the appropriate substitutions, then, we have:

\begin{align*} \mathrm{pdf}(ψ) &= \frac{\mathrm{B}(a+b, c+d-a-b)}{\mathrm{B}(a, c-a) \mathrm{B}(b, d-b)} \sum_{j=0}^∞ \frac{(c)_j (a+b)_j}{(c+d)_j} \frac{\mathrm{B}(a, j+1)}{j!} \frac{ψ^{a-1} (1-ψ)^j}{\mathrm{B}(a, j+1)} \\ &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} \sum_{j=0}^∞ \frac{(α_1 + β_1)_j (α_1 + α_2)_j}{(α_1 + α_2 + β_1 + β_2)_j} \frac{\mathrm{B}(α_1, j+1)}{j!} \frac{ψ^{α_1-1} (1-ψ)^j}{\mathrm{B}(α_1, j+1)} \\ &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} ψ^{α_1-1} \sum_{j=0}^∞ \frac{(α_1 + β_1)_j (α_1 + α_2)_j}{(α_1 + α_2 + β_1 + β_2)_j} \frac{(1-ψ)^j}{j!} \end{align*}

The infinite sum in the expression has the form of the hypergeometric function ${}_2F_1$, so we have:

$\mathrm{pdf}(ψ) = \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} ψ^{α_1-1} \cdot {}_2F_1(α_1 + β_1, α_1 + α_2; α_1 + α_2 + β_1 + β_2; 1-ψ)$

When $ψ > 1$, we can consider instead $ψ^{-1} = \frac{π_0 / (1 - π_0)}{π_1 / (1 - π_1)}$. This is also the odds ratio of two independent beta distributions, but with the numerator and denominator exchanged, so $(α_1, β_1)$ becomes $(α_2, β_2)$ and vice versa. Since $0 < ψ^{-1} < 1$, we have:

\begin{align*} \mathrm{pdf}(ψ) &= \mathrm{pdf}\left(ψ^{-1}\right) \left|\frac{\mathrm{d}ψ^{-1}}{\mathrm{d}ψ}\right| \\ &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} \left(ψ^{-1}\right)^{α_2-1} \cdot {}_2F_1 \mathopen{}\left(α_2 + β_2, α_1 + α_2; α_1 + α_2 + β_1 + β_2; 1-ψ^{-1}\right) \left(ψ^{-2}\right) \\ &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} \frac{1}{ψ^{α_2+1}} \cdot {}_2F_1 \mathopen{}\left(α_2 + β_2, α_1 + α_2; α_1 + α_2 + β_1 + β_2; 1-ψ^{-1}\right) \end{align*}

We can implement this in Python, using the mpmath library to compute the hypergeometric function and other terms, as follows:

a1, b1, a2, b2 = ...

def pdf(w):
if w <= 0:
return 0
elif w <= 1:
term1 = mpmath.beta(a1 + a2, b1 + b2) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
term2 = mpmath.power(w, a1 - 1)
term3 = mpmath.hyp2f1(a1 + b1, a1 + a2, a1 + a2 + b1 + b2, 1 - w)
else:
term1 = mpmath.beta(a1 + a2, b1 + b2) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
term2 = mpmath.power(w, -a2 - 1)
term3 = mpmath.hyp2f1(a2 + b2, a1 + a2, a1 + a2 + b1 + b2, 1 - mpmath.power(w, -1))

return float(term1 * term2 * term3)


To derive the CDF, we can integrate the PDF using the highly rigorous method of integration-by-Mathematica. This yields a ‘closed-form’ CDF expressed in terms of the Meijer G function. However, I found this to have poor numerical characteristics when the CDF approaches 1, so we can instead compute the CDF by numerical integration using SciPy:

def cdf_integrate(w):


This approach of integrating the PDF has the advantage that, when computing the CDF at multiple points, we can use SciPy's ODE-solving utilities to obtain the values of the CDF at those points all with one step. For example:

w = np.linspace(...)
scipy.integrate.solve_ivp(lambda x, _: pdf(x), (0, w.max()), [0], t_eval=w)


However, when computing the CDF only at a single point, the numeric integration can be a little slow for some points.2

An alternative method for computing the CDF is given by Hora & Kelley, who provide in equation 3.2:

$\mathrm{cdf}(ψ) = k(a, b, c, d) \sum_{j=0}^∞ k_j(a, b, c, d) I_ψ(a, j+1)$

for $0 ≤ ψ ≤ 1$, where $I_ψ(α, β) = \frac{\mathrm{B}(ψ; α, β)}{\mathrm{B}(α, β)}$ is the regularised incomplete beta function. Again with the appropriate substitutions, we then have:

\begin{align*} \mathrm{cdf}(ψ) &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} \sum_{j=0}^∞ \frac{(α_1 + β_1)_j (α_1 + α_2)_j}{(α_1 + α_2 + β_1 + β_2)_j} \frac{\mathrm{B}(α_1, j+1)}{j!} \frac{\mathrm{B}(ψ; α_1, j+1)}{\mathrm{B}(α_1, j+1)} \\ &= \frac{\mathrm{B}(α_1 + α_2, β_1 + β_2)}{\mathrm{B}(α_1, β_1) \mathrm{B}(α_2, β_2)} \sum_{j=0}^∞ \frac{(α_1 + β_1)_j (α_1 + α_2)_j}{(α_1 + α_2 + β_1 + β_2)_j} \frac{\mathrm{B}(ψ; α_1, j+1)}{j!} \end{align*}

where $\mathrm{B}(\cdot; α, β)$ is the (unregularised) incomplete beta function. When $ψ > 1$, we can likewise consider $ψ^{-1}$ to obtain the CDF.

Hora & Kelley note that the CDF can be approximated by truncating the infinite sum in the expression (say, to 100 terms), which results in bounded error.

The truncated sum is much faster to compute than the numeric integral, which is particularly useful when solving numerically for the inverse CDF, where the CDF must be called repeatedly in sequence.

Similar to my SciPy beta ratio distribution, we can package these functions into a SciPy rv_continuous, which allows us to avail ourselves of the familiar SciPy syntax for working with probability distributions. The implementation is available here.

For example, calculating equal-tailed and highest density credible intervals for an odds ratio:

N0 = 750
n0 = 7
N1 = 250
n1 = 12

prior_alpha = 1
prior_beta = 1

posterior = yli.beta_oddsratio(
prior_alpha + n1, prior_beta + N1 - n1,
prior_alpha + n0, prior_beta + N0 - n0
)
print(posterior.interval(0.95))  # -> (2.120859725360223, 13.33383402638461)
print(yli.hdi(posterior, 0.95))  # -> (1.544949205759145, 11.57835713410250)


TL;DR: Get the implementation here.

## References

[1] Hora SC, Kelley GD. Bayesian inference on the odds and risk ratios. Communications in Statistics: Theory and Methods. 1983;12(6):725–38. doi: 10.1080/03610928308828491

## Footnotes

1. If $X \sim \mathrm{Beta}(α, β)$, then $\frac{X}{1-X} \sim \mathrm{Beta}'(α, β)$, so $ψ$ is also the ratio of 2 independent beta prime distributed variables.

2. One optimisation for large $ψ$ is to integrate along the PDF for $ψ^{-1}$ instead, which involves less numerical integration than for the corresponding $ψ$.