Beta ratio distribution for SciPy

Warning! I am not a statistician. This article is not reviewed. Please confer with a responsible adult!

TL;DR: Get the implementation here.

The quotient of two independent beta-distributed random variables has a known distribution, but its closed-form expression is a little hairy [1, 2]. One Python implementation of this distribution is available from Julian Saffer [3], but it runs into numerical issues for some parameter values. For example, below is the PDF generated by Saffer's implementation for $\frac{\mathrm{Beta}(13, 239)}{\mathrm{Beta}(8, 744)}$:

[Figure: Beta ratio PDF, as computed by Saffer's implementation]

As shown, the procedure fails to converge to the correct density for ratios between approximately 1 and 2. In addition, the procedure is very slow for distributions with large α and β parameters. The above plot took about 10 seconds to generate.

Instead of the procedure employed by Saffer's implementation, the following snippet effects a more direct computation of the PDF, making use of primitives provided by the mpmath library:

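import mpmath

def pdf(self, w, a1, b1, a2, b2):
	# Density at w of Beta(a1, b1) / Beta(a2, b2); the closed form has two branches
	if w < 1:
		term1 = mpmath.beta(a1 + a2, b2) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
		term2 = mpmath.power(w, a1 - 1)
		term3 = mpmath.hyp2f1(a1 + a2, 1 - b1, a1 + a2 + b2, w)
	else:
		# For w >= 1, the hypergeometric function is evaluated at 1/w instead
		term1 = mpmath.beta(a1 + a2, b1) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
		term2 = 1 / mpmath.power(w, a2 + 1)
		term3 = mpmath.hyp2f1(a1 + a2, 1 - b2, a1 + a2 + b1, 1/w)
	
	# Convert from mpmath's arbitrary-precision type to a plain float
	return float(term1 * term2 * term3)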
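
Written out, the two branches of the function above correspond to the following piecewise density, where $B$ is the beta function and ${}_2F_1$ is the Gauss hypergeometric function. This is only a transcription of the code into notation, with $W$ denoting the ratio:

$$
f_W(w) =
\begin{cases}
\dfrac{B(a_1 + a_2, b_2)}{B(a_1, b_1)\,B(a_2, b_2)}\, w^{a_1 - 1}\, {}_2F_1(a_1 + a_2,\ 1 - b_1;\ a_1 + a_2 + b_2;\ w), & 0 < w < 1, \\[2ex]
\dfrac{B(a_1 + a_2, b_1)}{B(a_1, b_1)\,B(a_2, b_2)}\, w^{-(a_2 + 1)}\, {}_2F_1(a_1 + a_2,\ 1 - b_2;\ a_1 + a_2 + b_1;\ 1/w), & w \ge 1.
\end{cases}
$$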

This produces the following PDF:

[Figure: Beta ratio PDF, as computed by the mpmath-based implementation]

This implementation is also much faster, taking 17.6 ± 0.3 milliseconds (mean ± SD).
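
One quick way to gain some confidence in the formula is to check that the density integrates to 1. The sketch below does this with a standalone copy of the PDF function, using the Beta(3, 6) / Beta(12, 7) parameters from Saffer's demonstration figure. The name `beta_ratio_pdf` is hypothetical, introduced only for this check, and is not part of the linked implementation.

```python
import mpmath

def beta_ratio_pdf(w, a1, b1, a2, b2):
    """Density of Beta(a1, b1) / Beta(a2, b2) at w > 0.

    Hypothetical standalone copy of the pdf method above, for checking only.
    """
    norm = mpmath.beta(a1, b1) * mpmath.beta(a2, b2)
    if w < 1:
        return (mpmath.beta(a1 + a2, b2) / norm
                * mpmath.power(w, a1 - 1)
                * mpmath.hyp2f1(a1 + a2, 1 - b1, a1 + a2 + b2, w))
    return (mpmath.beta(a1 + a2, b1) / norm
            / mpmath.power(w, a2 + 1)
            * mpmath.hyp2f1(a1 + a2, 1 - b2, a1 + a2 + b1, 1 / w))

# Integrate over (0, inf), splitting at w = 1 where the formula changes branch
total = mpmath.quad(lambda w: beta_ratio_pdf(w, 3, 6, 12, 7), [0, 1, mpmath.inf])
print(total)  # should be very close to 1
```

A result close to 1 does not prove the formula is right, but a result far from 1 would point to a transcription error in one of the branches.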

We can package this, and the corresponding CDF, into a subclass of scipy.stats.rv_continuous. A full implementation is available here.
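
To give a rough idea of what such a subclass can look like, here is a minimal sketch. It is not the full implementation linked above: the class name `BetaRatioSketch` is hypothetical, and it defines only `_pdf`, leaving the CDF to SciPy's generic fallback, which integrates the PDF numerically and is therefore slow.

```python
import mpmath
import numpy as np
from scipy import stats

class BetaRatioSketch(stats.rv_continuous):
    """Hypothetical minimal wrapper around the beta ratio PDF (illustration only)."""

    def _pdf_scalar(self, w, a1, b1, a2, b2):
        # Same two-branch closed form as the pdf function shown earlier
        norm = mpmath.beta(a1, b1) * mpmath.beta(a2, b2)
        if w < 1:
            value = (mpmath.beta(a1 + a2, b2) / norm
                     * mpmath.power(w, a1 - 1)
                     * mpmath.hyp2f1(a1 + a2, 1 - b1, a1 + a2 + b2, w))
        else:
            value = (mpmath.beta(a1 + a2, b1) / norm
                     / mpmath.power(w, a2 + 1)
                     * mpmath.hyp2f1(a1 + a2, 1 - b2, a1 + a2 + b1, 1 / w))
        return float(value)

    def _pdf(self, w, a1, b1, a2, b2):
        # SciPy passes arrays, but mpmath works on scalars, so apply elementwise
        return np.vectorize(self._pdf_scalar, otypes=[float])(w, a1, b1, a2, b2)

# a=0 sets the lower bound of the support; the upper bound defaults to infinity
beta_ratio_sketch = BetaRatioSketch(name='beta_ratio_sketch', a=0, shapes='a1, b1, a2, b2')

dist = beta_ratio_sketch(3, 6, 12, 7)  # frozen distribution, as with stats.beta
print(dist.pdf(0.5))
print(dist.cdf(1.0))  # computed by SciPy's generic numerical integration of _pdf
```

Because only `_pdf` is defined, methods that depend on the CDF, such as `ppf` and `interval`, fall back to repeated numerical integration and root finding. That is presumably one reason to supply a dedicated CDF, as the full implementation does.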

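This gives us the familiar SciPy interface for probability distributions. For example, we can easily calculate the equal-tailed 95% credible interval for an epidemiologic risk ratio. With the uniform priors below, the posterior is the same $\frac{\mathrm{Beta}(13, 239)}{\mathrm{Beta}(8, 744)}$ distribution plotted earlier: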

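import yli

# Group 0 (denominator of the risk ratio): n0 events among N0 subjects
N0 = 750
n0 = 7
# Group 1 (numerator of the risk ratio): n1 events among N1 subjects
N1 = 250
n1 = 12

# Uniform Beta(1, 1) prior on the risk in each group
prior_alpha = 1
prior_beta = 1

# Ratio of the two beta posteriors: risk in group 1 over risk in group 0
posterior = yli.beta_ratio(
	prior_alpha + n1, prior_beta + N1 - n1,
	prior_alpha + n0, prior_beta + N0 - n0
)
print(posterior.interval(0.95))  # -> (2.082349732594243, 12.507478970304723)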
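
As a rough cross-check that does not depend on the closed-form density, the same interval can be approximated by simulation: draw from each beta posterior, divide, and take empirical percentiles. The sketch below uses only NumPy, and the seed and number of draws are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
num_draws = 1_000_000

# Posterior risks under Beta(1, 1) priors, using the counts from the example above
risk1 = rng.beta(1 + 12, 1 + 250 - 12, size=num_draws)  # Beta(13, 239)
risk0 = rng.beta(1 + 7, 1 + 750 - 7, size=num_draws)    # Beta(8, 744)

# Equal-tailed 95% interval of the simulated risk ratio
lower, upper = np.percentile(risk1 / risk0, [2.5, 97.5])
print(lower, upper)  # should land near the analytic interval, up to Monte Carlo error
```

The simulated bounds will only agree with the analytic ones to a couple of significant figures, but a large discrepancy would suggest a problem in either the PDF or the CDF.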

Reproducing the figure from Saffer's repository:

import numpy as np
import seaborn as sns
from scipy import stats
import yli

beta1 = stats.beta(3, 6)
beta2 = stats.beta(12, 7)
ratio = yli.beta_ratio.from_scipy(beta1, beta2)

# PDFs of both beta distributions and of their ratio, plus the CDF of the ratio
x = np.linspace(0, 2, 100)
ax = sns.lineplot(x=x, y=beta1.pdf(x))
sns.lineplot(x=x, y=beta2.pdf(x))
sns.lineplot(x=x, y=ratio.pdf(x))
sns.lineplot(x=x, y=ratio.cdf(x), linestyle='dashdot')

# Shade the equal-tailed 95% interval under each PDF
x = np.linspace(*beta1.interval(0.95), 100)
ax.fill_between(x, beta1.pdf(x), color='C0', alpha=0.3, zorder=0)
x = np.linspace(*beta2.interval(0.95), 100)
ax.fill_between(x, beta2.pdf(x), color='C1', alpha=0.3, zorder=0)
x = np.linspace(*ratio.interval(0.95), 100)
ax.fill_between(x, ratio.pdf(x), color='C2', alpha=0.3, zorder=0)

# Mark the mean of each distribution
ax.axvline(beta1.mean(), linestyle='dashed', color='C0')
ax.axvline(beta2.mean(), linestyle='dashed', color='C1')
ax.axvline(ratio.mean(), linestyle='dashed', color='C2')

[Figure: Replica of Saffer's demonstration figure]

References

[1] Pham-Gia T. Distributions of the ratios of independent beta variables and applications. Communications in Statistics: Theory and Methods. 2000;29(12):2693–715. doi: 10.1080/03610920008832632

[2] Weekend Editor. On the ratio of Beta-distributed random variables. Some Weekend Reading. 2021 Sep 13. https://www.someweekendreading.blog/beta-ratios/

[3] Saffer J. Beta quotient distribution. GitHub. https://github.com/jsaffer/beta_quotient_distribution