Beta ratio distribution for SciPy
Warning! I am not a statistician. This article is not reviewed. Please confer with a responsible adult!
TL;DR: Get the implementation here.
The quotient of 2 independent beta-distributed random variables has a known distribution, but its closed-form expression is a little hairy [1, 2]. One Python implementation of this distribution is available from Julian Saffer [3], but it suffers from some numerical issues in some circumstances. For example, below is the PDF generated by Saffer's implementation for $\frac{\mathrm{Beta}(13, 239)}{\mathrm{Beta}(8, 744)}$:
As shown, the procedure fails to converge to the correct density for ratios between approximately 1 and 2. In addition, the procedure is very slow for distributions with large α and β parameters. The above plot took about 10 seconds to generate.
Instead of the procedure employed by Saffer's implementation, the following snippet effects a more direct computation of the PDF, making use of primitives provided by the mpmath library:
def pdf(self, w, a1, b1, a2, b2):
if w < 1:
term1 = mpmath.beta(a1 + a2, b2) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
term2 = mpmath.power(w, a1 - 1)
term3 = mpmath.hyp2f1(a1 + a2, 1 - b1, a1 + a2 + b2, w)
else:
term1 = mpmath.beta(a1 + a2, b1) / (mpmath.beta(a1, b1) * mpmath.beta(a2, b2))
term2 = 1 / mpmath.power(w, a2 + 1)
term3 = mpmath.hyp2f1(a1 + a2, 1 - b2, a1 + a2 + b1, 1/w)
return float(term1 * term2 * term3)
This produces the following PDF:
This implementation is also much faster, coming in at a mean (SD) 17.6 ± 0.3 milliseconds.
We can package this, and the corresponding CDF, into a subclass of scipy.stats.rv_continuous. A full implementation is available here.
This allows us to avail ourselves of the familiar SciPy interface for probability distributions. For example, to easily calculate the equal-tailed credible interval for an epidemiologic risk ratio:
N0 = 750
n0 = 7
N1 = 250
n1 = 12
prior_alpha = 1
prior_beta = 1
posterior = yli.beta_ratio(
prior_alpha + n1, prior_beta + N1 - n1,
prior_alpha + n0, prior_beta + N0 - n0
)
print(posterior.interval(0.95)) # -> (2.082349732594243, 12.507478970304723)
Reproducing the figure from Saffer's repository:
beta1 = stats.beta(3, 6)
beta2 = stats.beta(12, 7)
ratio = yli.beta_ratio.from_scipy(beta1, beta2)
x = np.linspace(0, 2, 100)
ax = sns.lineplot(x=x, y=beta1.pdf(x))
sns.lineplot(x=x, y=beta2.pdf(x))
sns.lineplot(x=x, y=ratio.pdf(x))
sns.lineplot(x=x, y=ratio.cdf(x), linestyle='dashdot')
x = np.linspace(*beta1.interval(0.95), 100)
ax.fill_between(x, beta1.pdf(x), color='C0', alpha=0.3, zorder=0)
x = np.linspace(*beta2.interval(0.95), 100)
ax.fill_between(x, beta2.pdf(x), color='C1', alpha=0.3, zorder=0)
x = np.linspace(*ratio.interval(0.95), 100)
ax.fill_between(x, ratio.pdf(x), color='C2', alpha=0.3, zorder=0)
ax.axvline(beta1.mean(), linestyle='dashed', color='C0')
ax.axvline(beta2.mean(), linestyle='dashed', color='C1')
ax.axvline(ratio.mean(), linestyle='dashed', color='C2')
TL;DR: Get the implementation here.
References
[1] Pham-Gia T. Distributions of the ratios of independent beta variables and applications. Communications in Statistics: Theory and Methods. 2000;29(12):2693–715. doi: 10.1080/03610920008832632
[2] Weekend Editor. On the ratio of Beta-distributed random variables. Some Weekend Reading. 2021 Sep 13. https://www.someweekendreading.blog/beta-ratios/
[3] Saffer J. Beta quotient distribution. GitHub. https://github.com/jsaffer/beta_quotient_distribution