# Directly computing HDIs from PDFs in SciPy

**Warning!** I am not a statistician. This article is not reviewed. Please confer with a responsible adult!

In Bayesian inference, it is often desired to calculate credible intervals for model parameters. The 2 common choices are the highest posterior density interval (HPD/HDI), and the equal-tailed interval. In many cases, the posterior density must be estimated by simulation, but in some cases the posterior density has a known closed-form expression, which enables these intervals to be directly computed.

In SciPy, equal-tailed intervals are easily computed from known distributions, via the *interval* method on an *rv_continuous*, e.g. `stats.norm.interval(confidence=0.95, loc=0, scale=1)`

.

Conversely, SciPy does not have a simple method to compute the highest density interval for a known distribution. The PyMC Bayesian modelling library has functionality for computing the HDI (*arviz.hdi*), but from samples from the posterior, not directly from the PDF.

We can leverage SciPy's numerical optimisation and root-finding functions to compute the highest density interval directly from the distribution. We can achieve this by solving for the narrowest interval which covers 95% of the distribution (or some other desired level). The following code snippet achieves this:

```
from scipy import optimize
def hdi(distribution, level=0.95):
"""
Get the highest density interval for the distribution, e.g. for a Bayesian posterior, the highest posterior density interval (HPD/HDI)
"""
# For a given lower limit, we can compute the corresponding 95% interval
def interval_width(lower):
upper = distribution.ppf(distribution.cdf(lower) + level)
return upper - lower
# Find such interval which has the smallest width
# Use equal-tailed interval as initial guess
initial_guess = distribution.ppf((1-level)/2)
optimize_result = optimize.minimize(interval_width, initial_guess)
lower_limit = optimize_result.x[0]
width = optimize_result.fun
upper_limit = lower_limit + width
return (lower_limit, upper_limit)
```

For example, using this function to compute a highest posterior density interval, for a beta-binomial model with uniform prior:

```
from scipy import stats
n, N = 12, 250
prior_a, prior_b = 1, 1
distribution = stats.beta(n + prior_a, N - n + prior_b)
print(hdi(distribution, 0.95)) # -> (0.025943479765227942, 0.07930059177617696)
```

We can confirm this matches the highest posterior density interval computed by the R *binom.bayes* function:

```
> library(binom)
> binom.bayes(12, 250, conf.level=0.95, type='highest', prior.shape1=1, prior.shape2=1)
method x n shape1 shape2 mean lower upper sig
1 bayes 12 250 13 239 0.0515873 0.02594348 0.0793006 0.05
```

This helper function is available as part of a library here.