
A high-performance Rust implementation of the Turnbull nonparametric maximum likelihood estimator for interval-censored survival data
A common need in biostatistics is to estimate survival curves, but particular difficulty arises when observations are interval censored, i.e. the time of event is not observed exactly, but is known only to fall within a particular interval. In this setting, the Turnbull estimator [1] extends… »

A high-performance Rust implementation of interval-censored Cox regression
Cox proportional hazards models are commonly used in biostatistics for the modelling of time-to-event data, such as mortality or disease progression. A difficulty in applying Cox models arises when observations are interval censored, i.e. the time of event is not observed exactly, but is known… »

SciPy distribution for the odds ratio of independent beta variables
In biostatistics, a common effect measure when considering dichotomous exposures and outcomes is the odds ratio. With two proportions $π_0$ and $π_1$, the odds ratio is $ψ = \frac{π_1 / (1 - π_1)}{π_0 / (1 - π_0)}$, as compared to the risk ratio,… »

Directly computing HDIs from PDFs in SciPy
In Bayesian inference, it is often desired to calculate credible intervals for model parameters. The 2 common choices are the highest posterior density interval (HPD/HDI), and the equal-tailed interval. In many cases, the posterior density must be estimated by simulation, but in some cases the… »

Beta ratio distribution for SciPy
The quotient of 2 independent beta-distributed random variables has a known distribution, but its closed-form expression is a little hairy [1, 2]. One Python implementation of this distribution is available from Julian Saffer [3], but it suffers from some numerical issues… »

Bayesian biostatistics procedures matching frequentist confidence intervals
Confidence intervals are commonly misinterpreted as there being, after observing the data, a 95% probability that the true parameter lies within the confidence interval. The usual explanation why this is incorrect is that the true parameter is not random, and so is either inside or… »

Custom fonts in KaTeX
KaTeX is a web-based mathematics typesetting library, similar to the erstwhile untouchable MathJax. Unfortunately, out of the box, KaTeX only supports one font, its default Computer Modern-based font, and does not have built-in functionality for customising this. Thankfully, it is not difficult to achieve… »

On the credible probability of confidence intervals
Confidence intervals are commonly misinterpreted by consumers of statistics. Hoekstra et al. [1] presented 120 psychology researchers and 442 students with ‘a fictitious scenario of a professor who conducts an experiment and reports a 95% CI for the mean that ranges from 0.1 to… »

Quasi-likelihood gamma regression in statsmodels for zeroes in observations
Generalised linear models with a gamma distribution and log link are frequently used to model non-negative right-skewed continuous data, such as costs [1].
For example, in statsmodels:
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
# … »

Robust Poisson regression in medical biostatistics
Log-binomial and robust (modified) Poisson regression are common approaches to estimating risk ratios in medical biostatistics [1].
I have discussed log-binomial regression in a previous post about generalised linear models. The conceptual basis for using log-binomial regression to estimate risk ratios is straightforward –… »

Generalised linear models for medical biostatistics
Recently, I've been doing some statistical analysis using log-binomial generalised linear models (GLMs). Resources on the topic seem to fall largely into 2 categories:

Assume you want to know none of the background: ‘Use a log-binomial GLM if you want a risk ratio.’^{1}

Assume
