• ### A high-performance Rust implementation of interval-censored Cox regression

Cox proportional hazards models are commonly used in biostatistics for the modelling of time-to-event data, such as mortality or disease progression. A difficulty in applying Cox models arises when observations are interval censored, i.e. the time of event is not observed exactly, but is known… »

• ### SciPy distribution for the odds ratio of independent beta variables

In biostatistics, a common effect measure when considering dichotomous exposures and outcomes is the odds ratio. With two proportions $π_0$ and $π_1$, the odds ratio is $ψ = \frac{π_1 / (1 - π_1)}{π_0 / (1 - π_0)}$, as compared to the risk ratio,… »
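As a minimal sketch of the two effect measures, the snippet below computes the odds ratio $ψ$ as defined above, alongside the risk ratio, taken here as the standard $π_1 / π_0$. The function names and the proportions 0.2 and 0.4 are illustrative assumptions, not taken from the post.

```python
def odds(p):
    """Odds corresponding to a proportion p in (0, 1)."""
    return p / (1 - p)


def odds_ratio(pi_0, pi_1):
    """Odds ratio psi = [pi_1 / (1 - pi_1)] / [pi_0 / (1 - pi_0)]."""
    return odds(pi_1) / odds(pi_0)


def risk_ratio(pi_0, pi_1):
    """Risk ratio pi_1 / pi_0 (standard definition)."""
    return pi_1 / pi_0


# Illustrative proportions only (hypothetical, not from the post)
pi_0, pi_1 = 0.2, 0.4

psi = odds_ratio(pi_0, pi_1)
rr = risk_ratio(pi_0, pi_1)
print(f"odds ratio = {psi:.4f}, risk ratio = {rr:.4f}")
```

With these proportions, the odds ratio is further from 1 than the risk ratio.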

• ### Directly computing HDIs from PDFs in SciPy

In Bayesian inference, it is often desired to calculate credible intervals for model parameters. The 2 common choices are the highest posterior density interval (HPD/HDI), and the equal-tailed interval. In many cases, the posterior density must be estimated by simulation, but in some cases the… »
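To illustrate the difference between the two interval types, here is a rough sketch of the simulation-based route mentioned above: both intervals are estimated from posterior draws. This is not the direct-from-PDF method the post goes on to describe. The Beta(2, 8) posterior, the seed, the sample size and the function names are all assumptions chosen for illustration.

```python
import random


def equal_tailed_interval(samples, mass=0.95):
    """Interval cutting off (1 - mass) / 2 of the draws in each tail."""
    s = sorted(samples)
    n = len(s)
    tail = (1 - mass) / 2
    lower = s[int(tail * n)]
    upper = s[min(n - 1, int((1 - tail) * n))]
    return lower, upper


def hdi_from_samples(samples, mass=0.95):
    """Shortest interval spanning about `mass` of the sorted draws.

    Appropriate for unimodal posteriors only.
    """
    s = sorted(samples)
    n = len(s)
    k = round(mass * n)  # number of steps each candidate window spans
    # Slide a window of k steps along the sorted draws; keep the narrowest
    best = min(range(n - k), key=lambda i: s[i + k] - s[i])
    return s[best], s[best + k]


# Hypothetical right-skewed posterior: Beta(2, 8)
rng = random.Random(0)
samples = [rng.betavariate(2, 8) for _ in range(20000)]

eti = equal_tailed_interval(samples)
hdi = hdi_from_samples(samples)
print("equal-tailed:", eti)
print("HDI:         ", hdi)
```

For a skewed posterior such as this one, the HDI is narrower than the equal-tailed interval and is shifted towards the mode.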

• ### Beta ratio distribution for SciPy

The quotient of 2 independent beta-distributed random variables has a known distribution, but its closed-form expression is a little hairy [1, 2]. One Python implementation of this distribution is available from Julian Saffer, but it suffers from some numerical issues… »

• ### Bayesian biostatistics procedures matching frequentist confidence intervals

Confidence intervals are commonly misinterpreted as there being, after observing the data, a 95% probability that the true parameter lies within the confidence interval. The usual explanation of why this is incorrect is that the true parameter is not random, and so is either inside or… »

• ### On the credible probability of confidence intervals

Confidence intervals are commonly misinterpreted by consumers of statistics. Hoekstra et al. presented 120 psychology researchers and 442 students with ‘a fictitious scenario of a professor who conducts an experiment and reports a 95% CI for the mean that ranges from 0.1 to… »

• ### Quasi-likelihood gamma regression in statsmodels for zeroes in observations

Generalised linear models with a gamma distribution and log link are frequently used to model non-negative right-skewed continuous data, such as costs.

For example, in statsmodels:

import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm

#
»
• ### Robust Poisson regression in medical biostatistics

Log-binomial and robust (modified) Poisson regression are common approaches to estimating risk ratios in medical biostatistics.

I have discussed log-binomial regression in a previous post about generalised linear models. The conceptual basis for using log-binomial regression to estimate risk ratios is straightforward –… »

• ### Generalised linear models for medical biostatistics

Recently, I've been doing some statistical analysis using log-binomial generalised linear models (GLMs). Resources on the topic seem to fall largely into 2 categories:

• Assume you want to know none of the background: ‘Use a log-binomial GLM if you want a risk ratio.’1

• Assume

»