{\centering\bfseries Supplemental documentation for hpstat \textit{turnbull} command\par}
The hpstat \textit{turnbull} command implements Turnbull's nonparametric survival curve estimation for interval-censored observations [1]. This documentation discusses technical details of the implementation.
Let $\hat{F}(t)$ be a maximum likelihood estimator for the cumulative distribution function for failure times. Turnbull [1] demonstrated that $\hat{F}(t)$ increases only on the set of what are now called ‘Turnbull intervals’, or ‘innermost intervals’, $T_j$ for $j =1, 2, …, k$.
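As a small illustration, assume (this convention is not stated above) that each observation is a half-open interval $(l_i, r_i]$, and take three hypothetical observations whose failure times fall in $(0, 2]$, $(1, 3]$ and $(4, 5]$. Sorting all endpoints gives $0, 1, 2, 3, 4, 5$, and an innermost interval arises wherever a left endpoint is immediately followed by a right endpoint:
\[
T_1 = (1, 2], \qquad T_2 = (4, 5],
\]
so that $k = 2$ and $\hat{F}(t)$ can increase only within these two intervals.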
Let $p_j$ be the probability of failure within the interval $T_j$. We seek a maximum likelihood estimator for the vector $\symbf{p}=(p_1, p_2, …, p_k)^\mathrm{T}$. We apply an efficient expectation maximisation–iterative convex minorant (EM-ICM) algorithm described by Anderson-Bergman [2] to find $\hat{\symbf{p}}$.
Now take the $i$-th observation, $1 ≤ i ≤ n$, whose failure time falls in $O_i$, and let $α_{i,j}=\mathrm{I}\left(T_j \subseteq O_i\right)$. Let $\hat{F}_0=0 ≤ \hat{F}_1 ≤ \hat{F}_2 ≤ … ≤ \hat{F}_k =1$ be the values of $\hat{F}(t)$ outside the Turnbull intervals, such that $\hat{p}_j =\hat{F}_j -\hat{F}_{j-1}$. We seek the standard errors of these $\hat{\symbf{F}}=(\hat{F}_1, \hat{F}_2, …, \hat{F}_{k-1})^\mathrm{T}$.
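To make the notation concrete, the following sketch assumes the usual log-likelihood for interval-censored data without truncation (an assumption here, since it is not written out above):
\[
\mathcal{L}(\symbf{p}) = \sum_{i=1}^{n} \log \sum_{j=1}^{k} α_{i,j}\, p_j .
\]
For a hypothetical data set with $n = 3$ and $k = 2$, in which $O_1$ and $O_2$ contain only $T_1$ and $O_3$ contains only $T_2$, the rows of $(α_{i,j})$ are $(1, 0)$, $(1, 0)$ and $(0, 1)$, giving
\[
\mathcal{L}(\symbf{p}) = 2 \log p_1 + \log p_2 ,
\]
which, subject to $p_1 + p_2 = 1$, is maximised at $\hat{\symbf{p}} = (2/3,\, 1/3)^\mathrm{T}$, so that $\hat{F}_1 = 2/3$.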
The covariance matrix of $\hat{\symbf{F}}$ is given by the inverse of $-\nablasub{\hat{\symbf{F}}}\mathcal{L}$. The standard error of each element of $\hat{\symbf{F}}$ is the square root of the corresponding diagonal element of the covariance matrix, as required.
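As a sketch of what this matrix looks like, assume the log-likelihood takes the usual form without truncation, $\mathcal{L} = \sum_{i=1}^{n} \log \mu_i$ with $\mu_i = \sum_{j=1}^{k} α_{i,j}\,\hat{p}_j$ (the symbol $\mu_i$ is introduced here for illustration only). Substituting $\hat{p}_j = \hat{F}_j - \hat{F}_{j-1}$ and differentiating twice gives, for $1 ≤ h, l ≤ k-1$,
\[
-\frac{\partial^2 \mathcal{L}}{\partial \hat{F}_h \, \partial \hat{F}_l}
= \sum_{i=1}^{n} \frac{\left(α_{i,h} - α_{i,h+1}\right)\left(α_{i,l} - α_{i,l+1}\right)}{\mu_i^2} .
\]
For a hypothetical data set with $k = 2$, rows of $(α_{i,j})$ equal to $(1, 0)$, $(1, 0)$ and $(0, 1)$, and $\hat{\symbf{p}} = (2/3,\, 1/3)^\mathrm{T}$, the matrix has a single element:
\[
2 \cdot \frac{(1 - 0)^2}{(2/3)^2} + \frac{(0 - 1)^2}{(1/3)^2} = \frac{9}{2} + 9 = \frac{27}{2},
\qquad
\mathrm{se}\bigl(\hat{F}_1\bigr) = \sqrt{2/27} \approx 0.272 ,
\]
which agrees with the binomial standard error $\sqrt{\hat{F}_1 (1 - \hat{F}_1)/n}$ that would be expected when, as in this example, no observation spans more than one Turnbull interval.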
Alternatively, when \textit{--se-method oim-drop-zeros} is passed, the rows and columns of $\nablasub{\hat{\symbf{F}}}\mathcal{L}$ corresponding to intervals where $\hat{p}_j =0$ are dropped before the matrix is inverted. This improves numerical stability, but its theoretical justification is not well explored [3].
As a further alternative, when \textit{--se-method likelihood-ratio} is passed, confidence intervals for $\hat{\symbf{F}}$ are computed by inverting a likelihood ratio test at each point, as described by Goodall, Dunn \& Babiker~[3].
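In its usual formulation, which is sketched here as an assumption and may differ in detail from the implementation, a likelihood ratio confidence interval at level $1 - a$ for $F_j$ consists of those values $θ$ that the test does not reject:
\[
\left\{ θ : 2\left[ \mathcal{L}\bigl(\hat{\symbf{F}}\bigr) - \max_{\symbf{F} \,:\, F_j = θ} \mathcal{L}(\symbf{F}) \right] ≤ \chi^2_{1,\,1-a} \right\},
\]
where $\chi^2_{1,\,1-a}$ is the $(1-a)$ quantile of the chi-squared distribution with one degree of freedom (approximately $3.84$ for a 95\% interval). The inner maximisation re-estimates the remaining elements of $\symbf{F}$ with $F_j$ held fixed at $θ$.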
\item Turnbull BW. The empirical distribution function with arbitrarily grouped, censored and truncated data. \textit{Journal of the Royal Statistical Society, Series B (Methodological)}. 1976;38(3):290–5. \href{https://doi.org/10.1111/j.2517-6161.1976.tb01597.x}{doi: 10.1111\slash j.2517-6161.1976.tb01597.x}
\item Anderson-Bergman C. An efficient implementation of the EMICM algorithm for the interval censored NPMLE. \textit{Journal of Computational and Graphical Statistics}. 2017;26(2):463–7. \href{https://doi.org/10.1080/10618600.2016.1208616}{doi: 10.1080\slash 10618600.2016.1208616}
\item Goodall RL, Dunn DT, Babiker AG. Interval-censored survival time data: confidence intervals for the non-parametric survivor function. \textit{Statistics in Medicine}. 2004;23(7):1131–45. \href{https://doi.org/10.1002/sim.1682}{doi: 10.1002\slash sim.1682}