turnbull: Update documentation to reflect EM-ICM algorithm

RunasSudo 2023-10-29 00:40:41 +11:00
parent 2880fe866d
commit f4d436e608
Signed by: RunasSudo
GPG Key ID: 7234E476BF21C61A
2 changed files with 14 additions and 25 deletions


@@ -22,52 +22,41 @@
The hpstat \textit{turnbull} command implements Turnbull's nonparametric survival curve estimation for interval-censored observations [1]. This documentation discusses technical details of the implementation.
-Let $\hat{F}(t)$ be a maximum likelihood estimator for the cumulative distribution function for failure times. Turnbull [1] demonstrated that $\hat{F}(t)$ decreases only on the set of what are now called ‘Turnbull intervals’, or ‘innermost intervals’, $(q_j, p_j]$ for $j = 1, 2, …, m$.
-Let $s_j$ be the probability of failure within the interval $(q_j, p_j]$. We seek a maximum likelihood estimator for the vector $\symbf{s} = (s_1, s_2, …, s_m)^\mathrm{T}$.
-Take the $i$-th observation, $1 ≤ i ≤ n$, whose failure time falls in $(L_i, R_i]$. Let $α_{i,j} = \mathrm{I}\left((q_j, p_j] \subseteq (L_i, R_i]\right)$.
+Let $\hat{F}(t)$ be a maximum likelihood estimator for the cumulative distribution function for failure times. Turnbull [1] demonstrated that $\hat{F}(t)$ increases only on the set of what are now called ‘Turnbull intervals’, or ‘innermost intervals’, $T_j$ for $j = 1, 2, …, k$.
+Let $p_j$ be the probability of failure within the interval $T_j$. We seek a maximum likelihood estimator for the vector $\symbf{p} = (p_1, p_2, …, p_k)^\mathrm{T}$. We apply an efficient expectation maximisation–iterative convex minorant (EM-ICM) algorithm described by Anderson-Bergman [2] to find $\hat{\symbf{p}}$.
+Now take the $i$-th observation, $1 ≤ i ≤ n$, whose failure time falls in $O_i$, and let $α_{i,j} = \mathrm{I}\left(T_j \subseteq O_i\right)$. Let $\hat{F}_0 = 0 ≤ \hat{F}_1 ≤ \hat{F}_2 ≤ … ≤ \hat{F}_k = 1$ be the values of $\hat{F}(t)$ outside the Turnbull intervals, such that $\hat{p}_j = \hat{F}_j - \hat{F}_{j-1}$. We seek the standard errors of these $\hat{\symbf{F}} = (\hat{F}_1, \hat{F}_2, …, \hat{F}_{k-1})^\mathrm{T}$.
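For illustration, the Turnbull intervals can be found by sorting all observation endpoints and taking each left endpoint that is immediately followed by a right endpoint. A minimal Rust sketch, assuming observations are given as half-open intervals $(l, r]$ (an editor's illustration, not the hpstat implementation):

```rust
/// Editor's sketch (hypothetical, not hpstat's code): compute the Turnbull
/// (innermost) intervals from censoring intervals (l, r].
fn turnbull_intervals(observations: &[(f64, f64)]) -> Vec<(f64, f64)> {
    // Tag each endpoint: true = left endpoint, false = right endpoint
    let mut endpoints: Vec<(f64, bool)> = observations
        .iter()
        .flat_map(|&(l, r)| [(l, true), (r, false)])
        .collect();
    // Sort by value; at ties, right endpoints sort before left endpoints,
    // consistent with half-open (l, r] observations
    endpoints.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap().then(a.1.cmp(&b.1)));
    // An innermost interval is a left endpoint immediately followed by a
    // right endpoint in the sorted sequence
    let mut intervals: Vec<(f64, f64)> = endpoints
        .windows(2)
        .filter(|w| w[0].1 && !w[1].1)
        .map(|w| (w[0].0, w[1].0))
        .collect();
    intervals.dedup();
    intervals
}
```

For example, the observations $(0, 2]$ and $(1, 3]$ yield the single Turnbull interval $(1, 2]$, the only interval on which $\hat{F}(t)$ can increase.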
-As discussed by Turnbull [1], noting that we consider only the case of no truncation, we commence with an arbitrary initial guess for $\hat{\symbf{s}}$, and iteratively apply:
-%
-\begin{align*}
-μ_{ij}(\hat{\symbf{s}}) &= \frac{α_{i,j} \hat{s}_j}{\sum_{k=1}^m α_{i,k} \hat{s}_k} \\
-π_j(\hat{\symbf{s}}) &= \frac{\sum_{i=1}^n μ_{ij}(\hat{\symbf{s}})}{n} \\
-\hat{s}_j &\leftarrow π_j(\hat{\symbf{s}}), \qquad \text{for all $j = 1, 2, …, m$}
-\end{align*}
-%
-This yields the maximum likelihood estimator $\hat{\symbf{s}}$.
-Now let $\hat{F}_0 = 0 ≤ \hat{F}_1 ≤ \hat{F}_2 ≤ … ≤ \hat{F}_m = 1$ be the values of $\hat{F}(t)$ outside the Turnbull intervals, such that $\hat{s}_j = \hat{F}_j - \hat{F}_{j-1}$. We seek the standard errors of these $\hat{\symbf{F}} = (\hat{F}_1, \hat{F}_2, …, \hat{F}_{m-1})^\mathrm{T}$.
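The self-consistency step removed above maps directly to code. A minimal sketch, assuming (as in the Rust source further below) that each observation stores the index range [idx_left, idx_right] of the Turnbull intervals it contains; the helper is hypothetical, not part of hpstat:

```rust
/// Editor's sketch (hypothetical): one self-consistency EM step,
/// p_j ← (1/n) Σ_i μ_{i,j}, where μ_{i,j} = α_{i,j} p_j / Σ_l α_{i,l} p_l
fn em_step(interval_indexes: &[(usize, usize)], p: &[f64]) -> Vec<f64> {
    let n = interval_indexes.len() as f64;
    let mut next = vec![0.0; p.len()];
    for &(idx_left, idx_right) in interval_indexes {
        // Denominator Σ_l α_{i,l} p_l for this observation
        let denominator: f64 = p[idx_left..=idx_right].iter().sum();
        for j in idx_left..=idx_right {
            next[j] += p[j] / denominator; // accumulate μ_{i,j}
        }
    }
    for v in &mut next {
        *v /= n; // π_j
    }
    next
}
```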
%
Note that the log-likelihood $\mathcal{L}_i$ for the $i$-th observation is:
%
\begin{align*}
-\mathcal{L}_i &= \log\left(\sum_{j=1}^m α_{i,j} \hat{s}_j\right) \\
-&= \log\left(\sum_{j=1}^m α_{i,j} (\hat{F}_j - \hat{F}_{j-1})\right)
+\mathcal{L}_i &= \log\left(\sum_{j=1}^k α_{i,j} \hat{p}_j\right) \\
+&= \log\left(\sum_{j=1}^k α_{i,j} (\hat{F}_j - \hat{F}_{j-1})\right)
\end{align*}
%
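Under the same index-range encoding of $α$, each $\mathcal{L}_i$ is the logarithm of a partial sum of $\hat{\symbf{p}}$, so the total log-likelihood is straightforward to compute; a hypothetical sketch:

```rust
/// Editor's sketch (hypothetical): ℒ = Σ_i log Σ_j α_{i,j} p̂_j
fn log_likelihood(interval_indexes: &[(usize, usize)], p: &[f64]) -> f64 {
    interval_indexes
        .iter()
        .map(|&(idx_left, idx_right)| p[idx_left..=idx_right].iter().sum::<f64>().ln())
        .sum()
}
```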
Note that the gradient $\nablasub{\hat{\symbf{F}}} \mathcal{L}_i$ is the vector whose $h$-th element is:
%
\begin{align*}
-\frac{\partial \mathcal{L}_i}{\partial \hat{F}_h} &= \frac{α_{i,h} - α_{i,h+1}}{\sum_{j=1}^m α_{i,j} (\hat{F}_j - \hat{F}_{j-1})}
+\frac{\partial \mathcal{L}_i}{\partial \hat{F}_h} &= \frac{α_{i,h} - α_{i,h+1}}{\sum_{j=1}^k α_{i,j} (\hat{F}_j - \hat{F}_{j-1})}
\end{align*}
%
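Since $α_{i,h} - α_{i,h+1}$ is nonzero only where the indicator switches on or off, each observation contributes at most two nonzero gradient entries. A hypothetical sketch, again assuming the index-range encoding (0-based, with $k$ the number of Turnbull intervals):

```rust
/// Editor's sketch (hypothetical): ∇ℒ_i as a dense row,
/// with one entry per F̂_1, …, F̂_{k-1}
fn gradient_row(idx_left: usize, idx_right: usize, p: &[f64], k: usize) -> Vec<f64> {
    // Denominator Σ_j α_{i,j} (F̂_j - F̂_{j-1}) = Σ_j α_{i,j} p̂_j
    let denominator: f64 = p[idx_left..=idx_right].iter().sum();
    let mut row = vec![0.0; k - 1];
    if idx_left >= 1 {
        row[idx_left - 1] = -1.0 / denominator; // α_{i,h} - α_{i,h+1} = -1 at h = idx_left
    }
    if idx_right < k - 1 {
        row[idx_right] = 1.0 / denominator; // α_{i,h} - α_{i,h+1} = +1 at h = idx_right + 1
    }
    row
}
```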
And so the Hessian $\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}_i$ has $(h, l)$-th elements:
%
\begin{align*}
-\frac{\partial^2 \mathcal{L}_i}{\partial \hat{F}_h \partial \hat{F}_l} &= - \frac{( α_{i,h} - α_{i,h+1} ) ( α_{i,l} - α_{i,l+1} )}{\left( \sum_{j=1}^m α_{i,j} (\hat{F}_j - \hat{F}_{j-1}) \right)^2}
+\frac{\partial^2 \mathcal{L}_i}{\partial \hat{F}_h \partial \hat{F}_l} &= - \frac{( α_{i,h} - α_{i,h+1} ) ( α_{i,l} - α_{i,l+1} )}{\left( \sum_{j=1}^k α_{i,j} (\hat{F}_j - \hat{F}_{j-1}) \right)^2}
\end{align*}
%
The sum of all $\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}_i$ yields the Hessian of the log-likelihood, $\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}$.
-The covariance matrix of $\hat{\symbf{F}}$ is given by the inverse of $-\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}$. The standard errors for each element of $\hat{\symbf{F}}$ are the square roots of the diagonal elements of the covariance matrix, as required. Alternatively, when \textit{--se-method oim-drop-zeros} is passed, rows and columns of $\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}$ corresponding to intervals where $\hat{s}_j = 0$ are dropped before the matrix is inverted, which improves numerical stability, although its theoretical justification is not well explored [2].
+The covariance matrix of $\hat{\symbf{F}}$ is given by the inverse of $-\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}$. The standard errors for each element of $\hat{\symbf{F}}$ are the square roots of the diagonal elements of the covariance matrix, as required. Alternatively, when \textit{--se-method oim-drop-zeros} is passed, rows and columns of $\nablasub{\hat{\symbf{F}}}^2 \mathcal{L}$ corresponding to intervals where $\hat{p}_j = 0$ are dropped before the matrix is inverted, which improves numerical stability, although its theoretical justification is not well explored [3].
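The inversion step can be sketched with the same nalgebra \textit{DMatrix} type used in the source file below; the helper name is hypothetical:

```rust
use nalgebra::DMatrix;

/// Editor's sketch (hypothetical): standard errors as the square roots of
/// the diagonal of (-∇²ℒ)⁻¹; returns None if the matrix is not invertible
fn standard_errors(neg_hessian: DMatrix<f64>) -> Option<Vec<f64>> {
    let covariance = neg_hessian.try_inverse()?;
    Some(covariance.diagonal().iter().map(|v| v.sqrt()).collect())
}
```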
-%{\vspace{0.5cm}\scshape\centering References\par}
-{\pagebreak\scshape\centering References\par}
+{\vspace{0.5cm}\scshape\centering References\par}
+%{\pagebreak\scshape\centering References\par}
\begin{enumerate}
\item Turnbull BW. The empirical distribution function with arbitrarily grouped, censored and truncated data. \textit{Journal of the Royal Statistical Society, Series B (Methodological)}. 1976;38(3):290–5. \href{https://doi.org/10.1111/j.2517-6161.1976.tb01597.x}{doi: 10.1111\slash j.2517-6161.1976.tb01597.x}
+\item Anderson-Bergman C. An efficient implementation of the EMICM algorithm for the interval censored NPMLE. \textit{Journal of Computational and Graphical Statistics}. 2017;26(2):463–7. \href{https://doi.org/10.1080/10618600.2016.1208616}{doi: 10.1080\slash 10618600.2016.1208616}
\item Goodall RL, Dunn DT, Babiker AG. Interval-censored survival time data: confidence intervals for the non-parametric survivor function. \textit{Statistics in Medicine}. 2004;23(7):1131–45. \href{https://doi.org/10.1002/sim.1682}{doi: 10.1002\slash sim.1682}
\end{enumerate}


@@ -485,12 +485,12 @@ fn do_icm_step(data: &TurnbullData, _p: &Vec<f64>, s: &Vec<f64>, ll_model: f64)
	}
}
-fn compute_hessian(data: &TurnbullData, s: &Vec<f64>) -> DMatrix<f64> {
+fn compute_hessian(data: &TurnbullData, p: &Vec<f64>) -> DMatrix<f64> {
	let mut hessian: DMatrix<f64> = DMatrix::zeros(data.num_intervals() - 1, data.num_intervals() - 1);
	for (idx_left, idx_right) in data.data_time_interval_indexes.iter() {
-		// Compute 1 / (Σ_j α_{i,j} s_j)
-		let mut one_over_hessian_denominator: f64 = s[*idx_left..(*idx_right + 1)].iter().sum();
+		// Compute 1 / (Σ_j α_{i,j} p_j)²
+		let mut one_over_hessian_denominator: f64 = p[*idx_left..(*idx_right + 1)].iter().sum();
		one_over_hessian_denominator = one_over_hessian_denominator.powi(-2);
		// The numerator of the Hessian element is -(α_{i,h} - α_{i,h+1})(α_{i,l} - α_{i,l+1})