Survival analysis

Functions

yli.kaplanmeier(df, time, status, by=None, *, ci=True, transform_x=None, transform_y=None, nan_policy='warn')

Generate a Kaplan–Meier plot

Uses the Python matplotlib library.

Parameters:
  • df (DataFrame) – Data to generate plot for

  • time (str) – Column in df for the time to event (numeric or timedelta)

  • status (str) – Column in df for the status variable (True/False or 1/0)

  • by (str) – Column in df to stratify by (categorical); optional, defaults to None

  • ci (bool) – Whether to plot confidence intervals around the survival function

  • transform_x (callable) – Function used to transform the x axis

  • transform_y (callable) – Function used to transform the y axis

  • nan_policy (str) – How to handle NaN values (see NaN handling)

Return type:

(Figure, Axes)
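
For orientation, the curve drawn by this function is the Kaplan–Meier product-limit estimate of the survival function. The sketch below computes that estimate in plain Python for a small made-up data set. The helper name km_estimate and the data are illustrative only; this is not yli's implementation, which may differ in detail.

```python
from collections import Counter

def km_estimate(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- observed times (event or censoring), numeric
    events -- True/1 if the event occurred, False/0 if right-censored
    """
    # Number of events observed at each distinct event time
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Product-limit update: multiply by the conditional survival at t
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Made-up example data: six subjects, two of them right-censored
times = [1, 2, 2, 3, 4, 5]
events = [True, True, False, True, False, True]

curve = km_estimate(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

With the same data held in a DataFrame whose columns are named, say, time and status, the corresponding plot would be requested as yli.kaplanmeier(df, 'time', 'status'), following the signature above.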

yli.logrank(df, time, status, by, nan_policy='warn')

Perform the log-rank test for equality of survival functions

Parameters:
  • df (DataFrame) – Data to perform the test on

  • time (str) – Column in df for the time to event (numeric or timedelta)

  • status (str) – Column in df for the status variable (True/False or 1/0)

  • by (str) – Column in df defining the groups to compare (categorical)

  • nan_policy (str) – How to handle NaN values (see NaN handling)

Return type:

yli.sig_tests.ChiSquaredResult
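
As a rough illustration of what the test computes, the sketch below evaluates the standard two-group log-rank chi-squared statistic (1 degree of freedom) in plain Python. The function name logrank_two_groups and the data are hypothetical. yli's own implementation, its handling of more than two groups, and the fields of ChiSquaredResult are not shown here and may differ.

```python
import math

def logrank_two_groups(times, events, groups):
    """Return (statistic, pvalue) for the two-group log-rank test."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = labels[0]

    observed = expected = variance = 0.0
    rows = list(zip(times, events, groups))
    for t in sorted({u for u, e, _ in rows if e}):
        n = sum(1 for u, _, _ in rows if u >= t)                  # at risk, both groups
        n1 = sum(1 for u, _, g in rows if u >= t and g == first)  # at risk, first group
        d = sum(1 for u, e, _ in rows if u == t and e)            # events at t, both groups
        d1 = sum(1 for u, e, g in rows if u == t and e and g == first)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    statistic = (observed - expected) ** 2 / variance
    # Upper tail of the chi-squared distribution with 1 degree of freedom
    pvalue = math.erfc(math.sqrt(statistic / 2))
    return statistic, pvalue

# Made-up data: group A fails early, group B fails late, no censoring
times = [1, 2, 3, 4, 5, 6]
events = [True] * 6
groups = ["A", "A", "A", "B", "B", "B"]

statistic, pvalue = logrank_two_groups(times, events, groups)
print(statistic, pvalue)
```

On a DataFrame with columns named, for example, time, status and group, the equivalent call would take the form yli.logrank(df, 'time', 'status', 'group').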

yli.turnbull(df, time_left, time_right, by=None, *, step_loc=0.5, transform_x=None, transform_y=None, nan_policy='warn')

Generate a Turnbull estimator plot, which extends the Kaplan–Meier estimator to interval-censored observations

The intervals are assumed to be half-open, (left, right]. A value of right == np.inf indicates that the observation is right-censored. Unlike yli.kaplanmeier(), times must be given as numeric dtypes, not as pandas timedelta.

By default, the survival function is drawn as a step function that steps down at the midpoint of each Turnbull interval (see step_loc).

Uses the Python lifelines and matplotlib libraries.

Parameters:
  • df (DataFrame) – Data to generate plot for

  • time_left (str) – Column in df for the time to event, left interval endpoint (numeric)

  • time_right (str) – Column in df for the time to event, right interval endpoint (numeric)

  • by (str) – Column in df to stratify by (categorical); optional, defaults to None

  • step_loc (float) – Proportion along the length of each Turnbull interval at which the survival function steps down: 0 for the left bound, 1 for the right bound, 0.5 for the interval midpoint

  • transform_x (callable) – Function used to transform the x axis

  • transform_y (callable) – Function used to transform the y axis

  • nan_policy (str) – How to handle NaN values (see NaN handling)

Return type:

(Figure, Axes)
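
To make the interval convention and step_loc concrete, the sketch below encodes a few made-up observations as (left, right] pairs and computes where a step would fall within an interval. The helpers encode_observation and step_position are hypothetical. The linear interpolation is an assumption read from the step_loc description above, not taken from yli's source.

```python
import math

def encode_observation(last_event_free, first_with_event=None):
    """Return (time_left, time_right) for the half-open interval (left, right].

    last_event_free  -- last time the subject was known to be event-free
    first_with_event -- first time the event was known to have occurred,
                        or None if it was never observed (right-censored)
    """
    right = math.inf if first_with_event is None else float(first_with_event)
    return (float(last_event_free), right)

def step_position(left, right, step_loc=0.5):
    """x position of the step within an interval (assumed linear in step_loc)."""
    return left + step_loc * (right - left)

# Made-up observations
observations = [
    encode_observation(2, 4),    # event occurred somewhere in (2, 4]
    encode_observation(3, 3.5),  # event occurred somewhere in (3, 3.5]
    encode_observation(5),       # right-censored after time 5
]
print(observations)

print(step_position(2, 4))       # default: interval midpoint
print(step_position(2, 4, 0))    # left bound
print(step_position(2, 4, 1))    # right bound
```

math.inf is the same floating-point value as np.inf, so pairs built this way can populate the time_left and time_right columns, giving a call of the form yli.turnbull(df, 'time_left', 'time_right').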