Chapter 3 Wednesday: Competing risk
3.1 Key (operative) concepts
Patient are exposed simultaneously to \(k (\geq2)\) causes
Effect Free Survival (EFS) is univariate, i.e. only the First Observed Event (FOE) is considered and of interest
The interest is not in the survival model
“At \(\infty\) all individuals will not die in the ICU”
- Type of observed time
- Cansored (conventionally coded w/ \(0\))
- Failure w/ a FOE different from the last absorbing one (coded w/ \(1 -- k-1\))
- Failure w/ the FOE as the last absorbing event (coded w/ \(k\))
- \(T_k = min\Bigl(\tilde{T_k}^{D_k} | D_k \in \{\textrm{causes of failure for }k\} \Bigr)\)
Cumulative Incidence Function (CIF) do not require independence between causes
In competing risk, K-M is biased, i.e. overestimates the CIF (because it the independence assumption is violated)
- Tests
- w/out competing risk: log-rank
- w/ competing risk: modified \(\chi^2\) (Gray 1988)
- Regression strategies for competing risk
- Case Specific Hazard Ratio (CS-HR) — Cox, useful for clinical interests (present it for each competing risk taken singularly)
- Subdistribution Hazard Ratio (SHR) — Fine-Gray, useful for administrative] interests (present it for the global risk considering the contribution of each competing one)
Test the proportional hazard assumption for SHR
There are formulas for the sample size calculation when considering competing risk
3.2 Data manipulation
set.seed(171004)
data(mgus, package = 'survival')
# ?mgus
mgus_df <- as_tibble(mgus)
dd <- datadist(mgus_df)
mgus_df
## # A tibble: 241 x 12
## id age sex dxyr pcdx pctime futime death alb creat hgb
## * <dbl> <dbl> <fctr> <dbl> <fctr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 78 female 68 <NA> NA 748 1 2.8 1.2 11.5
## 2 2 73 female 66 LP 1310 6751 1 NA NA NA
## 3 3 87 male 68 <NA> NA 277 1 2.2 1.1 11.2
## 4 4 86 male 69 <NA> NA 1815 1 2.8 1.3 15.3
## 5 5 74 female 68 <NA> NA 2587 1 3.0 0.8 9.8
## 6 6 81 male 68 <NA> NA 563 1 2.9 0.9 11.5
## 7 7 72 female 68 <NA> NA 1135 1 3.0 0.8 13.5
## 8 8 79 female 69 <NA> NA 2016 1 3.1 0.8 15.5
## 9 9 85 female 70 <NA> NA 2422 1 3.2 1.0 12.4
## 10 10 58 male 65 <NA> NA 6155 1 3.5 1.0 14.8
## # ... with 231 more rows, and 1 more variables: mspike <dbl>
- Find number of patient w/ malignancy (AKA transition), death (w/out malignancy) and Free of Events.
mgus_df <- mgus_df %>%
mutate(
malignancy = !is.na(pcdx)
)
mgus_df %>%
group_by(malignancy, death) %>%
summarise(n = n())
## # A tibble: 4 x 3
## # Groups: malignancy [?]
## malignancy death n
## <lgl> <dbl> <int>
## 1 FALSE 0 14
## 2 FALSE 1 163
## 3 TRUE 0 2
## 4 TRUE 1 62
Patients w/ malignancy as a FOE are 64; patients which experiment death as FOE are 163, while the ones FoE are 14. 163.
- Find the indicator for censored, malignancy and death (
indicator
) - Find the time-to-event to use in the models (
time_t
)
mgus_df <- mgus_df %>%
mutate(
indicator = if_else(malignancy, 1, 2 * death),
time_t = pmin(futime, pctime, na.rm = TRUE)
)
mgus_df
## # A tibble: 241 x 15
## id age sex dxyr pcdx pctime futime death alb creat hgb
## <dbl> <dbl> <fctr> <dbl> <fctr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 78 female 68 <NA> NA 748 1 2.8 1.2 11.5
## 2 2 73 female 66 LP 1310 6751 1 NA NA NA
## 3 3 87 male 68 <NA> NA 277 1 2.2 1.1 11.2
## 4 4 86 male 69 <NA> NA 1815 1 2.8 1.3 15.3
## 5 5 74 female 68 <NA> NA 2587 1 3.0 0.8 9.8
## 6 6 81 male 68 <NA> NA 563 1 2.9 0.9 11.5
## 7 7 72 female 68 <NA> NA 1135 1 3.0 0.8 13.5
## 8 8 79 female 69 <NA> NA 2016 1 3.1 0.8 15.5
## 9 9 85 female 70 <NA> NA 2422 1 3.2 1.0 12.4
## 10 10 58 male 65 <NA> NA 6155 1 3.5 1.0 14.8
## # ... with 231 more rows, and 4 more variables: mspike <dbl>,
## # malignancy <lgl>, indicator <dbl>, time_t <dbl>
- Estimate the naive K-M and the cumulative incidence functions
# Using survival
cuminc(mgus_df$time_t, mgus_df$indicator) %>% # ?cmprsk::cuminc
plot( # ?cmprsk:::plot.cuminc
main = 'Cumulative Incidence Estimates curves',
col = c('blue', 'red'),
xlab = 'Days',
curvlab = c('Transition', 'Death'),
wh = c(1, 1) # legend position
)
survfit(Surv(time_t, malignancy) ~ 1, # using `rms::npsurv()` is the same
data = mgus_df
) %>%
lines( # Use `lines()` to draw over the previous plot
fun = 'event', # plot the cumulative events
conf.int = FALSE,
col = 'black',
lty = 3
)
legend(x = 1, y = 0.86,
legend = 'Naive K-M (Overestimation of Transitions)',
col = 'black',
lty = 3,
bty = 'n' # remove box arround the legend (because we have to add an entry)
)
3.3 Simulation of Competing risk
- Specify two cause-specific exponential hazard \(\lambda_1(t)\) and \(\lambda_2(t)\) of means \(0.8\) and \(1.2\). (Set sample size as you like.)
n <- 1e4
lambda_1 <- 0.8
lambda_2 <- 1.2
- Simulate survival times \(T\) based on the all causes hazard \(\lambda_.(t) = \lambda_1(t) + \lambda_2(t)\).
lambda <- lambda_1 + lambda_2
surv_time <- rexp(n,
rate = 1 / lambda
)
- Generate Bernoulli \(B(p)\) random variables, w/ \(p = \lambda_1(t) / \lambda_.(t)\), i.e. is the probability of occurrence of the event of type 1.
p_cens <- lambda_1 / lambda
transition <- rbinom(n,
size = 1,
prob = p_cens
) %>%
as.logical # Set as logical to use the variable for conditional statements
- Simulate uniform censoring times over \([0, 1]\).
censor_time <- runif(n,
min = 0,
max = 1
)
- Estimate the Cumulative Incidence of each competing event, w/ and w/out censoring; discuss the results.
# create the dataset
sim_data <- data_frame(
id = seq_len(n),
transition = transition,
surv_t = surv_time,
cens_t = censor_time,
time_t = pmin(surv_t, cens_t),
status = case_when(
time_t == cens_t ~ 0L, # All the censored patients has status 0
transition ~ 1L, # Among the other, the ones which has a transition
# have state 1
TRUE ~ 2L # All the other were dead (before the end of f-up)
)
)
# Explore a (random) sample of three cases for each staus
sim_data %>%
group_by(status) %>%
sample_n(3)
## # A tibble: 9 x 6
## # Groups: status [3]
## id transition surv_t cens_t time_t status
## <int> <lgl> <dbl> <dbl> <dbl> <int>
## 1 6759 FALSE 6.844765752 0.5175084 0.517508388 0
## 2 906 TRUE 1.378642827 0.5063402 0.506340163 0
## 3 3568 TRUE 0.623047318 0.5752797 0.575279657 0
## 4 2196 TRUE 0.506380841 0.9344626 0.506380841 1
## 5 8119 TRUE 0.157231928 0.9007847 0.157231928 1
## 6 5854 TRUE 0.742379822 0.8771130 0.742379822 1
## 7 1685 FALSE 0.539729742 0.9635529 0.539729742 2
## 8 4550 FALSE 0.001042022 0.3692372 0.001042022 2
## 9 6893 FALSE 0.410360153 0.9322769 0.410360153 2
# Using survival
cuminc(sim_data$time_t, sim_data$status) %>% # ?cmprsk::cuminc
plot( # ?cmprsk:::plot.cuminc
main = 'Cumulative Incidence Estimates curves',
col = c('blue', 'red'),
xlab = 'Time (normalized [0, 1])',
curvlab = c('Transition', 'Event'),
wh = c(0.01, 1) # legend position
)
survfit(Surv(time_t, transition) ~ 1, # using `rms::npsurv()` is the same
data = sim_data
) %>%
lines( # Use `lines()` to draw over the previous plot
fun = 'event', # plot the cumulative events
conf.int = FALSE,
col = 'black',
lty = 3
)
legend(x = 0.01, y = 0.86,
legend = 'Naive K-M (overestimation of transitions)',
col = 'black',
lty = 3,
bty = 'n' # remove box arround the legend (because we have to add an entry)
)
3.4 Estimation of the effect of sex on MGUS incidence
- Compare the results of Cox cause specific hazard model…
For clinical questions, i.e. cause specific risk to experiment the event w/out taking into account the other couse(s)
dd <- datadist(mgus_df)
cox_sex <- cph(Surv(time_t, malignancy) ~ sex,
data = mgus_df,
x = TRUE,
y = TRUE
)
summary(cox_sex) # this is good for a clean view of the effects
## Effects Response : Surv(time_t, malignancy)
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## sex - female:male 2 1 NA 0.049342 0.25103 -0.44268 0.54136
## Hazard Ratio 2 1 NA 1.050600 NA 0.64232 1.71830
cox_sex # Here there are more informations (and the p-values)
## Cox Proportional Hazards Model
##
## cph(formula = Surv(time_t, malignancy) ~ sex, data = mgus_df,
## x = TRUE, y = TRUE)
##
## Model Tests Discrimination
## Indexes
## Obs 241 LR chi2 0.04 R2 0.000
## Events 64 d.f. 1 Dxy -0.039
## Center -0.028 Pr(> chi2) 0.8441 g 0.024
## Score chi2 0.04 gr 1.025
## Pr(> chi2) 0.8441
##
## Coef S.E. Wald Z Pr(>|Z|)
## sex=male -0.0493 0.2510 -0.20 0.8442
##
Predict(cox_sex) %>% # It is necessary to have the predictions for the plot
plot
cox_sex_death <- cph(Surv(time_t, indicator == 2) ~ sex,
data = mgus_df,
x = TRUE,
y = TRUE
)
summary(cox_sex_death)
## Effects Response : Surv(time_t, indicator == 2)
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## sex - female:male 2 1 NA -0.44221 0.16183 -0.75939 -0.12502
## Hazard Ratio 2 1 NA 0.64262 NA 0.46795 0.88248
cox_sex_death
## Cox Proportional Hazards Model
##
## cph(formula = Surv(time_t, indicator == 2) ~ sex, data = mgus_df,
## x = TRUE, y = TRUE)
##
## Model Tests Discrimination
## Indexes
## Obs 241 LR chi2 7.65 R2 0.031
## Events 163 d.f. 1 Dxy 0.124
## Center 0.2514 Pr(> chi2) 0.0057 g 0.218
## Score chi2 7.59 gr 1.243
## Pr(> chi2) 0.0059
##
## Coef S.E. Wald Z Pr(>|Z|)
## sex=male 0.4422 0.1618 2.73 0.0063
##
Predict(cox_sex_death) %>%
plot
- …to those of the Fine and Gray model
For administrative questions, i.e. overall risk of experiment each event taking into account the competing risk
mgus_num <- mgus_df %>%
mutate(sex = as.numeric(sex))
crr( # We do not have a plot method for crr
ftime = mgus_num$time_t,
fstatus = mgus_num$indicator,
cov1 = mgus_num$sex
) %>%
summary
## Competing Risks Regression
##
## Call:
## crr(ftime = mgus_num$time_t, fstatus = mgus_num$indicator, cov1 = mgus_num$sex)
##
## coef exp(coef) se(coef) z p-value
## mgus_num$sex1 -0.339 0.713 0.249 -1.36 0.17
##
## exp(coef) exp(-coef) 2.5% 97.5%
## mgus_num$sex1 0.713 1.4 0.437 1.16
##
## Num. cases = 241
## Pseudo Log-likelihood = -341
## Pseudo likelihood ratio test = 1.83 on 1 df,
References
Gray, Robert J. 1988. “A Class of K-Sample Tests for Comparing the Cumulative Incidence of a Competing Risk.” The Annals of Statistics. JSTOR, 1141–54.