How is the concordance index calculated in a Cox model if the actual event times are not predicted?

I am new to the field of survival analysis. I was reading about the interpretation of the C-index and realized it only cares about the ordering of the predictions. I have always used the scikit-survival package and never thought deeply about how the C-index is calculated when the actual survival times are not predicted by the Cox proportional hazards model. I would appreciate it if someone could explain this to me simply.

Cross Validated Asked on November 14, 2021

2 Answers

You are correct that time is not the default output of a Cox model. However, for any given unit with its covariate pattern, the model gives a relative hazard. Under the model, units with higher relative hazards are expected to have shorter times to event. The censored c-index compares the ordering of the estimated relative hazards with both the actual event status and the actual time to event (or censoring time) to produce its estimate.

Answered by Todd D on November 14, 2021

Below is my attempt to answer this question.

The concordance index is a measure of how well your model discriminates.
For survival analysis, say you have a covariate $X$ and a survival time $T$.
Assume that higher values of $X$ imply shorter values of $T$ (thus $X$ has a deleterious effect on $T$).
Discrimination means that, given two patients, you are able to say with high reliability which one will have the shorter survival time.

For a perfectly discriminative model, if you pick two subjects at random, $(X_1,T_1)$ and $(X_2,T_2)$, then the one with the larger value of $X$ will have, with probability $1$, the shorter survival time:

$$ c=\mathbb P( T_1 < T_2 \mid X_1 \geq X_2) = 1 $$

In your dataset, if you pick two patients at random, there are 4 cases:

  1. $X_1 \geq X_2$ and $T_1 < T_2$ : Concordance $(C)$
  2. $X_1 \geq X_2$ and $T_1 > T_2$ : Discordance $(D)$
  3. $X_1 = X_2$ : Equal risks $(R)$
  4. $T_1 = T_2$ : Equal times

The last case is not taken into account to estimate the concordance (at least I think so).

In case $3$, since the two patients have the same risk, the best you can do to say which one will have the shorter survival time is to toss a fair coin.

The estimated concordance index based on your data is:

$$ \hat c= \frac{C+\frac{R}{2}}{C+D+R} $$ where $C$ and $D$ are the total numbers of concordant and discordant couples, and $R$ is the total number of couples with exactly the same risk. The $\frac{R}{2}$ in the numerator comes from the coin toss.
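To make the counting concrete, here is a minimal Python sketch of this estimator for uncensored data. The function name `concordance_uncensored` and the toy data are made up for illustration, and the sketch assumes, as above, that higher $X$ means shorter $T$ and that couples with equal times are simply skipped.

```python
from itertools import combinations

def concordance_uncensored(x, t):
    """Estimate c = (C + R/2) / (C + D + R), assuming higher x means shorter t."""
    concordant = discordant = tied_risk = 0
    for (x1, t1), (x2, t2) in combinations(zip(x, t), 2):
        if t1 == t2:
            continue  # case 4: equal times, the couple is skipped
        if x1 == x2:
            tied_risk += 1  # case 3: equal risks, counted as half (coin toss)
        elif (x1 > x2) == (t1 < t2):
            concordant += 1  # the larger x goes with the shorter time
        else:
            discordant += 1
    return (concordant + tied_risk / 2) / (concordant + discordant + tied_risk)

# Hypothetical toy data: four patients, two of them with the same x.
x = [5, 3, 3, 1]
t = [1, 2, 4, 3]
c_hat = concordance_uncensored(x, t)
print(c_hat)  # C = 4, D = 1, R = 1, so (4 + 0.5) / 6 = 0.75
```

Note that only comparisons between values are used, never the values themselves.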

When there is censoring (as is often the case with survival data), the computation of $\hat c$ is modified, but the idea and interpretation of $c$ remain the same.
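As a rough illustration of one common modification (Harrell's approach, as far as I understand it), a couple is only used when the patient with the shorter observed time actually had the event; otherwise the true order of the two survival times is unknown. The sketch below is my own simplified version with made-up data, and it skips couples with tied times. I believe scikit-survival's `concordance_index_censored` follows the same idea, but its handling of ties may differ from this sketch.

```python
from itertools import combinations

def concordance_censored(risk, time, event):
    """Harrell-style c-index sketch: a higher risk should mean a shorter time."""
    concordant = discordant = tied_risk = 0
    for i, j in combinations(range(len(time)), 2):
        if time[i] == time[j]:
            continue  # simplification: couples with tied times are skipped
        # Put the patient with the shorter observed time first.
        first, second = (i, j) if time[i] < time[j] else (j, i)
        if not event[first]:
            continue  # shorter time is censored: the true order is unknown
        if risk[first] == risk[second]:
            tied_risk += 1
        elif risk[first] > risk[second]:
            concordant += 1
        else:
            discordant += 1
    return (concordant + tied_risk / 2) / (concordant + discordant + tied_risk)

# Hypothetical data: the second patient is censored at time 2.
risk = [4.0, 1.0, 2.0, 3.0]
time = [1, 2, 3, 4]
event = [True, False, True, True]
c_cens = concordance_censored(risk, time, event)
print(c_cens)  # 4 usable couples out of 6: 3 concordant, 1 discordant -> 0.75
```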


Say you have $8$ patients with data:

$$\begin{array}{c|c|c}
\text{Id} & \text{Time } (T) & X \\ \hline
1 & 1 & 1 \\
2 & 2 & 3 \\
3 & 3 & 2 \\
4 & 12 & 10 \\
5 & 17 & 15 \\
6 & 27 & 40 \\
7 & 36 & 60 \\
8 & 55 & 80
\end{array}$$

In that case, we see that larger values of $X$ imply larger values of $T$. Thus a couple is concordant if $X_1 < X_2$ and $T_1 < T_2$.

There are $\binom{8}{2}=28$ choices of couples of patients; among those, only the couple $(2,3)$ is discordant (since $X_2 > X_3$ but $T_2 < T_3$). There is no couple with equal risk, thus $R=0$.

Then the estimated concordance index is $\frac{27}{28} \approx 0.964$.

You can check this with the R package survival (sorry I'm not used to survival analysis with Python):

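    library(survival)
    data <- data.frame(time = c(1, 2, 3, 12, 17, 27, 36, 55), X = c(1, 3, 2, 10, 15, 40, 60, 80))
    mod <- coxph(Surv(time) ~ X, data = data)  # no censoring: every time is an event
    mod$concordance  # the "concordance" entry is ~0.964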

So to answer your question about predicted times, you can see that neither the values of $T$ nor those of $X$ change the estimate of $c$: it is only a matter of ordering between the predictor and the survival times. You can change the values in the previous example and, as long as the numbers of concordant and discordant couples are unchanged, the estimated concordance stays the same.
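For Python users, the same check can be done by hand with a small sketch. The helper `concordance` below is my own illustration, not a library function. Since larger $X$ goes with larger $T$ in this example, a couple counts as concordant when both orders agree. Applying strictly increasing transformations to $X$ and $T$ (a square root and a logarithm, chosen arbitrarily) should leave the result untouched.

```python
from itertools import combinations
import math

def concordance(score, t):
    """Share of couples whose score order matches the time order (no ties expected)."""
    pairs = list(combinations(zip(score, t), 2))
    agree = sum((s1 < s2) == (t1 < t2) for (s1, t1), (s2, t2) in pairs)
    return agree / len(pairs)

time = [1, 2, 3, 12, 17, 27, 36, 55]
X = [1, 3, 2, 10, 15, 40, 60, 80]

c_raw = concordance(X, time)
# Strictly increasing transformations keep every pairwise ordering intact.
c_transformed = concordance([math.sqrt(v) for v in X], [math.log(v) for v in time])
print(c_raw, c_transformed)  # both equal 27/28, about 0.964
```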

In which direction should I look for the covariate $X$?

Is a couple concordant if $X_1 > X_2$ and $T_1 < T_2$ or if $X_1 < X_2$ and $T_1 < T_2$?

For the Cox model, it depends on the estimated hazard ratio. If the ratio $e^\beta$ is $>1$, then larger values of $X$ imply larger risks and thus shorter times. So if $e^\beta > 1$, a couple is concordant if $X_1 > X_2$ and $T_1 < T_2$, and if $e^\beta < 1$, a couple is concordant if $X_1 < X_2$ and $T_1 < T_2$.

Finally, in the case of a vector of covariates, I think the procedure remains the same, but instead of using the vector $X$ we use the predicted risk score $\hat \beta X$, with $\hat \beta$ estimated from the Cox model.
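A small sketch of that last point, under the assumption that the risk score is the linear predictor $\hat \beta X$: once couples are ranked by this score, the direction question takes care of itself, because a coefficient with $e^\beta < 1$ enters with a negative sign. The coefficients and covariates below are invented for illustration and do not come from a fitted model.

```python
from itertools import combinations

def linear_predictor(beta, rows):
    """Risk score (dot product of beta and x) for each row of covariates."""
    return [sum(b * v for b, v in zip(beta, row)) for row in rows]

def concordance(risk, t):
    """Share of couples where the higher risk score has the shorter time (ties not handled)."""
    pairs = list(combinations(zip(risk, t), 2))
    agree = sum((r1 > r2) == (t1 < t2) for (r1, t1), (r2, t2) in pairs)
    return agree / len(pairs)

# Hypothetical coefficients and covariates, not estimates from a real model.
beta_hat = [0.8, -0.5]  # e^0.8 > 1 (harmful), e^-0.5 < 1 (protective)
rows = [[2.0, 0.0], [1.0, 3.0], [1.0, 1.0], [0.0, 2.0]]
t = [1, 2, 3, 4]

risk = linear_predictor(beta_hat, rows)
c_lp = concordance(risk, t)
print(risk, c_lp)  # 5 of the 6 couples are concordant
```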

Answered by periwinkle on November 14, 2021
