survival::predict.coxph - R-help

Terry Therneau · 2009-02-26T14:09:03Z

You are mostly correct. Because of the censoring issue, there is no good estimate of the mean survival time. The survival curve either does not go to zero, or gets very noisy near the right hand tail (large standard error); a smooth parametric estimate is what is really needed to deal with this. For this reason the mean survival, though computed (but see the survfit.print.mean option, help(print.survfit)) is not highly regarded. It is not an option in predict.coxph.

Terry T.
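To make the point about the mean concrete, here is a minimal sketch. It uses the `lung` data set that ships with the survival package purely as a stand-in, a hypothetical subject aged 60, and an arbitrary cutoff `tau` of 1000 days; none of these come from the thread. The restricted mean is computed by hand as the area under the predicted curve up to `tau`. The exponential model at the end is only one possible reading of "smooth parametric estimate" and is an assumption.

```r
library(survival)

# Illustrative data only: 'lung' ships with the survival package
fit <- coxph(Surv(time, status) ~ age, data = lung)

# Predicted survival curve for a hypothetical subject aged 60
curve <- survfit(fit, newdata = data.frame(age = 60))
s_all <- as.vector(curve$surv)

# The predicted curve never reaches zero, so the area under it
# (the mean) depends on where the curve is cut off.
min(s_all)

# Restricted mean: area under the step function from 0 to tau
tau  <- 1000                      # arbitrary cutoff, chosen for illustration
keep <- curve$time < tau
t    <- c(0, curve$time[keep], tau)
s    <- c(1, s_all[keep])
rmean_manual <- sum(diff(t) * s)
rmean_manual

# Recent versions of survival can print a restricted mean directly
print(curve, print.rmean = TRUE)

# A parametric alternative (exponential, as an assumption): for this
# distribution the predicted "response" is the mean survival time.
pfit <- survreg(Surv(time, status) ~ age, data = lung, dist = "exponential")
pmean <- predict(pfit, newdata = data.frame(age = 60), type = "response")
pmean
```

Changing `tau` changes the restricted mean, which is the practical reason it is treated with caution; the parametric mean instead depends entirely on the assumed distribution being adequate.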

Bernhard Reinhardt

Fri, Feb 27, 2009 3:14 AM

Hello Terry,

It's really great to receive some feedback from a "pro". I'm not sure if
I've got the point right:
Are you saying that the Cox model isn't good at forecasting an expected
survival time because of the problems with estimating the
survival function in the right tail, and that one should instead use a
parametric model such as an exponential model? Or what do you mean by
"smooth parametric estimate"?
Anyway, I just ordered your book from the library. Hopefully I'll gain
some more insight from reading it.

Maybe I should point out why I even tried to do such forecasts.

Following the article "Quantifying climate-related risks and 
uncertainties using Cox regression models" by Maia and Meinke I try to 
deduce winter-precipitation from lagged Sea-Surface-Temperatures (SSTs).
So precipitation is my survival time and the SST observations at
different lags are my covariates.
The sample size is only 55 and I've got 11 covariates (Lag=0 months to
Lag=10 months) to choose from.
My first goal is to identify the optimal time-lag(s) between 
SST-Anomaly-Observation and Precipitation-Observation.
Expectation was that the lag should be some months.

I thought a Cox model would easily provide such a selection. At first I
used the covariates individually. The coefficients for lags between 0 and 5
months were all quite large, and they then decreased from 6 to 10 months. So I
think 5 months could be the lag of the process, and the high persistence of
the SST accounts for the large coefficients at 0-4 months.

As the next step I used all 11 covariates at once. I hoped to get
similar results. Instead the sign of the coefficients "randomly" jumps
from plus to minus, and the magnitudes vary just as erratically.

I also tried using sets of three covariates, e.g. with lags 4, 5 and 6. But
even then the sign of the coefficients varies.
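The comparison described above (one lag at a time versus all lags together) can be sketched on simulated data. Everything here is hypothetical: the AR(1) series standing in for SST anomalies, its persistence of 0.9, the choice of lag 5 as the only true driver, and the effect size of 0.5 are assumptions made for illustration, not values from my data. The sketch only shows how to set the two fits side by side together with the correlation between neighbouring lags; it does not establish what causes the sign changes in the real data.

```r
library(survival)
set.seed(1)

n <- 55   # sample size quoted above

# Hypothetical stand-in for SST anomalies: a strongly persistent AR(1) series
sst <- as.numeric(arima.sim(list(ar = 0.9), n = n + 10))

# Columns lag0 .. lag10: lag k for observation i is sst[i + 10 - k]
lags <- sapply(0:10, function(k) sst[(1:n) + 10 - k])
colnames(lags) <- paste0("lag", 0:10)
dat <- as.data.frame(lags)

# Simulated "precipitation" driven by lag 5 only (assumption);
# every value is observed, so there is no censoring.
dat$precip <- rexp(n, rate = exp(0.5 * dat$lag5))
dat$event  <- 1

# Correlation between neighbouring lags (lag 4, 5, 6)
round(cor(lags)[5:7, 5:7], 2)

# One covariate at a time
uni <- vapply(colnames(lags), function(v) {
  f <- as.formula(paste("Surv(precip, event) ~", v))
  unname(coef(coxph(f, data = dat)))
}, numeric(1))

# All eleven lags at once
f_all <- as.formula(paste("Surv(precip, event) ~",
                          paste(colnames(lags), collapse = " + ")))
joint <- coef(coxph(f_all, data = dat))

# Side-by-side comparison of the two sets of coefficients
round(cbind(univariate = uni, joint = joint), 2)
```

Running the same comparison on the real covariates, and looking at the standard errors from `summary()` of the joint fit, would show whether the jumping signs are larger than the uncertainty of the estimates.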

So my thought was that maybe I had overfitted the model. But in fact I could
not find any literature on whether that is even possible. As far as my limited
knowledge goes, an overfitted model should reproduce the training period
very well but other periods very poorly. So I first tried to reproduce the
training period, but so far without success, whether I use all 11
covariates or just 1.

Regards

Bernhard R.

Terry Therneau wrote: