Message-ID: <2f3a88$fud6n@ironport10.mayo.edu>
Date: 2015-04-21T12:32:52Z
From: Terry Therneau
Subject: Predict in glmnet for Cox family
In-Reply-To: <mailman.1.1429610401.30052.r-help@r-project.org>

On 04/21/2015 05:00 AM, r-help-request at r-project.org wrote:
> Dear All,
>
> I am in some difficulty with predicting 'expected time of survival' for each
> observation for a glmnet cox family with LASSO.
>
> I have two dataset 50000 * 450 (obs * Var) and 8000 * 450 (obs * var), I
> considered first one as train and second one as test.
>
> I got the predict output and I am bit lost here,
>
> pre <- predict(fit,type="response", newx =selectedVar[1:20,])
>
>           s0
> 1  0.9454985
> 2  0.6684135
> 3  0.5941740
> 4  0.5241938
> 5  0.5376783
>
> This is the output I am getting - I understood with type "response" gives
> the fitted relative-risk for "cox" family.
>
> I would like to know how I can convert it or change the fitted relative-risk
> to 'expected time of survival' ?
>
> Any help would be great, thanks for all your time and effort.
>
> Sincerely,

The answer is that, in general, you cannot predict survival time.  The reason is that most
studies do not follow the subjects for long enough.  For instance, say that the data set
comes from a study that enrolled subjects and then followed them for up to 5 years, by
which time 35% had died (by the usual Kaplan-Meier estimate).  Fit a model to the data and
ask "what is the predicted survival time for a low-risk subject?"  The answer will at best
be "greater than 5 years": the estimated survival curve stops at 5 years while still well
above zero, so the program cannot say whether the time is 6 or 10 or even 1000 years.  A
bigger data set does not help, because more subjects do not extend the follow-up.
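
A small simulation can illustrate the point.  This is only a sketch under stated
assumptions: the event times are simulated (exponential, with a rate chosen so that about
35% of subjects die within 5 years), and the variable names are hypothetical, not taken
from the original poster's data.  It uses survfit() from the survival package.

```r
library(survival)

set.seed(1)
n <- 5000

# Assumption: exponential event times with 1 - exp(-rate * 5) = 0.35,
# i.e. about 35% mortality by 5 years.
rate <- -log(1 - 0.35) / 5
true_time <- rexp(n, rate = rate)

# Follow-up stops at 5 years; anyone still alive is censored there.
followup <- 5
time   <- pmin(true_time, followup)
status <- as.integer(true_time <= followup)  # 1 = death observed, 0 = censored

km <- survfit(Surv(time, status) ~ 1)

# Lowest value the Kaplan-Meier curve reaches within the follow-up window.
min_surv <- min(km$surv)

# Median survival is reported as NA when the curve never drops to 0.5.
km_median <- unname(summary(km)$table["median"])

print(min_surv)
print(km_median)
```

With this setup the curve ends near 0.65 at 5 years, so even the median is not estimable,
let alone the mean.  Increasing n narrows the confidence band but does not move the end of
the curve past 5 years.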

Terry Therneau