Prediction in Cox Proportional-Hazard Regression - R-help

Giuseppe.Palermo@bo.infn.it

Thu, Jun 9, 2005 1:37 AM

Hi,
I used the "coxph" function with four covariates.

Let's say something like this

So I obtain the 4 coefficients B1,B2,B3,B4 such that

h(t) = h0(t) exp(B1*X1+ B2*X2 + B3*X3 + B4*X4).

When I use the function on the same data

how does it make the prediction?
I mean, given the data point P1=[X1(1),X2(1),X3(1),X4(1)], what formula
does the function "predict.coxph" use to make the prediction for P1?

I really hope that someone will reply to my question.

Best regards to all
Giuseppe

Brian Ripley

Thu, Jun 9, 2005 2:13 AM

On Thu, 9 Jun 2005 Giuseppe.Palermo at bo.infn.it wrote:

How does that work?  predict.coxph is not an exported function!

    if (type == "lp" || type == "risk") {
        if (missing(newdata)) {
            pred <- object$linear.predictors
            names(pred) <- names(object$residuals)
        }
        else pred <- x %*% coef + offset
    ...

so that is the formula it uses.  As you did not supply 'newdata', it 
quotes the 'linear.predictors' component of the fit: see ?coxph.object.

Effectively it centred the explanatory variables on their means and then 
applied the linear regression formula to give the linear predictor. It is 
the centring that may be non-obvious: effectively h_0(t), the baseline 
hazard, is taken at the average of the subjects.
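
The centring can be checked numerically. The following is a minimal sketch, not the code from the original question: it assumes the 'lung' data set shipped with the survival package and an arbitrary choice of four continuous covariates, and it reads the centring values from the 'means' component of the fit rather than recomputing them.

```r
library(survival)

# Illustrative data and model (an assumption, not the poster's own):
# four continuous covariates from the 'lung' data set.
d <- na.omit(lung[, c("time", "status", "age", "ph.karno",
                      "wt.loss", "meal.cal")])
fit <- coxph(Surv(time, status) ~ age + ph.karno + wt.loss + meal.cal,
             data = d)

# Covariate matrix used by the fit (no intercept column in a Cox model).
X <- model.matrix(fit)

# Subtract the reference value stored in the fit from each column,
# then apply the linear-regression formula with the fitted coefficients.
Xc <- sweep(X, 2, fit$means)
lp_manual <- drop(Xc %*% coef(fit))

# Compare with the component that predict() returns when no 'newdata'
# is supplied.
all.equal(unname(lp_manual), unname(fit$linear.predictors))
```

If the two vectors agree, the value returned for each subject is B1*(X1-m1) + ... + B4*(X4-m4), where m1, ..., m4 are the entries of fit$means.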

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Giuseppe.Palermo@bo.infn.it

Thu, Jun 9, 2005 3:13 AM

Quoting Prof Brian Ripley <ripley at stats.ox.ac.uk>:

Dear Prof. Ripley,
Thanks for replying to my email.
I have just one more question:

since h(t) = h0(t) exp(B1*X1 + B2*X2 + B3*X3 + B4*X4)
represents the hazard at time t,

in a linear prediction,
what does     Value = B1*(X1-mean(X1)) + B2*(X2-mean(X2)) + ....
represent?

Brian Ripley

Thu, Jun 9, 2005 3:19 AM

On Thu, 9 Jun 2005 Giuseppe.Palermo at bo.infn.it wrote:

The linear predictor, as you asked for.
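
One way to read the linear predictor on the hazard scale is sketched below, again assuming the 'lung' data from the survival package and an arbitrary two-covariate model chosen only for illustration. Exponentiating the linear predictor gives a subject's hazard relative to the reference (centred) subject, and the difference between two subjects' linear predictors does not depend on the centring at all.

```r
library(survival)

# Illustrative model (an assumption): two continuous covariates.
d <- na.omit(lung[, c("time", "status", "age", "ph.karno")])
fit <- coxph(Surv(time, status) ~ age + ph.karno, data = d)

lp   <- predict(fit, type = "lp")    # centred linear predictor
risk <- predict(fit, type = "risk")  # hazard relative to the reference

# type = "risk" is the exponentiated linear predictor.
all.equal(unname(risk), unname(exp(lp)))

# For two subjects the centring terms cancel, so the difference in
# linear predictors equals the uncentred contrast of their covariates.
X <- model.matrix(fit)
diff_lp  <- lp[1] - lp[2]
diff_raw <- sum(coef(fit) * (X[1, ] - X[2, ]))
all.equal(unname(diff_lp), unname(diff_raw))
```

So exp(diff_lp) is the estimated hazard ratio between subject 1 and subject 2, whichever reference point h_0(t) is attached to.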

Thomas Lumley

Thu, Jun 9, 2005 6:57 AM

On Thu, 9 Jun 2005 Giuseppe.Palermo at bo.infn.it wrote:

coxph() parametrizes the model so that

     h(t) = h_0(t) exp(B1*(X1-mean(X1)) + B2*(X2-mean(X2)) + ...)

as Brian pointed out.  This doesn't affect the coefficients B1, B2,..., it 
just redefines h_0 to be the hazard at mean covariates rather than at zero 
covariates.
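
The step behind this can be written out in one line. In the sketch below the notation is introduced only for illustration: \bar{X}_j stands for mean(Xj), h_0 for the baseline hazard at zero covariates, and \tilde{h}_0 for the baseline hazard at mean covariates.

```latex
\begin{aligned}
h(t \mid X)
  &= h_0(t)\,\exp\Big(\sum_j B_j X_j\Big) \\
  &= \underbrace{h_0(t)\,\exp\Big(\sum_j B_j \bar{X}_j\Big)}_{\tilde{h}_0(t)}
     \,\exp\Big(\sum_j B_j \,(X_j - \bar{X}_j)\Big).
\end{aligned}
```

The factor exp(sum_j B_j \bar{X}_j) does not depend on the subject or on t, so it is absorbed into the baseline and the coefficients B_j are the same in both forms.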

The reason is that this makes h_0(t) more likely to be a useful thing to 
estimate. For example, if one covariate is age then extrapolating the 
baseline hazard to age zero is numerically unreliable and not very 
interesting.

 	-thomas