Skip to content

coxph linear.predictors

7 messages · Bond, Stephen, David Winsemius

#
On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:

            
Hazards are not probabilities (since probabilities are constrained to  
the range [0,1] and hazards are unbounded upward.)
That's the linear predictor (the beta*X) and that particular number  
only applies to the first case.
I don't know what that might be and you are not telling us what you  
think it is.
The Nelson-Aalen estimator of the cumulative hazard as a function of  
intervals prior to t is sum( n-event(t)/ n.risk(t))
No. The "baseline hazard", as you are calling this, would be an  
estimate for persons with all covariates = 0, so in this case is for  
women of age=0. (Not a particularly interpretable result in many  
situations. The baseline hazard following treated ovarian cancer for  
neonates is not medically sensible.)

What is the purpose of this request? Is someone telling you you need  
to provide estimates for instantaneous hazards?
#
I think this the calculation of the linear predictor, multiplying
I expected that coef(fit)*model.matrix(fit)[1,1]= fit$linear.predictors[1]

And since it does not, I am lost.

Ultimately I need unconditional death probability from day 1 to day t in the end. Survfit takes too much time as it does a lot of things I do not need. I hope I should be able to get the death prob for every subject from a single call to survfit and then rescale with exp(linear.predictor) for each subject.

Just a comment: as there is not deltat argument, I think that coxph assumes deltat=1 so the hazard is the conditional probability over the next time interval ( t(i-1), t(i) ). With dense data sampled daily this is a day interval.

Your comments are appreciated.

Stephen Bond 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Wednesday, October 27, 2010 1:15 PM
To: Bond, Stephen
Cc: r-help at r-project.org
Subject: Re: [R] coxph linear.predictors
On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:

            
Hazards are not probabilities (since probabilities are constrained to  
the range [0,1] and hazards are unbounded upward.)
That's the linear predictor (the beta*X) and that particular number  
only applies to the first case.
I don't know what that might be and you are not telling us what you  
think it is.
The Nelson-Aalen estimator of the cumulative hazard as a function of  
intervals prior to t is sum( n-event(t)/ n.risk(t))
No. The "baseline hazard", as you are calling this, would be an  
estimate for persons with all covariates = 0, so in this case is for  
women of age=0. (Not a particularly interpretable result in many  
situations. The baseline hazard following treated ovarian cancer for  
neonates is not medically sensible.)

What is the purpose of this request? Is someone telling you you need  
to provide estimates for instantaneous hazards?
#
On Oct 27, 2010, at 1:45 PM, Bond, Stephen wrote:

            
Now it sounds like you do want cap-lambda, the cumulative hazard.  
Let's call it Lambda

Since S(t) = exp(-Lambda(t)), you can invert to get Lambda(t) = - 
log(S(t))

?survfit

Possibly:
plot(-log( survfit(fit)$surv ))
I have run into slowness with that approach myself. The rms package  
provides survest and Predict methods, but if you wanted to stay in  
survival, then why not create a survfit()$surv object based on a  
vector of means and then calculate an exp(newdata) by which you would  
multiply the S(mean|t)? There is also a predict.coxph function with  
its type="risk" parameter that you might want to check.
That's not the way I understand it. The Lambda() and S() functions  
only change at the particular event times on a continuous time so they  
are not assumed to have integer values. Pretty sure that Therneau  
avoids presenting extraction methods for instantaneous hazards from  
Cox models. In his book "Modeling Survival Data" he mostly sticks to  
discussing cumulative hazard functions rather than estimating an  
interval or instantaneous hazard. And fortunately he visits rhelp so  
it's quite possible that my errors in this area will get expert review  
and correction.
David Winsemius, MD
West Hartford, CT
#
What you suggest is exactly what I intended. The big problem is that 

X*beta  != linear.predictor  # it works for glm 

That's a big loose end. The survival documentation on p41 implies that the above should be an equality, but it is not. And I could not produce the surv vector using a variety of formulas, so I guess this is the spirit of SAS disguising itself as an R package.

Stephen Bond 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Wednesday, October 27, 2010 2:17 PM
To: Bond, Stephen
Cc: r-help at r-project.org
Subject: Re: [R] coxph linear.predictors
On Oct 27, 2010, at 1:45 PM, Bond, Stephen wrote:

            
Now it sounds like you do want cap-lambda, the cumulative hazard.  
Let's call it Lambda

Since S(t) = exp(-Lambda(t)), you can invert to get Lambda(t) = - 
log(S(t))

?survfit

Possibly:
plot(-log( survfit(fit)$surv ))
I have run into slowness with that approach myself. The rms package  
provides survest and Predict methods, but if you wanted to stay in  
survival, then why not create a survfit()$surv object based on a  
vector of means and then calculate an exp(newdata) by which you would  
multiply the S(mean|t)? There is also a predict.coxph function with  
its type="risk" parameter that you might want to check.
That's not the way I understand it. The Lambda() and S() functions  
only change at the particular event times on a continuous time so they  
are not assumed to have integer values. Pretty sure that Therneau  
avoids presenting extraction methods for instantaneous hazards from  
Cox models. In his book "Modeling Survival Data" he mostly sticks to  
discussing cumulative hazard functions rather than estimating an  
interval or instantaneous hazard. And fortunately he visits rhelp so  
it's quite possible that my errors in this area will get expert review  
and correction.
David Winsemius, MD
West Hartford, CT
#
I think I found the first piece of it. Typical SAS:
age 
-4.4408920985e-16

Well misrepresented in several places... 

Stephen Bond 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Wednesday, October 27, 2010 1:15 PM
To: Bond, Stephen
Cc: r-help at r-project.org
Subject: Re: [R] coxph linear.predictors
On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:

            
Hazards are not probabilities (since probabilities are constrained to  
the range [0,1] and hazards are unbounded upward.)
That's the linear predictor (the beta*X) and that particular number  
only applies to the first case.
I don't know what that might be and you are not telling us what you  
think it is.
The Nelson-Aalen estimator of the cumulative hazard as a function of  
intervals prior to t is sum( n-event(t)/ n.risk(t))
No. The "baseline hazard", as you are calling this, would be an  
estimate for persons with all covariates = 0, so in this case is for  
women of age=0. (Not a particularly interpretable result in many  
situations. The baseline hazard following treated ovarian cancer for  
neonates is not medically sensible.)

What is the purpose of this request? Is someone telling you you need  
to provide estimates for instantaneous hazards?
#
To close the issue:
[1] 0.848450223861993 0.696973203043377 0.530604282995790 0.391136296063732 0.228370373409774 0.132991279576665 0.071742127376216
 [8] 0.030988671457923 0.012340713882587 0.004553665545169 0.000865566322530 0.000135746409544
[1] 0.848450223861993 0.696973203043377 0.530604282995791 0.391136296063732 0.228370373409774 0.132991279576665 0.071742127376216
 [8] 0.030988671457923 0.012340713882587 0.004553665545169 0.000865566322530 0.000135746409544

A single call to survfit will suffice for thousands of subjects, avoiding the iterative loop calculations inside survfit.

Stephen Bond 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Wednesday, October 27, 2010 1:15 PM
To: Bond, Stephen
Cc: r-help at r-project.org
Subject: Re: [R] coxph linear.predictors
On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:

            
Hazards are not probabilities (since probabilities are constrained to  
the range [0,1] and hazards are unbounded upward.)
That's the linear predictor (the beta*X) and that particular number  
only applies to the first case.
I don't know what that might be and you are not telling us what you  
think it is.
The Nelson-Aalen estimator of the cumulative hazard as a function of  
intervals prior to t is sum( n-event(t)/ n.risk(t))
No. The "baseline hazard", as you are calling this, would be an  
estimate for persons with all covariates = 0, so in this case is for  
women of age=0. (Not a particularly interpretable result in many  
situations. The baseline hazard following treated ovarian cancer for  
neonates is not medically sensible.)

What is the purpose of this request? Is someone telling you you need  
to provide estimates for instantaneous hazards?