What's the baseline model when using coxph with factor variables?

5 messages · William Dunlap, David Winsemius, Andreas Schlicker

#
Hi all,

I'm trying to fit a Cox regression model with two factor variables but
am having trouble interpreting the results. Consider
the following model, where var1 and var2 can each take the values 0 and 1:

coxph(Surv(time, cens) ~ factor(var1) * factor(var2),  data=temp)

What is the baseline model? Does it represent the whole population or
the case where both var1 and var2 are 0?

Kind regards,
andi
#
On Dec 1, 2011, at 12:00 PM, Andreas Schlicker wrote:

            
This has been discussed several times on R-help. My
suggestion would be to search your favorite R-help archive for
"baseline hazard Therneau", since Terry Therneau is the author of the
survival package. (The answer is closer to your first option, the whole
population, than to the second.)
David Winsemius, MD
West Hartford, CT
#
Terry will correct me if I'm wrong, but I don't think the
answer to this question is specific to the coxph function.
For all the [well-written] formula-based modelling functions
(essentially, those that call model.frame and model.matrix to interpret
the formula) the option "contrasts" controls how factor
variables are parameterized in the model matrix.  contr.treatment
makes the baseline the first factor level, contr.SAS makes
the baseline the last, contr.sum makes the baseline the mean,
etc.  E.g.,
cens=rep(c(0,0,1), len=20),
                   var1=factor(rep(0:1, each=10)),
                   var2=factor(rep(0:1, 10)))
With contr.treatment (baseline is the first level, 0):
Call:
coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)


        coef exp(coef) se(coef)      z    p
var11 0.1640      1.18    0.822 0.1995 0.84
var21 0.0806      1.08    0.830 0.0971 0.92

Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6
With contr.SAS (baseline is the last level, 1):
Call:
coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)


         coef exp(coef) se(coef)       z    p
var10 -0.1640     0.849    0.822 -0.1995 0.84
var20 -0.0806     0.923    0.830 -0.0971 0.92

Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6
With contr.sum (baseline is the mean over levels):
Call:
coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)


         coef exp(coef) se(coef)       z    p
var11 -0.0820     0.921    0.411 -0.1995 0.84
var21 -0.0403     0.960    0.415 -0.0971 0.92

Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6

(lm() has a contrasts argument that can override getOption("contrasts")
and set different contrasts for each variable but coxph() does not have
that argument.)
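
The effect of the "contrasts" option on the design matrix can be seen without fitting any model. The following is a minimal sketch, not part of the original session: the data frame toy and the helper mm_with are made up for illustration, and only base R is used. Note that coxph() drops the intercept column, so only the var1/var2 columns matter there.

```r
# Hypothetical toy factors mirroring var1/var2; no survival data needed.
toy <- data.frame(var1 = factor(rep(0:1, each = 2)),
                  var2 = factor(rep(0:1, 2)))

# Illustrative helper: build the model matrix under a given contrast
# function for unordered factors, then restore the previous option.
mm_with <- function(unordered, ordered = "contr.poly") {
  old <- options(contrasts = c(unordered, ordered))
  on.exit(options(old))
  model.matrix(~ var1 + var2, data = toy)
}

mm_treat <- mm_with("contr.treatment")  # baseline = first level (0)
mm_sas   <- mm_with("contr.SAS")        # baseline = last level (1)
mm_sum   <- mm_with("contr.sum")        # levels coded +1 / -1 around the mean

print(mm_treat)
print(mm_sas)
print(mm_sum)
```

The column names (var11 versus var10) show which level is being compared against the baseline, which is how the coefficient labels in the coxph output can be read.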

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Dec 1, 2011, at 1:00 PM, William Dunlap wrote:

            
It depends on how we interpret the questioner's intent. My answer
was predicated on the assumption that the phrase "baseline model"
meant the baseline survival function, S_0(t) in survival analysis
notation.
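
If the survival curve for one specific covariate combination is wanted (say var1 = 0 and var2 = 0), one way to avoid any ambiguity about the reference is to pass newdata to survfit(). The sketch below is an illustration only: it uses the lung data shipped with the survival package, and the 0/1 factors var1 and var2 are arbitrary stand-ins derived from sex and age, not the original poster's data. What survfit(fit) uses as its reference when newdata is omitted is tied to the covariate means stored in the fit, and I would not rely on that default without checking the documentation for the installed version.

```r
library(survival)

# Illustrative data only: two arbitrary binary factors built from 'lung'.
d <- data.frame(time = lung$time,
                cens = lung$status - 1,                     # 1 = event, 0 = censored
                var1 = factor(as.integer(lung$sex == 2)),   # levels "0", "1"
                var2 = factor(as.integer(lung$age > 65)))   # levels "0", "1"

fit <- coxph(Surv(time, cens) ~ var1 * var2, data = d)

# Request the curve for a subject with var1 = 0 and var2 = 0 explicitly.
ref <- data.frame(var1 = factor(0, levels = 0:1),
                  var2 = factor(0, levels = 0:1))
s00 <- survfit(fit, newdata = ref)

# Estimated survival for that reference subject at two time points (days).
print(summary(s00, times = c(180, 365)))
```

Changing the rows of ref gives the curve for any other combination of the two factors from the same fit.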
David Winsemius, MD
West Hartford, CT
#
William and David, thanks for your help.
The contrasts option was indeed what I was looking for but didn't find.

andi
On 01.12.2011 20:56, David Winsemius wrote: