I?m trying to set up proportional hazard model that is stratified with respect to covariate 1 and has an interaction between covariate 1 and another variable, covariate 2. Both variables are categorical. In the following, I try to illustrate the two problems that I?ve encountered, using the lung dataset. The first problem is the warning: To me, it seems that there are too many dummies generated. The second problem is the error: -- View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096.html Sent from the R help mailing list archive at Nabble.com.
Problems with coxph and survfit in a stratified model with interactions
6 messages · Chris Andrews, rm, Andrews, Chris
Dear all,
Reading a previous post to this forum, I think I managed to get the part of
the code that generates the dataframe (that I input as newdata to survift)
fixed. The problem was that I had not specified the full set of possible
levels for the 2 factors in the dataframe.
Unfortunately, I still get the error ? number of variables != number of
variable names? when I try to fit newdata to a very simple model that is
stratified with respect to cov1 and has an interaction between cov1 and
cov2. It seems that it is the interaction term cov1:cov2 that causes the
problem.
Any help would be much appreciated. Complete code follows.
Roland
require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
cov1=factor("one", levels = levels(lung$cov1)),
cov2=factor("male", levels = levels(lung$cov2))
)
#the following 2 lines work
sCox <- coxph(Surv(time, status) ~ cov1 + cov1:cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
#as do the following
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
#whereas the following 2 lines doesn't, any suggestions of how to fix them?
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646159.html
Sent from the R help mailing list archive at Nabble.com.
rm wrote
I?m trying to set up proportional hazard model that is stratified with respect to covariate 1 and has an interaction between covariate 1 and another variable, covariate 2. Both variables are categorical. In the following, I try to illustrate the two problems that I?ve encountered, using the lung dataset.
The first problem is the warning:
To me, it seems that there are too many dummies generated. The second problem is the error:
Adding the main effect of cov2 allows the model to run
library(survival)
lung$cov1 <- factor(lung$ph.ecog, 0:3, c("zero","one","two","three"))
lung$cov2 <- factor(lung$sex, 1:2, c("male","female"))
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1:cov2,
data=lung)
df <- data.frame(
cov1=factor(0:3, 0:3, c("zero","one","two","three"))[1],
cov2=factor(1:2, 1:2, c("male","female"))[1]
)
df
sfCox <- survfit(sCox, newdata=df)
plot(sfCox[1], conf=FALSE)
lines(sfCox[2], col=2)
lines(sfCox[3], col=3)
lines(sfCox[4], col=4)
--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646241.html
Sent from the R help mailing list archive at Nabble.com.
Dear Terry,
Thanks for your valuable comments! Could you please restate the example that
you refer to? If I include cov1 in the data frame, as you suggest, I get the
following error message. Complete code follows.
? Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels"
Regarding Chris?s suggestion, I would prefer not to have the main effect of
cov2 in the model. I would prefer to stick with ?~ strata(cov1):cov2? to
keep the interpretation of the regression coefficient straightforward.
Best regards,
Roland
require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
cov1=factor("one", levels = levels(lung$cov1)),
cov2=factor("female", levels = levels(lung$cov2))
)
sCox <- coxph(Surv(time, status) ~ strata(cov1):cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646245.html
Sent from the R help mailing list archive at Nabble.com.
Hi Roland (and Terry),
Is this the model you want to fit?
--a separate 'baseline' hazard for each stratum defined by cov1
--a coefficient for cov2 that is different for each stratum defined by cov1
Then you can have a separate call to coxph for each stratum.
sCox0 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="zero")
sCox1 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="one")
sCox2 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="two")
I compared betas of the two approaches and found them to be equivalent.
***HOWEVER, the plots of stratum=="two" for men are not equal. And I don't yet understand that.
Chris
library(survival)
lung$cov1 <- factor(lung$ph.ecog, 0:3, c("zero","one","two","three"))
lung$cov2 <- factor(lung$sex, 1:2, c("male","female"))
# first suggestion
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1 :cov2, data=lung) # warning, X matrix singular
# Fit strata separately
sCox0 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="zero")
sCox1 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="one")
sCox2 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="two")
sCox3 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="three") # warning, only 1 observation
# compare summaries
summary(sCox)
#cov2female -0.6151
#cov2male:cov1one 0.0182
#cov2male:cov1two -0.2513
summary(sCox0)
#cov2female -0.615 (as above)
summary(sCox1)
#cov2female -0.633 (-0.6151 - 0.0182)
summary(sCox2)
#cov2female -0.364 (-0.6151 + 0.2513)
summary(sCox3)
#cov2female 0 1 0 NA NA
# plot
df <- data.frame(
cov1 =factor(rep(0:3 , each=2), 0:3, c("zero","one","two","three")),
cov2 =factor(rep(1:2 ,times=4), 1:2, c("male","female"))
)
df
# fit 4 curves for men, method 1
sfCox <- survfit(sCox , newdata=df[1,])
# fit 4 curves for men, method 2
sfCox0 <- survfit(sCox0, newdata=df[1,])
sfCox1 <- survfit(sCox1, newdata=df[3,])
sfCox2 <- survfit(sCox2, newdata=df[5,])
sfCox3 <- survfit(sCox3, newdata=df[7,])
# plot, men, method 1
plot (sfCox[1], col=1, mark.time=FALSE, conf=FALSE)
lines(sfCox[2], col=2, mark.time=FALSE)
lines(sfCox[3], col=3, mark.time=FALSE)
lines(sfCox[4], col=4, mark.time=FALSE)
# plot, men, method 2
lines(sfCox0, lty=2, lwd=3, mark.time=FALSE, col=1)
lines(sfCox1, lty=2, lwd=3, mark.time=FALSE, col=2)
lines(sfCox2, lty=2, lwd=3, mark.time=FALSE, col=3) # miss?!
lines(sfCox3, lty=2, lwd=3, mark.time=FALSE, col=4)
# repeat for women
# fit 4 curves for women, method 1
sfCoxw <- survfit(sCox , newdata=df[2,])
# fit 4 curves for women, method 2
sfCoxw0 <- survfit(sCox0, newdata=df[2,])
sfCoxw1 <- survfit(sCox1, newdata=df[4,])
sfCoxw2 <- survfit(sCox2, newdata=df[6,])
sfCoxw3 <- survfit(sCox3, newdata=df[8,])
# plot, women, method 1
plot (sfCoxw[1], col=1, mark.time=FALSE, conf=FALSE)
lines(sfCoxw[2], col=2, mark.time=FALSE)
lines(sfCoxw[3], col=3, mark.time=FALSE)
lines(sfCoxw[4], col=4, mark.time=FALSE)
# plot, women, method 2
lines(sfCoxw0, lty=2, lwd=3, mark.time=FALSE, col=1)
lines(sfCoxw1, lty=2, lwd=3, mark.time=FALSE, col=2)
lines(sfCoxw2, lty=2, lwd=3, mark.time=FALSE, col=3)
lines(sfCoxw3, lty=2, lwd=3, mark.time=FALSE, col=4)
-----Original Message-----
From: rm [mailto:rm at wippies.se]
Sent: Monday, October 15, 2012 11:10 AM
To: r-help at r-project.org
Subject: Re: [R] Problems with coxph and survfit in a stratified model with interactions
Dear Terry,
Thanks for your valuable comments! Could you please restate the example that you refer to? If I include cov1 in the data frame, as you suggest, I get the following error message. Complete code follows.
? Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels"
Regarding Chris?s suggestion, I would prefer not to have the main effect of
cov2 in the model. I would prefer to stick with ?~ strata(cov1):cov2? to keep the interpretation of the regression coefficient straightforward.
Best regards,
Roland
require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
cov1=factor("one", levels = levels(lung$cov1)),
cov2=factor("female", levels = levels(lung$cov2))
)
sCox <- coxph(Surv(time, status) ~ strata(cov1):cov2, data=lung) sfCox <- survfit(sCox,newdata=df)
--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646245.html
Sent from the R help mailing list archive at Nabble.com.
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
Hi Chris, Yes, that is exactly the model I want to fit, i.e. a separate 'baseline' hazard for each stratum (defined by cov1) and a coefficient for cov2 that is different for each stratum. Is suspect that there is a problem with the following line of your code. sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1 :cov2, data=lung) # warning, X matrix singular The reason why I suspect is that in an earlier post <http://r.789695.n4.nabble.com/Re-Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-td4646171.html> Terry wrote that ? ..by using ~ strata(cov1) + cov1:cov2 you fooled the program into thinking that there was no strata by covariate interaction, ? Best regards, Roland -- View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646431.html Sent from the R help mailing list archive at Nabble.com.