
R: [R] time dependency of Cox regression

Hi,
to fit a Cox model with time-dependent coefficients you have to restructure
your data frame. For instance, you can use the counting-process formulation
"(start,stop,status)".
Some years ago I wrote a function (reCox(), below) to do the job. It seems
to work when there are no ties; with ties it still runs, but the results do
not exactly match those from the original data. Let me know if you can fix
this problem.
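
To make the "(start,stop,status)" layout concrete, here is a tiny hand-made sketch in base R. The three subjects and the names d3, cuts and cp are made up for illustration; it is not reCox() itself, only the same idea of cutting each follow-up at every earlier observed time.

```r
# Toy data (hypothetical): three subjects, no ties
d3 <- data.frame(id = 1:3, ttime = c(2, 5, 7), status = c(1, 0, 1))

# Every observed time is used as a cut point
cuts <- sort(unique(d3$ttime))

rows <- lapply(seq_len(nrow(d3)), function(i) {
  stop_i  <- cuts[cuts <= d3$ttime[i]]     # interval end points for subject i
  start_i <- c(0, head(stop_i, -1))        # each interval starts where the previous one ended
  data.frame(id     = d3$id[i],
             start  = start_i,
             stop   = stop_i,
             # only the last interval carries the subject's own status
             status = c(rep(0, length(stop_i) - 1), d3$status[i]))
})
cp <- do.call(rbind, rows)
cp
```

Subject i ends up with one record per observed time up to its own, so the stacked data has 1 + 2 + 3 = 6 records here, and the event indicator is nonzero only on a subject's last record.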

Note that in the restructured dataset a (linear) time-varying effect for a
variable X, say, can be included by adding the "X:stop" term to the model.
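
A minimal sketch of such a fit, under some assumptions of mine: the data are simulated only for illustration, the splitting is done with survSplit() from the survival package rather than with reCox(), and the object names (d0, d1, tt, fit_*) are hypothetical. The end point of each interval is copied into a separate column tt so that the response variable does not also appear on the right-hand side of the formula.

```r
library(survival)

# Simulated data, purely illustrative (continuous times, so no ties)
set.seed(1)
n <- 200
x <- rnorm(n)
ttime <- rexp(n, rate = 0.1 * exp(0.5 * x))
cens <- runif(n, 0, 15)
status <- as.numeric(ttime <= cens)
ttime <- pmin(ttime, cens)
d0 <- data.frame(ttime = ttime, status = status, x = x)

# Split every subject's follow-up at each distinct event time;
# the result has columns tstart (interval start) and ttime (interval end)
cuts <- sort(unique(d0$ttime[d0$status == 1]))
d1 <- survSplit(Surv(ttime, status) ~ ., data = d0, cut = cuts)

# Copy of the interval end point, playing the role of "stop" in "X:stop"
d1$tt <- d1$ttime

fit_orig  <- coxph(Surv(ttime, status) ~ x, data = d0)          # original data
fit_split <- coxph(Surv(tstart, ttime, status) ~ x, data = d1)  # same model, split data
fit_tv    <- coxph(Surv(tstart, ttime, status) ~ x + x:tt, data = d1)  # linear time-varying effect
```

Without ties, fit_orig and fit_split should agree, which is a useful check before trusting the x:tt coefficient in fit_tv.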

Hope this helps,
regards,
vito

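#example taken from ?cph (Design package, since superseded by rms)
#run the reCox() definition given further below before this example
library(survival) #provides coxph() and Surv()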
n <- 100
set.seed(731)
age <- 50 + 12*rnorm(n)
sex <- factor(sample(c('Male','Female'), n, replace=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
ttime <- -log(runif(n))/h
status <- ifelse(ttime <= cens,1,0)
ttime <- pmin(ttime, cens)
d<-data.frame(ttime=ttime,status=status,age=age,sex=sex)
#restruct the original dataframe
dd<-reCox(d)
#compare fitted model without ties (minor (negligible?) differences)
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)

#data with ties (major differences)
d$ttime[d$status==1]<-round(d$ttime[d$status==1],1)
#here you have to specify epss>0 to allow coxph() to work
dd<-reCox(d,k=1:2,epss=0.01)
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)


reCox<-function(data,k=1:2,epss=0){
#WORKS ONLY IF THERE ARE NO TIES :-(
#(Preliminary) function to reshape dataframe according to the "counting-process" formulation
#       author: <vito.muggeo at giustizia.it>
#data: the data-frame to be transformed
#k: indices of SurvTime and Status variables in data
#epss: if for any new record start==stop, then stop is incremented by epss
        if(ncol(data)<=3) data[,"tmp"]<-rep(99,nrow(data))
        dati<-data[order(data[,k[1]]),] #order(unique(surv.time))#???
        status<-dati[,k[2]]
        b<-dati[,-k]
        dati[,"start"]<-rep(0,nrow(dati))
        names(dati)[k[1]]<-"stop"
        n<-nrow(dati)
        a<-matrix(-99,(n*(n-1)/2+n),3)
        a[1,]<-c(1,as.numeric(as.matrix(dati)[1,c("start","stop")]))
        colnames(a)<-c('id.new','start','stop')
        a[,"id.new"]<-rep(1:n,1:n)
        for(i in 2:nrow(dati)){
                a[a[,1]==i,-1]<-rbind(a[a[,1]==(i-1),2:3],c(dati[(i-1),"stop"],dati[i,"stop"]))
        }
        a<-cbind(a,status=rep(0,nrow(a)))
        a[cumsum(1:n),"status"]<-status
        bb<-sapply(b,function(x)rep(x,1:n)) #this turns the categories (factors) into numbers....
        #bb<-lapply(b,function(x)rep(x,1:n))
        #bb<-apply(b,2,function(x)rep(x,1:n))NO!!!
        A<-data.frame(cbind(a,bb),row.names=NULL)
        #if(!missing(epss)) A[,"stop"]<-A[,"stop"]+ifelse(A[,"stop"]==A[,"start"],epss,0)
        if(epss>0) A[,"stop"]<-A[,"stop"]+ifelse(A[,"stop"]==A[,"start"],epss,0)
        A$tmp<-NULL
        return(A)
        }



----- Original Message -----
From: array chip <arrayprofile at yahoo.com>
To: <R-help at stat.math.ethz.ch>
Sent: Wednesday, November 03, 2004 12:41 AM
Subject: [R] time dependency of Cox regression