Hi,
to fit a Cox model with time-dependent coefficients, you have to
restructure your data frame. For instance, you
can use the counting-process formulation "(start,stop,status)".
Some years ago I wrote a function (reCox(), below) to do the job. It seems
to work when there are no ties. When there are ties it still runs, but the
results do not match exactly those obtained from the original data.
Let me know if you can fix this problem.
Note that in the restructured dataset, a (linear) time-varying effect for a
variable X, say, can be included by adding the "X:stop" term to the model.
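As an illustration of the "X:stop" idea, here is a minimal sketch. It is not reCox(): it assumes survSplit() from the survival package to do the splitting, simulates its own data with n = 200, and the names t_end and hr_age10 are made up for this example. The interval end is copied into t_end so that the same variable does not appear on both sides of the formula.

```r
library(survival)

# Simulated data in the style of the example in this message (n = 200 is an assumption)
set.seed(731)
n <- 200
age <- 50 + 12 * rnorm(n)
sex <- factor(sample(c("Male", "Female"), n, replace = TRUE, prob = c(.6, .4)))
cens <- 15 * runif(n)
h <- .02 * exp(.04 * (age - 50) + .8 * (sex == "Female"))
ttime <- -log(runif(n)) / h
d <- data.frame(ttime = pmin(ttime, cens),
                status = as.integer(ttime <= cens),
                age = age, sex = sex)

# Split every subject's follow-up at each distinct event time,
# giving one (tstart, ttime] interval per risk set the subject belongs to
cuts <- sort(unique(d$ttime[d$status == 1]))
dd <- survSplit(Surv(ttime, status) ~ ., data = d, cut = cuts)

# Copy of the interval end (hypothetical name), used only on the right-hand side
dd$t_end <- dd$ttime

# age:t_end lets the log hazard ratio of age change linearly with time
fit <- coxph(Surv(tstart, ttime, status) ~ age + sex + age:t_end, data = dd)

# Estimated hazard ratio for a 10-year age difference at follow-up time t
hr_age10 <- function(t) {
  unname(exp(10 * (coef(fit)["age"] + coef(fit)["age:t_end"] * t)))
}
hr_age10(c(0, 2, 5))
```

The last line evaluates the estimated hazard ratio at 0, 2 and 5 time units, which is the kind of statement asked for in the original question.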
Hope this helps,
regards,
vito
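#example taken from ?cph (cph() is in the Design package, now rms, not Hmisc)
#run the definition of reCox() given further below before this example
library(survival) #provides coxph() and Surv()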
n <- 100
set.seed(731)
age <- 50 + 12*rnorm(n)
sex <- factor(sample(c('Male','Female'), n, replace=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
ttime <- -log(runif(n))/h
status <- ifelse(ttime <= cens,1,0)
ttime <- pmin(ttime, cens)
d<-data.frame(ttime=ttime,status=status,age=age,sex=sex)
#restructure the original data frame
dd<-reCox(d)
#compare fitted model without ties (minor (negligible?) differences)
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)
#data with ties (major differences)
d$ttime[d$status==1]<-round(d$ttime[d$status==1],1)
dd<-reCox(d,k=1:2,epss=0.01) #here you have to specify epss>0 to allow coxph() to work
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)
reCox<-function(data,k=1:2,epss=0){
#WORKS ONLY IF THERE ARE NO TIES :-(
#(Preliminary) function to reshape a data frame according to the
#"counting-process" formulation
# author: <vito.muggeo at giustizia.it>
#data: the data-frame to be transformed
#k: indices of SurvTime and Status variables in data
#epss: if for any new record start==stop, then stop is incremented by epss
if(ncol(data)<=3) data[,"tmp"]<-rep(99,nrow(data))
dati<-data[order(data[,k[1]]),] #order(unique(surv.time))#???
status<-dati[,k[2]]
b<-dati[,-k]
dati[,"start"]<-rep(0,nrow(dati))
names(dati)[k[1]]<-"stop"
n<-nrow(dati)
a<-matrix(-99,(n*(n-1)/2+n),3)
a[1,]<-c(1,as.numeric(as.matrix(dati)[1,c("start","stop")]))
colnames(a)<-c('id.new','start','stop')
a[,"id.new"]<-rep(1:n,1:n)
for(i in 2:nrow(dati)){
a[a[,1]==i,-1]<-rbind(a[a[,1]==(i-1),2:3],c(dati[(i-1),"stop"],dati[i,"stop"]))
}
a<-cbind(a,status=rep(0,nrow(a)))
a[cumsum(1:n),"status"]<-status
bb<-b[rep(1:n,1:n),,drop=FALSE] #repeat row i of the covariates i times; factors stay factors
#bb<-sapply(b,function(x)rep(x,1:n)) #this turned factors into numbers
#bb<-lapply(b,function(x)rep(x,1:n))
#bb<-apply(b,2,function(x)rep(x,1:n)) #NO!!!
A<-data.frame(cbind(a,bb),row.names=NULL)
#if for any new record start==stop (this happens with ties), stop is incremented by epss
if(epss>0) A[,"stop"]<-A[,"stop"]+ifelse(A[,"stop"]==A[,"start"],epss,0)
A$tmp<-NULL
return(A)
}
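Regarding the problem with ties, one possible way around it is sketched below. It is an alternative to reCox(), not a fix of it, and it assumes survSplit() from the survival package. The idea is to cut the follow-up only at the distinct event times, so that no record with start==stop is created and no epss shift (which breaks the ties and changes the risk sets) is needed. The data are simulated as in the example above, and all times are rounded up to one decimal place to create ties.

```r
library(survival)

# Same simulation as in the example in this message
set.seed(731)
n <- 100
age <- 50 + 12 * rnorm(n)
sex <- factor(sample(c("Male", "Female"), n, replace = TRUE, prob = c(.6, .4)))
cens <- 15 * runif(n)
h <- .02 * exp(.04 * (age - 50) + .8 * (sex == "Female"))
ttime <- -log(runif(n)) / h
d <- data.frame(ttime = pmin(ttime, cens),
                status = as.integer(ttime <= cens),
                age = age, sex = sex)

# Round all times up to one decimal place: this creates ties
# and keeps every time strictly positive
d$ttime <- ceiling(d$ttime * 10) / 10

# Split at the distinct event times only; tied events stay tied
cuts <- sort(unique(d$ttime[d$status == 1]))
dd <- survSplit(Surv(ttime, status) ~ ., data = d, cut = cuts)

# Fit on the original and on the split data, then compare coefficients
fit_plain <- coxph(Surv(ttime, status) ~ age + sex, data = d)
fit_split <- coxph(Surv(tstart, ttime, status) ~ age + sex, data = dd)
cbind(plain = coef(fit_plain), split = coef(fit_split))
```

Since the splitting changes neither the risk sets nor the events, the two columns should agree up to numerical precision.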
----- Original Message -----
From: array chip <arrayprofile at yahoo.com>
To: <R-help at stat.math.ethz.ch>
Sent: Wednesday, November 03, 2004 12:41 AM
Subject: [R] time dependency of Cox regression
Hi,
How can I specify a Cox proportional hazards model
with a covariate whose effect on survival I believe
changes/diminishes with time? The value of
the covariate was only recorded once at the beginning
of the study for each individual (e.g. at the
diagnosis of the disease), so I do not have the time
course data of the covariate for any given individual.
For example, I want to be able to state at the end of the
analysis that the hazard ratio of the covariate is 6
at the beginning, decreases to 3 after 2 years and
decreases to 1.5 after 5 years.
Is this a so-called time-dependent covariate? I guess
not, because it's really about the influence of the
covariate (which was measured once at the beginning)
on survival changing over time.
Thanks for any input.