
Survival Regression with multiple events per subject

4 messages · Terry Therneau, Fabian Hefner, Dimitris Rizopoulos +1 more

#
Data sets with multiple records per subject are used for several things; you 
need to tell me what it is that you want to accomplish.  Multiple records is a 
method, not a goal.
  
  1. Robust variance: If each observation is a separate measurement on the 
subject, with its own covariates, time 0, and endpoint, and you want a "GEE" 
type variance that accounts for the fact that multiple observations are for the 
same subject:
   survreg(Surv(time, exercise) ~ itm + posret + negret + cluster(id), ...
where id is a variable that is unique for unique subjects.
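
As a minimal sketch of case 1, the following fits the same Weibull model with and without cluster(id) on simulated data. The data, the variable names (d, subj_effect, cens) and the parameter values are all made up for illustration; only the survreg/cluster usage comes from the call above. The point estimates should agree, while the standard errors differ because the robust version groups the observations by subject.

```r
library(survival)
set.seed(1)

# Hypothetical data: 60 subjects, 3 separate measurements each, with a shared
# subject-level effect that makes observations within a subject correlated.
n_subj <- 60
n_rep  <- 3
n_obs  <- n_subj * n_rep
id          <- rep(seq_len(n_subj), each = n_rep)
subj_effect <- rep(rnorm(n_subj, sd = 0.5), each = n_rep)
itm         <- rnorm(n_obs)

# AFT-style simulation: log(time) is linear in the covariate.
true_time <- exp(1 + 0.5 * itm + subj_effect) * rweibull(n_obs, shape = 1.5)
cens      <- rexp(n_obs, rate = 0.05)           # independent censoring times
d <- data.frame(id, itm,
                time   = pmin(true_time, cens),
                status = as.integer(true_time <= cens))

# Same model twice: model-based variance versus cluster-robust variance.
fit_naive  <- survreg(Surv(time, status) ~ itm, data = d, dist = "weibull")
fit_robust <- survreg(Surv(time, status) ~ itm + cluster(id), data = d,
                      dist = "weibull")

# Coefficients are identical; only the standard errors change.
print(cbind(naive  = sqrt(diag(fit_naive$var)),
            robust = sqrt(diag(fit_robust$var))))
```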
   
  2. Time dependent covariates: Each subject has one endpoint, but covariates 
change over time.  The bookkeeping for time dependent covariates is reasonably 
straightforward for proportional hazards models, but a major pain for an 
accelerated failure time (AFT) model.  I've thought about it but never 
implemented the feature in survreg, though this may change one day due to the 
increased interest in accelerated aging as a biological model among the 
researchers I work with (but don't hold your breath).  For example, if you 
smoked in your youth but later quit, in an AFT model this 'adds years' to your 
biological age which you never lose; the computer code has to keep track of 
covariate histories.  In a proportional hazards model today's risk = 
function(today's covariates), which is easier.  A Weibull model can be written 
in either AFT or PH form; survreg uses the AFT style.  I don't know which Stata 
uses.
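
For the proportional hazards side of case 2, the usual bookkeeping is one row per subject per interval, in (start, stop] form, passed to coxph. The sketch below is an illustration on simulated data, not something from this thread: the helper simulate_subject, the daily grid, and the values of h0 and beta are all assumptions. It uses a Cox model rather than survreg, since survreg does not support this case.

```r
library(survival)
set.seed(42)

# Hypothetical simulation: one row per subject per day. The covariate is
# redrawn every day, and the chance of an event on a given day depends only
# on that day's covariate value (today's risk = function(today's covariates)).
simulate_subject <- function(id, max_days = 30, h0 = 0.1, beta = 0.7) {
  itm     <- rnorm(max_days)
  p_event <- 1 - exp(-h0 * exp(beta * itm))   # P(event during the day | at risk)
  event   <- rbinom(max_days, 1, p_event)
  # Follow-up stops at the first event, or at max_days if there is none.
  last <- if (any(event == 1)) which(event == 1)[1] else max_days
  days <- seq_len(last)
  data.frame(id = id, tstart = days - 1, tstop = days,
             event = event[days], itm = itm[days])
}
long <- do.call(rbind, lapply(1:200, simulate_subject))

# Counting-process form: each row contributes risk only over (tstart, tstop].
fit_ph <- coxph(Surv(tstart, tstop, event) ~ itm, data = long)
print(coef(fit_ph))
```

Each subject still has at most one event here; the multiple rows only carry the covariate history.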

  3. Multiple events per subject, with a single time scale per subject.  This is 
seen in reliability analysis where hazard = function of age.  Survreg does not 
handle this case either.  
  
  	Terry Therneau
1 day later
#
Dear R users!

I will reformulate the question with another example; perhaps my question will
be clearer now.
 
I have several subjects. Each subject has multiple records. Only a starting
point exists; the end point is vague.
Here is an example:
 
   itm      ID     exercise      time
1.401869    1        0             1
1.324390    1        0             2
1.324390    1        0             3
1.333338    1        0             4
1.346761    1        0             5
1.315441    1        1             6
1.337812    2        0             1
1.319915    2        0             2
1.351235    2        1             3
itm is the covariate;
ID is the subject Id;
exercise indicates whether the subject is dead (1) or alive (0)
 
How can I assign the multiple records to one subject (for example, records
1-6 belong to the subject with ID 1 and records 7-9 belong to the subject with
ID 2) and then run a survival regression?
 
The command
   survRegData <- survreg(formula=Surv(time,exercise)~itm, data=Data, dist="weibull")
doesn't take into account that multiple records are part of one subject.
 
Many thanks!
 
Fabian Hefner
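
One possible reading of the table above is that it is a time dependent covariate problem (one endpoint per subject, itm re-measured at each time). Under that reading, and assuming each record covers the unit interval ending at its time value (an assumption, since the thread does not say so), the rows can be put into (start, stop] form as sketched below. The column names tstart and tstop are made up for illustration. No model is fitted, because two subjects are far too few for a meaningful fit.

```r
library(survival)

# The example table from the message above.
long <- read.table(header = TRUE, text = "
itm      ID exercise time
1.401869 1  0        1
1.324390 1  0        2
1.324390 1  0        3
1.333338 1  0        4
1.346761 1  0        5
1.315441 1  1        6
1.337812 2  0        1
1.319915 2  0        2
1.351235 2  1        3
")

# Assumption: the record at time t describes the interval (t - 1, t].
long$tstart <- long$time - 1
long$tstop  <- long$time

# Counting-process response; the status is 1 only on each subject's last row.
resp <- with(long, Surv(tstart, tstop, exercise))
print(long)
print(resp)
```

With a realistic number of subjects, a response of this form could be used in a proportional hazards model such as coxph(Surv(tstart, tstop, exercise) ~ itm, data = long); survreg does not accept time dependent covariates.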
#
If 'itm' is a covariate with measurement error, then you could also 
have a look at the 'JM' package.

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Fabian Hefner" <fabian-hefner at web.de>
To: "'Terry Therneau'" <therneau at mayo.edu>
Cc: <r-help at r-project.org>
Sent: Wednesday, April 30, 2008 1:25 PM
Subject: [R] Survival Regression with multiple records per subject
#
On Wed 30 Apr 2008, Fabian Hefner wrote:
Hi,

If I understood correctly, the time must be converted to time to death; your table must be:
The problem is your itm covariate; maybe it must be converted to the 
difference between the first value and the last value.
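
One way to read this suggestion is to collapse the records to one row per subject: the last time becomes the time to death, and itm is replaced by the difference between its last and first value. The sketch below is only an interpretation; the name itm_change and the sign convention (last minus first) are assumptions, and it rebuilds Fabian's example table so that it runs on its own.

```r
# Fabian's example table, one row per record.
long <- read.table(header = TRUE, text = "
itm      ID exercise time
1.401869 1  0        1
1.324390 1  0        2
1.324390 1  0        3
1.333338 1  0        4
1.346761 1  0        5
1.315441 1  1        6
1.337812 2  0        1
1.319915 2  0        2
1.351235 2  1        3
")

# Make sure the records are in time order within each subject.
long <- long[order(long$ID, long$time), ]

# Collapse to one row per subject.
collapsed <- do.call(rbind, lapply(split(long, long$ID), function(d) {
  data.frame(ID         = d$ID[1],
             time       = max(d$time),       # time to death (or last follow-up)
             exercise   = max(d$exercise),   # 1 if the subject died
             itm_change = d$itm[nrow(d)] - d$itm[1])  # last minus first value
}))
print(collapsed)
```

A table in this shape has a single record per subject, so the original survreg(Surv(time, exercise) ~ ...) call would apply to it directly, at the cost of discarding the intermediate itm measurements.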

Inte
Ronaldo