How do I specify a partially completed survival analysis model?

8 messages · RWilliam, David Winsemius, Marc Schwartz

#
Hello,

I just started using R to do epidemiologic simulation research using the Cox
proportional hazards model. I have 2 covariates, X1 and X2, which I want to
model as h(t,X)=h0(t)*exp(b1*X1+b2*X2). I assume independence of X from t.

After I simulate the Time and Censor data vectors, denoting the censoring time
and status respectively, I can call the following function to fit the Cox
model to the data (a is a data.frame containing 4 columns: X1, X2, Time and
Censor):
b <- coxph(Surv(Time, Censor) ~ X1 + X2, data = a, method = "breslow")

The purpose of the simulation is that I have another mechanism for
generating the number b2. Given b2 (say it's 4.3), I want to fit the Cox
model to estimate b1 and check how feasible the new model is. Thus, my
question is: how do I specify such a partially completed model (that is,
one where b2 is known)? I tried things like Surv(Time,Censor)~X1+4.3*X2, but
it's not working. Thanks very much.
#
Sorry for being impatient but is there really no way of doing this at all?
It's quite urgent so any help is very much appreciated. Thank you.
#
On Nov 20, 2009, at 9:46 AM, RWilliam wrote:

            
The general method with glm's to specify a model with fixed
coefficients is to use an offset. I believe that the coxph function
also has that facility, and I seem to remember that Therneau uses offsets
in some of the examples he offers in his books and technical reports.

Perhaps:
cmod <- coxph( Surv(Time,Censor)~X1, offset=4.3*X2, data= <dfname>  )

Further requests about specifics should be accompanied (as suggested  
by the Posting Guide) by some code that sets up a reproducible example.
#
On Nov 20, 2009, at 9:57 AM, David Winsemius wrote:

            
Or much more likely:
cmod <- coxph( Surv(Time,Censor)~X1, offset=log(4.3*X2), data= <dfname>  )

I forgot what scale I should be thinking on. Sorry.
#
In reply to the suggestion by David W.: setting an offset parameter doesn't seem
to work, as R does not recognize the "X2" part of
coxph( Surv(Time,Censor)~X1, offset=log(4.3*X2), data= a ). Also, here's some
sample data:

   X1         X2      Time Censor
1   1 0.40619454  77.00666      0
2   1 0.20717868 100.00000      0
3   1 0.77360963  79.03463      1
4   1 0.62221954 100.00000      0
5   1 0.32191280 100.00000      0
6   1 0.73790704  72.84842      0
7   1 0.65012237 100.00000      0
8   1 0.71596105 100.00000      0
9   1 0.74787202  84.00172      0
10  1 0.66803790  41.65760      0
11  1 0.79922364  92.41999      0
12  1 0.76433736  90.99983      0
13  1 0.57014524 100.00000      0
14  1 0.39642235 100.00000      0
15  1 0.55756045 100.00000      0
16  0 0.60079340 100.00000      0
17  0 0.43630695 100.00000      0
18  0 0.09388013 100.00000      0
19  0 0.55956791 100.00000      0
20  0 0.52491597  97.71884      1

where we set the coefficient of X2 to be 8.
#
On Nov 20, 2009, at 11:07 AM, RWilliam wrote:

            
The problem, which arose because I had no dataset against which to
test my memory of the syntactic niceties, is that glm and coxph use
different methods of supplying offsets. Therneau and Grambsch's book has
examples, but if you did not have the book you still had alternatives.
A bit of searching with the terms "coxph Therneau offset" produced
lots of hits for the occurrence of offset in warning messages, so
adding -warning to that search then produced a hit to the Google Books
view of T&G's text with a worked example:

 > a$logX2 <- log(a$X2)
 > coxph(Surv(Time,Censor)~X1 + offset(logX2), data= a )
Call:
coxph(formula = Surv(Time, Censor) ~ X1 + offset(logX2), data = a)


      coef exp(coef) se(coef)     z    p
X1 -0.885     0.413     1.43 -0.62 0.54

#Or just:

 > coxph(Surv(Time,Censor)~X1 + offset(log(4.3*X2)), data= a )
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
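
A note on scale, with a minimal sketch: an offset() term is added to the
linear predictor with its coefficient fixed at 1. Under the model stated in
the original question, h(t,X)=h0(t)*exp(b1*X1+b2*X2), fixing b2 at 4.3
would therefore correspond to offset(4.3 * X2); offset(log(4.3*X2)) would
instead multiply the hazard by 4.3*X2 itself. The simulated data below
(sample size, baseline hazard h0, "true" b1, and censoring at t = 100) are
assumptions made purely for illustration and are not from the thread.

```r
library(survival)

set.seed(1)
n  <- 500
X1 <- rbinom(n, 1, 0.5)   # binary covariate (assumed design)
X2 <- runif(n)            # continuous covariate on (0, 1)
b1 <- 0.7                 # assumed "true" value, to be estimated
b2 <- 4.3                 # treated as known

# Constant baseline hazard (assumption): h(t|X) = h0 * exp(b1*X1 + b2*X2)
h0     <- 0.01
Tevent <- rexp(n, rate = h0 * exp(b1 * X1 + b2 * X2))
Time   <- pmin(Tevent, 100)            # administrative censoring at t = 100
Censor <- as.integer(Tevent <= 100)    # 1 = event observed, 0 = censored
a <- data.frame(X1, X2, Time, Censor)

# Only b1 is estimated; 4.3 * X2 enters the linear predictor as a fixed term
fit <- coxph(Surv(Time, Censor) ~ X1 + offset(4.3 * X2),
             data = a, method = "breslow")
coef(fit)
```

The fitted object reports a single coefficient, for X1; the X2 term does not
appear in the coefficient table because nothing is estimated for it.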
3 days later
#
On Nov 20, 2009, at 1:27 PM, David Winsemius wrote:

            
It's been pointed out to me that coxph()'s required syntactic
incorporation of offsets is the same as glm()'s preferred inclusion in
the formula, and that my erroneous impression that a separate offset
argument is necessary might have been the result of "SAS poisoning".

I suspect that "infection" is the more correct biomedical analogy,  
since I copied my use from another who was probably the index case.  
That usage was also similar to the separate specification of offsets  
(e.g.  $CAL LPY=%LOG(PY) $OFFSET LPY) in GLIM which was my statistical  
upbringing.
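
For readers comparing the two styles on the glm() side, here is a small
sketch: glm() accepts an offset either as a term inside the formula or
through its separate offset argument, and the two fits should agree. The
Poisson person-years data are simulated under assumed values (intercept -3,
slope 0.5), and the names x, py, and y are hypothetical.

```r
set.seed(2)
n  <- 200
x  <- rnorm(n)
py <- runif(n, 50, 150)                          # person-years of exposure
y  <- rpois(n, lambda = py * exp(-3 + 0.5 * x))  # event counts

# Offset written inside the formula
fit_formula <- glm(y ~ x + offset(log(py)), family = poisson)

# Offset passed through glm()'s separate argument
fit_arg <- glm(y ~ x, offset = log(py), family = poisson)

# Both specifications fit the same model
all.equal(coef(fit_formula), coef(fit_arg))
```

Keeping the offset in the formula has the practical advantage that the term
travels with the model specification when predicting on new data.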
#
On Nov 23, 2009, at 12:50 PM, David Winsemius wrote:

            
Would that be SAS1N1 and is there a vaccine that one can distribute to  
universities and corporations to prevent the spread of the infection?

;-)

Regards,

Marc Schwartz