Skip to content

further notes on model.frame issue

2 messages · Terry Therneau, Charles C. Berry

#
This is a follow-up on my note of Saturday.  Let me start with two important 
clarifications
    - I think this would be a nice addition, but I've had exactly one use for it 
in the 15+ years of developing the survival package.  
    - I have a work around for the current case.   
Prioritize accordingly.

The ideal would be to change survexp as follows:
    fit <- survexp( ~ gender, data=mydata, ratetable=survexp.us,
    	ratevar=list(sex=gender, year=enroll.dt, age=age*365.25))
 
The model statement says that I want separate curves by gender, and is similar 
to other model statements.

The ratevar option gives the mapping between my variable names and the dimnames 
of the survexp.us rate table.  It wants age in days, enrollment date to be some 
sort of date object, and sex to be a factor.  Then the heading of the R code 
would be
	m <- match.call()
	m <- m[c(1, match(names(m), c('data','formula','na.action', 'subset',
		                      'weights', 'ratevar'), nomatch=0)
        m[[1]] <- as.name('model.frame')
        m <- eval.parent(m)

That is, the variables enroll.dat and age are searched for in the data= arg. 
This is like the start opion in glm, but a more complex result than a vector.

  The model.frame function can't handle this.  (Splus fails too, same spot, less 
useful error message).  
  
-----

  The current code uses 
  	fit <- survexp(~ gender + ratetable(sex=gender, year=enroll.dt, 
age=age*365.25), 
  	      data=mydata, ratetable=survexp.us)
 
The ratetable function creates a matrix with extra attributes. The matrix 
contains as.numeric of the factors with the levels remembered as an extra 
attribute, and also looks out for dates.  So the result is like ns() in the eyes 
of model.frame, and it works.  But having to write gender twice on the rhs is 
confusing to users.  

    Thanks in advance for any comments.
    
    	Terry Therneau
#
On Mon, 19 Jan 2009, Terry Therneau wrote:

            
Maybe I am missing something. Why not do something like this?

ratetable <- cbind ## for illustration purposes

foo <- function(formula,ratevars,data,...){
   m <- match.call()
   mfun <- m$ratevars
   mfun[[1]] <- as.name('ratetable')
   m$ratevars <- NULL
   frm <- eval(m$formula)
   frm <- update(frm,as.formula( paste("~.+",deparse(substitute(mfun)))))
   m$formula <- frm
   m[[1]] <- as.name('model.frame')
   m <- eval(m,parent.frame())
   m ## to show that model frame obeys
}

dat <- data.frame(diag(4))
foo(~X1+X2,list(sex=X3,year=X4),dat)

HTH,

Chuck
Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901