predict()
On Apr 14, 2011, at 16:52 , Ivo Shterev wrote:
Dear Dr. Therneau, Thank you for your response. Just to point out that we didn't experience any problems with the lm() function under R version 2.12.2 (2011-02-25):
I couldn't reproduce it from Terry's description either, but there _is_ an issue which parallels the coxph one: In model.frame.lm, we have
function (formula, ...)
{
dots <- list(...)
nargs <- dots[match(c("data", "na.action", "subset"), names(dots),
0)]
if (length(nargs) || is.null(formula$model)) {
fcall <- formula$call
fcall$method <- "model.frame"
fcall[[1L]] <- as.name("lm")
fcall[names(nargs)] <- nargs
env <- environment(formula$terms)
if (is.null(env))
env <- parent.frame()
eval(fcall, env, parent.frame())
}
else formula$model
}
<environment: namespace:stats>
So under some circumstances, we evaluate formula$call (where "formula" is actually an "lm" object --- and age-old design infelicity) with modified arguments. This evaluation takes place in the environment of the model formula, nested in the parent frame, but neither necessarily contains the arguments to formula$call:
t3
function(dat,fm)
{
model.frame(lm(fm,data=dat),subset=1:5)
}
testdat
otime event x 1 0.84345726 0 -0.4456620 2 0.57661027 0 1.2240818 3 1.32905487 0 0.3598138 4 0.03157736 0 0.4007715 5 0.05621098 0 0.1106827 6 0.31650122 1 -0.5558411 7 0.31422729 1 1.7869131 8 0.14526680 1 0.4978505 9 2.72623646 1 -1.9666172 10 0.02915345 1 0.7013559
fm2
otime ~ x
t3(testdat,fm2)
Error in model.frame(formula = fm, data = dat, subset = 1:5, drop.unused.levels = TRUE) : object 'fm' not found It looks more than a bit iffy to try and set this right. One suggestion could be to explicitly replace the formula part of fcall with the formula contained in the model object. Or, the original evaluation frame should be kept with the model object and used rather than parent.frame(). This should probably move to r-devel.
set.seed(123)
testdat=data.frame(y=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
testfm=as.formula('y~x')
testfun=function(dat,fm)
+ {
+ predict(lm(fm,data=dat),newdata=dat)
+ }
testfun(testdat,testfm)
1 2 3 4 5 6
1.00414971 0.07061335 0.55381658 0.53091759 0.69310319 1.06574974
7 8 9 10
-0.24405980 0.47664172 1.85449993 0.36286396
sessionInfo()
R version 2.12.2 (2011-02-25) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.12.2
Regards Ivo --- On Thu, 4/14/11, Terry Therneau <therneau at mayo.edu> wrote:
From: Terry Therneau <therneau at mayo.edu>
Subject: Re: predict()
To: "Ivo Shterev" <idc318 at yahoo.com>
Cc: r-help at r-project.org
Date: Thursday, April 14, 2011, 2:59 PM
--- begin included message ---
I am experimenting with the function predict() in two
versions of R and
the R extension package "survival".
library(survival)
set.seed(123)
testdat=data.frame(otime=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
testfm=as.formula('Surv(otime,event)~x')
testfun=function(dat,fm)
{
predict(coxph(fm,data=dat),type='lp',newdata=dat)
}
-- end inclusion ----
The question was: this works under survival 2.35-8, but
not survival
2.36-5
Answer: The predict and underlying model.frame functions
for coxph were
brought into line with lm and other models. The major
advantage is that
I now deal with factors and data dependent predictors like
ns()
correctly.
You've shown a disadvantage of which I was not
aware. Using your
example but replacing coxph() by lm() with otime ~x as the
model I get a
similar failure. I'd like to ask a wider audience of
R-devel since it
is bigger than coxph.
Terry Therneau
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com