By changing three lines in drop1 from access based on $ to access
based on standard accessor methods (terms() and residuals()), it becomes
*much* easier to extend drop1 to work with other model types.
The use of $ rather than accessors in this context seems to be an
oversight rather than a design decision, but maybe someone knows better ...
In particular, if one makes these changes (which I am pretty certain
will not break anything, as the original code basically mimicked the
default methods anyway), it becomes possible to make drop1() work with
mer objects (Doug Bates's new mixed model code) merely by defining:
terms.mer <- function(x, ...) {
attr(x at frame,"terms")
}
extractAIC.default <- function(fit, scale=0, k=2, ...) {
L <- logLik(fit)
edf <- attr(L,"df")
c(edf,-2*L+2*edf)
}
Adding this definition of extractAIC.default also makes drop1() work
with lme fits ...
Comments? Should I submit to the bug database as "enhancement
request"? Are there any hidden downsides to this?
Ben Bolker
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: add_diff.txt
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20110223/0cb0c994/attachment.txt>
request for patch in "drop1" (add.R)
7 messages · Ben Bolker, Peter Dalgaard, Brian Ripley +1 more
Ben Bolker <bbolker at gmail.com>
on Wed, 23 Feb 2011 09:14:37 -0500 writes:
> By changing three lines in drop1 from access based on $
> to access based on standard accessor methods (terms() and
> residuals()), it becomes *much* easier to extend drop1 to
> work with other model types. The use of $ rather than
> accessors in this context seems to be an oversight rather
> than a design decision, but maybe someone knows better ...
> In particular, if one makes these changes (which I am
> pretty certain will not break anything, as the original
> code basically mimicked the default methods anyway), it
> becomes possible to make drop1() work with mer objects
> (Doug Bates's new mixed model code) merely by defining:
> terms.mer <- function(x, ...) {
> attr(x at frame,"terms")
> }
> extractAIC.default <- function(fit, scale=0, k=2, ...) {
> L <- logLik(fit)
> edf <- attr(L,"df")
> c(edf,-2*L+2*edf)
> }
> Adding this definition of extractAIC.default also makes drop1() work
> with lme fits ...
> Comments? Should I submit to the bug database as "enhancement
> request"? Are there any hidden downsides to this?
drawback: a possible very small performance cut for the cases
where the "$" access is ok. But that should not count.
I like the idea.... {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..
I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
fit <- object
fit$new1 <- ...
fit$new2 <- ...
That would have to be changed to something like
fit <- list(obj = object)
fit$new1 <- ...
fit$new2 <- ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?
Martin
> Ben Bolker
> ----------------------------------------------------------------------
> Index: add.R
> ===================================================================
> --- add.R (revision 54562)
> +++ add.R (working copy)
> @@ -330,7 +330,7 @@
> drop1.default <- function(object, scope, scale = 0, test=c("none", "Chisq"),
> k = 2, trace = FALSE, ...)
> {
> - tl <- attr(object$terms, "term.labels")
> + tl <- attr(terms(object), "term.labels")
> if(missing(scope)) scope <- drop.scope(object)
> else {
> if(!is.character(scope))
> @@ -344,7 +344,7 @@
> ans <- matrix(nrow = ns + 1L, ncol = 2L,
> dimnames = list(c("<none>", scope), c("df", "AIC")))
> ans[1, ] <- extractAIC(object, scale, k = k, ...)
> - n0 <- length(object$residuals)
> + n0 <- length(residuals(object))
> env <- environment(formula(object))
> for(i in seq(ns)) {
> tt <- scope[i]
> @@ -356,7 +356,7 @@
> evaluate = FALSE)
> nfit <- eval(nfit, envir=env) # was eval.parent(nfit)
> ans[i+1, ] <- extractAIC(nfit, scale, k = k, ...)
> - if(length(nfit$residuals) != n0)
> + if(length(residuals(nfit)) != n0)
> stop("number of rows in use has changed: remove missing values?")
> }
> dfs <- ans[1L , 1L] - ans[, 1L]
> ----------------------------------------------------------------------
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
On 11-02-23 03:20 PM, Martin Maechler wrote:
Ben Bolker <bbolker at gmail.com>
on Wed, 23 Feb 2011 09:14:37 -0500 writes:
> By changing three lines in drop1 from access based on $
> to access based on standard accessor methods (terms() and
> residuals()), it becomes *much* easier to extend drop1 to
> work with other model types. The use of $ rather than
> accessors in this context seems to be an oversight rather
> than a design decision, but maybe someone knows better ...
> In particular, if one makes these changes (which I am
> pretty certain will not break anything, as the original
> code basically mimicked the default methods anyway), it
> becomes possible to make drop1() work with mer objects
> (Doug Bates's new mixed model code) merely by defining:
> terms.mer <- function(x, ...) {
> attr(x at frame,"terms")
> }
> extractAIC.default <- function(fit, scale=0, k=2, ...) {
> L <- logLik(fit)
> edf <- attr(L,"df")
> c(edf,-2*L+2*edf)
> }
> Adding this definition of extractAIC.default also makes drop1() work
> with lme fits ...
> Comments? Should I submit to the bug database as "enhancement
> request"? Are there any hidden downsides to this?
drawback: a possible very small performance cut for the cases
where the "$" access is ok. But that should not count.
I like the idea.... {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..
I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
fit <- object
fit$new1 <- ...
fit$new2 <- ...
That would have to be changed to something like
fit <- list(obj = object)
fit$new1 <- ...
fit$new2 <- ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?
Martin
Potentially, but I am personally much more interested in enabling
drop1(), which seems to be a much more legitimate tool for testing terms
in models than step(), which is so easy to abuse ...
Ben Bolker
On Feb 23, 2011, at 21:38 , Ben Bolker wrote:
Potentially, but I am personally much more interested in enabling drop1(), which seems to be a much more legitimate tool for testing terms in models than step(), which is so easy to abuse ...
Yes, although repeated use of drop1() easily leads to backwards elimination....
However, I have a different point: To make drop1() a better generic, I suspect that something needs to be done about the test = c("none", "Chisq") bit. It would be nice if the list of possible tests could vary according to model type, e.g. by doing all tests via anova(model1,model2,...). I'm not quite up to figuring out how complicated that would be.
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
residuals() and $residuals are often very different: residuals() is generic, but even the default method is *not* simple extraction. Their values can be of different lengths: think about an lm fit with na.action = na.exclude. That is precisely the sort of thing the tests in add.R were designed to detect, hence the use of $residuals. None of this is used in drop1()! One of the places $residuals was used in that file was the default method for drop1(), others being step() and the default method for add1(). As default methods these have to continue to work for any class of object that has previously been thrown at them over the last 10+ years, and even all CRAN packages will not be a good enough test suite. In any case, the current code in R-devel makes use of the new generic nobs(), which is intended to help sort out the many versions of functions called 'BIC" in packages, but is also useful here. (It is still under development.) terms() is also generic so there is also some danger that its substitution could give an inappropriate result. But as it used in several other places in add.R the breakage will probably occur elsewhere already.
On Wed, 23 Feb 2011, Martin Maechler wrote:
Ben Bolker <bbolker at gmail.com>
on Wed, 23 Feb 2011 09:14:37 -0500 writes:
> By changing three lines in drop1 from access based on $ > to access based on standard accessor methods (terms() and > residuals()), it becomes *much* easier to extend drop1 to > work with other model types. The use of $ rather than > accessors in this context seems to be an oversight rather > than a design decision, but maybe someone knows better ...
> In particular, if one makes these changes (which I am > pretty certain will not break anything, as the original > code basically mimicked the default methods anyway), it > becomes possible to make drop1() work with mer objects > (Doug Bates's new mixed model code) merely by defining:
> terms.mer <- function(x, ...) {
> attr(x at frame,"terms")
> }
> extractAIC.default <- function(fit, scale=0, k=2, ...) {
> L <- logLik(fit)
> edf <- attr(L,"df")
> c(edf,-2*L+2*edf)
> }
> Adding this definition of extractAIC.default also makes drop1() work > with lme fits ...
> Comments? Should I submit to the bug database as "enhancement > request"? Are there any hidden downsides to this?
drawback: a possible very small performance cut for the cases
where the "$" access is ok. But that should not count.
I like the idea.... {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..
I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
fit <- object
fit$new1 <- ...
fit$new2 <- ...
That would have to be changed to something like
fit <- list(obj = object)
fit$new1 <- ...
fit$new2 <- ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?
Martin
> Ben Bolker
> ----------------------------------------------------------------------
> Index: add.R
> ===================================================================
> --- add.R (revision 54562)
> +++ add.R (working copy)
> @@ -330,7 +330,7 @@
> drop1.default <- function(object, scope, scale = 0, test=c("none", "Chisq"),
> k = 2, trace = FALSE, ...)
> {
> - tl <- attr(object$terms, "term.labels")
> + tl <- attr(terms(object), "term.labels")
> if(missing(scope)) scope <- drop.scope(object)
> else {
> if(!is.character(scope))
> @@ -344,7 +344,7 @@
> ans <- matrix(nrow = ns + 1L, ncol = 2L,
> dimnames = list(c("<none>", scope), c("df", "AIC")))
> ans[1, ] <- extractAIC(object, scale, k = k, ...)
> - n0 <- length(object$residuals)
> + n0 <- length(residuals(object))
> env <- environment(formula(object))
> for(i in seq(ns)) {
> tt <- scope[i]
> @@ -356,7 +356,7 @@
> evaluate = FALSE)
> nfit <- eval(nfit, envir=env) # was eval.parent(nfit)
> ans[i+1, ] <- extractAIC(nfit, scale, k = k, ...)
> - if(length(nfit$residuals) != n0)
> + if(length(residuals(nfit)) != n0)
> stop("number of rows in use has changed: remove missing values?")
> }
> dfs <- ans[1L , 1L] - ans[, 1L]
> ---------------------------------------------------------------------- > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On 11-02-23 06:12 PM, Prof Brian Ripley wrote:
residuals() and $residuals are often very different: residuals() is generic, but even the default method is *not* simple extraction. Their values can be of different lengths: think about an lm fit with na.action = na.exclude. That is precisely the sort of thing the tests in add.R were designed to detect, hence the use of $residuals. None of this is used in drop1()! One of the places $residuals was used in that file was the default method for drop1(), others being step() and the default method for add1(). As default methods these have to continue to work for any class of object that has previously been thrown at them over the last 10+ years, and even all CRAN packages will not be a good enough test suite. In any case, the current code in R-devel makes use of the new generic nobs(), which is intended to help sort out the many versions of functions called 'BIC" in packages, but is also useful here. (It is still under development.) terms() is also generic so there is also some danger that its substitution could give an inappropriate result. But as it used in several other places in add.R the breakage will probably occur elsewhere already.
Thanks Prof. Ripley. (I will say that, while I understand why residuals(x) and x$residuals could be different, I am happy that a more transparent form of coding is being introduced ...)
On Wed, 23 Feb 2011, Martin Maechler wrote:
Ben Bolker <bbolker at gmail.com>
on Wed, 23 Feb 2011 09:14:37 -0500 writes:
> By changing three lines in drop1 from access based on $ > to access based on standard accessor methods (terms() and > residuals()), it becomes *much* easier to extend drop1 to > work with other model types. The use of $ rather than > accessors in this context seems to be an oversight rather > than a design decision, but maybe someone knows better ...
> In particular, if one makes these changes (which I am > pretty certain will not break anything, as the original > code basically mimicked the default methods anyway), it > becomes possible to make drop1() work with mer objects > (Doug Bates's new mixed model code) merely by defining:
> terms.mer <- function(x, ...) {
> attr(x at frame,"terms")
> }
> extractAIC.default <- function(fit, scale=0, k=2, ...) {
> L <- logLik(fit)
> edf <- attr(L,"df")
> c(edf,-2*L+2*edf)
> }
> Adding this definition of extractAIC.default also makes drop1() work > with lme fits ...
> Comments? Should I submit to the bug database as "enhancement > request"? Are there any hidden downsides to this?
drawback: a possible very small performance cut for the cases
where the "$" access is ok. But that should not count.
I like the idea.... {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..
I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
fit <- object
fit$new1 <- ...
fit$new2 <- ...
That would have to be changed to something like
fit <- list(obj = object)
fit$new1 <- ...
fit$new2 <- ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?
Martin
> Ben Bolker
>
----------------------------------------------------------------------
> Index: add.R > =================================================================== > --- add.R (revision 54562) > +++ add.R (working copy) > @@ -330,7 +330,7 @@ > drop1.default <- function(object, scope, scale = 0,
test=c("none", "Chisq"),
> k = 2, trace = FALSE, ...)
> {
> - tl <- attr(object$terms, "term.labels")
> + tl <- attr(terms(object), "term.labels")
> if(missing(scope)) scope <- drop.scope(object)
> else {
> if(!is.character(scope))
> @@ -344,7 +344,7 @@
> ans <- matrix(nrow = ns + 1L, ncol = 2L,
> dimnames = list(c("<none>", scope), c("df", "AIC")))
> ans[1, ] <- extractAIC(object, scale, k = k, ...)
> - n0 <- length(object$residuals)
> + n0 <- length(residuals(object))
> env <- environment(formula(object))
> for(i in seq(ns)) {
> tt <- scope[i]
> @@ -356,7 +356,7 @@
> evaluate = FALSE)
> nfit <- eval(nfit, envir=env) # was eval.parent(nfit)
> ans[i+1, ] <- extractAIC(nfit, scale, k = k, ...)
> - if(length(nfit$residuals) != n0)
> + if(length(residuals(nfit)) != n0)
> stop("number of rows in use has changed: remove missing values?")
> }
> dfs <- ans[1L , 1L] - ans[, 1L]
>
----------------------------------------------------------------------
> ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Ben Bolker <bbolker at gmail.com>
on Wed, 23 Feb 2011 19:36:43 -0500 writes:
> On 11-02-23 06:12 PM, Prof Brian Ripley wrote:
>> residuals() and $residuals are often very different:
>> residuals() is generic, but even the default method is *not*
>> simple extraction. Their values can be of different lengths:
>> think about an lm fit with na.action = na.exclude. That is
>> precisely the sort of thing the tests in add.R were designed to
>> detect, hence the use of $residuals.
>>
>> None of this is used in drop1()! One of the places $residuals
>> was used in that file was the default method for drop1(),
>> others being step() and the default method for add1(). As
>> default methods these have to continue to work for any class of
>> object that has previously been thrown at them over the last
>> 10+ years, and even all CRAN packages will not be a good enough
>> test suite.
>>
>> In any case, the current code in R-devel makes use of the new
>> generic nobs(), which is intended to help sort out the many
>> versions of functions called 'BIC" in packages, but is also
>> useful here. (It is still under development.)
>>
>> terms() is also generic so there is also some danger that its
>> substitution could give an inappropriate result. But as it
>> used in several other places in add.R the breakage will
>> probably occur elsewhere already.
:-) yes. I think it is very good change to have drop1.default() and
other R functions that do "generic" model analysis work
via calls to simpler generic functions such as residuals(), or
the new nobs() . Thanks for introducing that, a really good
idea I think (and along which I think I was also musing when I
last looked at the BIC() implementations..).
I'm glad you've started to clean this up nicely.
> Thanks Prof. Ripley. (I will say that, while I understand why
> residuals(x) and x$residuals could be different, I am happy that
> a more transparent form of coding is being introduced ...)
(indeed, see above).
Advertising the use of nobs(), i.e. asking package authors to
write nobs() methods for their models will be probably worth
doing, as soon as 2.13.0 hits the roads..
Martin
>> On Wed, 23 Feb 2011, Martin Maechler wrote:
>>
>>>>>>>> Ben Bolker <bbolker at gmail.com>
>>>>>>>> on Wed, 23 Feb 2011 09:14:37 -0500 writes:
>>>
>>> > By changing three lines in drop1 from access based on $
>>> > to access based on standard accessor methods (terms() and
>>> > residuals()), it becomes *much* easier to extend drop1 to
>>> > work with other model types. The use of $ rather than
>>> > accessors in this context seems to be an oversight rather
>>> > than a design decision, but maybe someone knows better ...
>>>
>>> > In particular, if one makes these changes (which I am
>>> > pretty certain will not break anything, as the original
>>> > code basically mimicked the default methods anyway), it
>>> > becomes possible to make drop1() work with mer objects
>>> > (Doug Bates's new mixed model code) merely by defining:
>>>
>>> > terms.mer <- function(x, ...) {
>>> > attr(x at frame,"terms")
>>> > }
>>>
>>> > extractAIC.default <- function(fit, scale=0, k=2, ...) {
>>> > L <- logLik(fit)
>>> > edf <- attr(L,"df")
>>> > c(edf,-2*L+2*edf)
>>> > }
>>>
>>> > Adding this definition of extractAIC.default also makes
>>> > drop1() work with lme fits ...
>>>
>>> > Comments? Should I submit to the bug database as
>>> > "enhancement request"? Are there any hidden downsides to
>>> > this?
>>>
>>> drawback: a possible very small performance cut for the cases
>>> where the "$" access is ok. But that should not count.
>>>
>>> I like the idea.... {it's a pity that only S3 methods work
>>> that way, because residuals(), terms(), etc... are
>>> unfortunately not general (implicit) S4 generics but just S3
>>> ones..
>>>
>>> I'm currently testing your change, plus some more in step().
>>> However, for step() to work automagically there is more
>>> needed. It currently relies in more places on 'object' being
>>> a list to which you can append new components, basically by
>>> fit <- object fit$new1 <- ... fit$new2 <- ...
>>>
>>> That would have to be changed to something like fit <-
>>> list(obj = object) fit$new1 <- ... fit$new2 <- ... and more
>>> changes where 'fit' has to be replaced by 'fit$obj'. Would
>>> that not be of interest as well?
>>>
>>> Martin
>>>
>>>
>>> > Ben Bolker
>>>
>>> >
>>> ----------------------------------------------------------------------
>>> > Index: add.R
>>> > ===================================================================
>>> > --- add.R (revision 54562) +++ add.R (working copy) @@
>>> > -330,7 +330,7 @@ drop1.default <- function(object, scope,
>>> > scale = 0,
>>> test=c("none", "Chisq"),
>>> > k = 2, trace = FALSE, ...) { - tl <- attr(object$terms,
>>> > "term.labels") + tl <- attr(terms(object), "term.labels")
>>> > if(missing(scope)) scope <- drop.scope(object) else {
>>> > if(!is.character(scope)) @@ -344,7 +344,7 @@ ans <-
>>> > matrix(nrow = ns + 1L, ncol = 2L, dimnames =
>>> > list(c("<none>", scope), c("df", "AIC"))) ans[1, ] <-
>>> > extractAIC(object, scale, k = k, ...) - n0 <-
>>> > length(object$residuals) + n0 <- length(residuals(object))
>>> > env <- environment(formula(object)) for(i in seq(ns)) { tt
>>> > <- scope[i] @@ -356,7 +356,7 @@ evaluate = FALSE) nfit <-
>>> > eval(nfit, envir=env) # was eval.parent(nfit) ans[i+1, ] <-
>>> > extractAIC(nfit, scale, k = k, ...) -
>>> > if(length(nfit$residuals) != n0) +
>>> > if(length(residuals(nfit)) != n0) stop("number of rows in
>>> > use has changed: remove missing values?") } dfs <- ans[1L ,
>>> > 1L] - ans[, 1L]
>>>
>>> >
>>> ----------------------------------------------------------------------
>>> > ______________________________________________
>>> > R-devel at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel