
Possible bug in termplot function (stats package) ?

6 messages · Peter Dalgaard, Joris Meys, Thomas Lumley

#
Hi all,

I noticed some very odd behaviour in the termplot function of the
stats package due to the following lines :

18.    if (is.null(data))
19.       data <- eval(model$call$data, envir)

This one will look in the global environment, and renders the two
lines after this

20.   if (is.null(data))
21.        data <- mf

completely obsolete. If nothing is found, an error is returned. If
anything is found, data won't be NULL, so line 20, when reached, will
always return FALSE. Can it be that lines 18 and 19 should be removed
from the function?

This causes problems especially when termplot is called from other plot
functions on models made with wrapper functions. One example:

Data <- data.frame(
 x1=rnorm(100),
 x2=rnorm(100,3,2),
 y=rnorm(100)
)
form <- as.formula(y~x1+x2)
test <- lm(form, data=Data)
termplot(test)

wrapper <- function(ff,x){
 tt <- lm(ff,data=x)
}
test2 <- wrapper(form,Data)
termplot(test2)

For the non-smooth terms, termplot is called. In the first example,
this works perfectly well. In the second example, it either returns "x
not found" (when there is no x variable in the global) or "x2 not
found" when there is an x variable.
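A minimal sketch of why the second call fails, assuming termplot's default envir (the environment of the model formula, which is the global environment when form is created at top level). The stored call records only the wrapper's argument name, not the data frame itself:

```r
set.seed(1)
Data <- data.frame(x1 = rnorm(100), x2 = rnorm(100, 3, 2), y = rnorm(100))
form <- as.formula(y ~ x1 + x2)

wrapper <- function(ff, x) {
  lm(ff, data = x)
}
test2 <- wrapper(form, Data)

# The call stored in the fit refers to the wrapper's local argument name.
test2$call$data            # the symbol x
is.name(test2$call$data)

# The formula was created outside the wrapper, so its environment is the
# place where form was defined, not the wrapper's frame.
env <- environment(formula(test2))

# Evaluating the symbol there fails unless an unrelated x happens to exist,
# in which case that unrelated object is picked up instead.
res <- try(eval(test2$call$data, env), silent = TRUE)
inherits(res, "try-error")
```

If an unrelated x does exist in that environment, the lookup succeeds but returns the wrong object, which would explain the "x2 not found" variant of the error.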

If both lines mentioned earlier are removed from the function, it works
as expected in this example code. Using the model frame seems the
logical choice here; I have no clue why one would want to look in the
global environment for the data related to a model.

Cheers
Joris
#
On Jun 6, 2011, at 17:15 , Joris Meys wrote:

            
I think this is a false assumption. What keeps model$call$data from being NULL? 

No comments on the remainder, except that it wouldn't be the first time a wrapper function got into trouble with environments and modelling functions...

  
    
#
On Mon, Jun 6, 2011 at 6:29 PM, peter dalgaard <pdalgd at gmail.com> wrote:
**snip**
Apart from a dataframe that is explicitly assigned NULL, I can't
imagine a case where model$call$data would be NULL. If it's not found,
the statement returns an error. If it is found and it is NULL, your
model call will have thrown an error earlier, so you won't even have
an object to plot. If you can give me one example where that code
actually makes sense, I'll be very happy. But right now, it doesn't
make any sense at all to me.
My wrapper function returns a completely sound lm-object. Why wouldn't
I expect a function built to work on lm-objects to work on an
lm-object? At least the help files should note that the dataframe will
be sought in the calling environment, or it won't work on your fitted
objects.

Right now I have to do something like :

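termplot.wrapper <- function(x,...){
    x$call$data <- NULL   # drop the stored data symbol so termplot falls back to the model frame
    termplot(x, ...)      # pass any further arguments on to termplot
}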

which seems at least a tiny bit awkward...
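Another possible workaround, sketched here on the assumption that termplot's own data argument is acceptable in this situation: if data is supplied by the caller, the is.null(data) branch quoted at the start of the thread is never reached, so the stored call is not evaluated. The null PDF device is only there so the sketch runs without opening a plot window.

```r
set.seed(1)
Data <- data.frame(x1 = rnorm(100), x2 = rnorm(100, 3, 2), y = rnorm(100))
form <- as.formula(y ~ x1 + x2)

wrapper <- function(ff, x) {
  lm(ff, data = x)
}
test2 <- wrapper(form, Data)

pdf(NULL)                      # null device: nothing is written to disk
res <- try(termplot(test2, data = Data), silent = TRUE)
invisible(dev.off())

inherits(res, "try-error")     # whether the call raised an error
```

This leaves the fitted object untouched, at the cost of the caller having to keep the original data frame around.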

Cheers
Joris
#
On Jun 6, 2011, at 20:38 , Joris Meys wrote:

            
I'd say that the burden of proof is really on your side, but how hard can it be:
lm(formula = y ~ x)
NULL
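The two lines above look like output from a session whose input lines were lost. A sketch that produces the same output, with hypothetical variable names and simulated data, assuming the model was fitted without a data argument:

```r
set.seed(1)
x <- rnorm(20)
y <- rnorm(20)

fit <- lm(y ~ x)   # variables found in the calling environment, no data argument

fit$call           # lm(formula = y ~ x)
fit$call$data      # NULL, because the call has no data component

# eval(NULL, envir) is NULL, so the second is.null(data) check is reached
# and data falls back to the model frame.
is.null(eval(fit$call$data, environment(formula(fit))))
```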
#
On Mon, Jun 6, 2011 at 9:15 PM, peter dalgaard <pdalgd at gmail.com> wrote:
I see... indeed, thx for the answer and sorry for my stupidity. Should
have thought about that case. Still, I wonder why it is necessary to
go look for the data in a calling environment if it should be
contained in the model frame of the fitted object. Or am I again wrong
in assuming that's always the case?

Cheers
Joris
#
On Tue, Jun 7, 2011 at 7:30 AM, Joris Meys <jorismeys at gmail.com> wrote:
You are again wrong. Life would be much simpler if the data were
always available like that, but there are at least two problems.

1) There need not be a model frame in the fitted object. (it's optional)

2) More importantly, if you have y~sin(x), the model frame will
contain sin(x), not x. For what termplot() does, it has to be able to
reconstruct 'x', which isn't possible without the original data.
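Both points can be checked with a small sketch on simulated data (the data frame d and the object names are hypothetical):

```r
set.seed(1)
d <- data.frame(x = runif(30, 0, 2 * pi))
d$y <- sin(d$x) + rnorm(30, sd = 0.1)

# Point 2: the model frame stores the evaluated term, not the raw variable.
fit <- lm(y ~ sin(x), data = d)
names(model.frame(fit))           # "y" "sin(x)"
"x" %in% names(model.frame(fit))  # FALSE

# Point 1: with model = FALSE no model frame is stored in the fitted object.
fit_nomf <- lm(y ~ sin(x), data = d, model = FALSE)
is.null(fit_nomf$model)           # TRUE
```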

It's quite possible that termplot() could be rewritten to be simpler
and more general, but I don't think minor editing will do it.

    -thomas