Skip to content

Bug in points.formula (PR#6652)

3 messages · Martin Maechler, Brian Ripley, Peter Dalgaard

#
BeT> Dear all, I noticed the following bug in points.formula

    <...>
    BeT> Error in if (length(x) == l) x[s] else x : argument is
    BeT> of length zero

Thank you, Berwin.
The bug is still present in current "R-1.9.0 alpha"
and can be demonstrated with the slightly simpler code

roller <-
    data.frame(weight = c(1.9, 3.1, 3.3, 4.8, 5.3, 6.1, 6.4, 7.6, 9.8, 12.4),
               depression = c(2, 1, 5, 5, 20, 20, 23, 10, 30, 25))
plot( depression ~ weight, data=roller, type="n")
with(roller, points( depression~weight, subset=1:7 )) # ok
with(roller, points( depression~weight, subset=8:10)) # ok
with(roller, points( depression~weight, subset=8:10, col=2))## --> Error

    BeT> This seems to be due to the fact that the data used was
    BeT> attached before the call and not specified in the call
    BeT> via the "data" argument. 

Yes, using "data =" doesn't show the problem,
but as you see, the problem also shows for the more recommended
way of using with() instead of attach()..

    BeT> The following patch seems to fix the problem.

*** /home/berwin/lang/R/Courses/3S6/plot.R.orig	Tue Sep  2 23:43:42 2003
--- /home/berwin/lang/R/Courses/3S6/plot.R	Mon Mar  8 16:06:32 2004
***************
*** 312,318 ****
      mf <- eval(m, parent.frame())
      if (!missing(subset)) {
  	s <- eval(m$subset, data, parent.frame())
! 	l <- nrow(data)
  	dosub <- function(x) if (length(x) == l) x[s] else x
  	dots <- lapply(dots, dosub)
      }
--- 312,320 ----
      mf <- eval(m, parent.frame())
      if (!missing(subset)) {
  	s <- eval(m$subset, data, parent.frame())
!         mtmp <- m
!         mtmp$subset <- NULL
! 	l <- nrow(eval(mtmp, parent.frame()))
  	dosub <- function(x) if (length(x) == l) x[s] else x
  	dots <- lapply(dots, dosub)
      }

-------------------

Yes, but the resulting code becomes even more ugly.
doing a second  eval(m, ..) just with a slightly modified 'm'...

Though I'll have to leave it to others to find if there is a
better patch...

Thanks again, Berwin!
Martin
#
What I think we need is the number of rows of the model frame *before
subsetting*, so that any further arguments (and there are none such in
Martin's example) of the same length as the response get subsetted.

Berwin's fix is ugly, and I think it can be avoided by using model.frame
to do the work by manipulating ... rather carefully so e.g. col gets read 
from the model frame and hence subsetted -- that would also allow col to 
be written in terms of other variables in the model frame.

I need to spend more time thinking about this in detail.

Brian
On Mon, 8 Mar 2004 maechler@stat.math.ethz.ch wrote:

            

  
    
#
Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:
It's a can of worms... It opens up the whole nonstandard evaluation
mess again. One problem is that a "col" (or pch or cex, etc) of
length one might not be in "data" (I remember Brian at the time being
a source of good arguments as to why you might want it to be so).
Another is that short vectors may be intended to be recycled (this is
a bit tenuous, but it was put forward at the time and is the reason
why the code doesn't just test for length one args). 

Actually Berwin's fix is not too bad (at least it is to the point),
and I suggest we use it for now rather than try to wrestle the entire
beast (umm, can of worms...). If we want to keep some clarity, we can
keep what we have and special-case is.null(l).