Message-ID: <AANLkTimysHdZQZiKeTPpCQFQ5Uxs_VOWsSLTjcANRADJ@mail.gmail.com>
Date: 2011-01-26T19:31:56Z
From: Gabor Grothendieck
Subject: aggregate(as.formula("some formula"), data, function) error when called from in a function
In-Reply-To: <99034C83-582C-481B-A91F-60004971AFFB@umd.edu>

On Wed, Jan 26, 2011 at 2:04 PM, Paul Bailey <pdbailey at umd.edu> wrote:
> I'm having a problem with aggregate.formula when I call it in a function and the formula is converted from a string in the function.
>
> I think my problem may also only occur when the left hand side of the formula is cbind(...)
>
> Here is example code that generates a dataset and then the error.
>
> The first function "agg2" fails
>
>> agg2(FALSE)
> do agg 2
> Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
>
> but if I have it return what it is going to pass to aggregate and then pass that myself, it works. I can use this as a workaround (agg3), where a second function makes the aggregate call itself.
>
> I'm confused by the behavior. Is there some way to not have to use a separate function to make the call ?
>
>
> ======================
> # start R code
> # idea: in a function, count the number of instances
> # of some factor (y) associated with another
> # factor (x). aggregate.formula appears to be
> # able to do this... but I have a problem if all of the following:
> # (1) It is called in a function
> # (2) the formula is created using as.formula(character)
> # calling aggregate with the same formula (created with as.formula)
> # outside the function works fine.
> agg2 <- function(test=FALSE) {
>   # create a factor y
>   dat <- data.frame(y=sample(LETTERS[1:3],100,replace=TRUE))
>   # create a factor x
>   dat$x <- sample(letters[1:4],100,replace=TRUE)
>   # make a column of 1s and zeros
>   # 1 when that row has that level of y
>   # 0 otherwise
>   lvls <- levels(dat$y)
>   dat$ya <- 1*(dat[,1] == lvls[1])
>   dat$yb <- 1*(dat[,1] == lvls[2])
>   dat$yc <- 1*(dat[,1] == lvls[3])
>   # this works fine if you give the exact function
>   agg1 <- aggregate(cbind(ya,yb,yc)~x,data=dat,sum)
>   # and fine if you accept
>   fo <- as.formula("cbind(ya,yb,yc)~x")
>   if(test) {
>     return(list(fo=fo,data=dat))
>   }
>   cat("do agg 2\n")
>   agg2 <- aggregate(fo,data=dat,sum)
>   list(agg1,agg2)
> }
> agg2(FALSE)
> ag <- agg2(TRUE)
> ag$fo
> aggregate(ag$fo,ag$data,sum)
>
>
> agg3 <- function() {
>   ag <- agg2(TRUE)
>   ag$fo
>   aggregate(ag$fo,ag$data,sum)
> }
> agg3()
>
> # end R code
> ==============
> Paul Bailey
> University of Maryland

The problem is that the aggregate statement:

agg2 <- aggregate(fo, data = dat, sum)

uses non-standard evaluation: it inspects the symbol fo itself rather
than the formula stored in fo.  This may be a bug in aggregate.formula,
but at any rate you can force fo to be evaluated by replacing that
statement with:

agg2 <- do.call(aggregate, list(fo, data = dat, FUN = sum))
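To see why do.call helps, here is a small self-contained sketch. The helper names peek and count_by_x are made up for illustration and are not part of R or of the original code. peek only reports what a function sees as its first argument via match.call(), which I assume is roughly how aggregate.formula ends up looking at the symbol; count_by_x applies the do.call workaround inside a function. The data frame mirrors the one in the original post, with y made a factor explicitly so that levels() works regardless of the stringsAsFactors default.

```r
# Hypothetical helper: report the class of the first argument as captured
# by match.call(), i.e. before it is evaluated.
peek <- function(formula, ...) class(match.call()$formula)

fo <- as.formula("cbind(ya, yb, yc) ~ x")
seen_direct <- peek(fo)                 # captured as the symbol `fo`
seen_docall <- do.call(peek, list(fo))  # captured as the formula object

# Rebuild data shaped like the example in the original post.
set.seed(1)
dat <- data.frame(
  x = sample(letters[1:4], 100, replace = TRUE),
  y = factor(sample(LETTERS[1:3], 100, replace = TRUE))
)
lvls <- levels(dat$y)
dat$ya <- 1 * (dat$y == lvls[1])
dat$yb <- 1 * (dat$y == lvls[2])
dat$yc <- 1 * (dat$y == lvls[3])

# Hypothetical wrapper: build the formula from a string inside a function
# and pass its value (not the symbol) to aggregate through do.call.
count_by_x <- function(d) {
  f <- as.formula("cbind(ya, yb, yc) ~ x")
  do.call(aggregate, list(f, data = d, FUN = sum))
}

res <- count_by_x(dat)
print(res)
```

Because do.call evaluates the list elements before building the call, the function receives the formula object rather than a variable name, so nothing depends on how the name would be looked up later.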

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com