aggregate.formula implicitly removes rows containing NA

4 messages · Dickison, Daniel, David Winsemius, Peter Ehlers

#
The documentation for `aggregate` makes it sound like aggregate.formula should behave identically to aggregate.data.frame (apart from the way the parameters are passed).  But it looks like aggregate.formula is quietly removing rows where any of the "output" variables (those on the LHS of the formula) are NA.  This differs from how aggregate.data.frame works.  Is this expected behavior?

Here are a couple of examples:
+                 b=c(1,2,NA,3))
  a   b
1 1 1.5
2 2  NA
  a   b
1 1 1.5
2 2 3.0
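
The calls that produced these two tables did not survive in the archive. Here is a minimal sketch that reproduces the same output, assuming the data frame is named `d` and that `a` holds the values 1, 1, 2, 2 (both are guesses, not taken from the original post):

```r
# Hypothetical reconstruction: the name `d` and the values of `a` are assumed,
# chosen so that the results match the tables shown in the post.
d <- data.frame(a = c(1, 1, 2, 2),
                b = c(1, 2, NA, 3))

# data.frame method: the NA stays in group 2, so mean() returns NA there
res_df <- aggregate(d["b"], by = d["a"], FUN = mean)
print(res_df)

# formula method: the default na.action = na.omit drops the NA row
# before grouping, so group 2 is averaged over the single value 3
res_f <- aggregate(b ~ a, data = d, FUN = mean)
print(res_f)
```

The two calls differ only in how rows containing NA are handled before `FUN` is applied.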

It's removing whole rows even if just one of the columns is NA, i.e.:
+                 b=c(1,2,NA,3),
+                 c=c(NA,2,3,NA))
  a b c
1 1 2 2
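
The call for this second example is also missing from the archive. One way to get the same single-row result, again assuming `a` is 1, 1, 2, 2 and using the hypothetical name `d2`, is a two-column LHS built with `cbind`:

```r
# Hypothetical reconstruction: `d2` and the values of `a` are assumed.
d2 <- data.frame(a = c(1, 1, 2, 2),
                 b = c(1, 2, NA, 3),
                 c = c(NA, 2, 3, NA))

# Only row 2 (a = 1, b = 2, c = 2) has no NA in any formula variable,
# so it is the only row left after the default na.omit step.
res2 <- aggregate(cbind(b, c) ~ a, data = d2, FUN = mean)
print(res2)
```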

Daniel
#
On Jan 11, 2011, at 5:41 PM, Dickison, Daniel wrote:

            
The help page for aggregate gives the calling defaults for
aggregate.formula as:

## S3 method for class 'formula'
aggregate(formula, data, FUN, ..., subset, na.action = na.omit)

So the behavior you describe seems to adhere to what I would have
expected (had I initially read the help page).
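
To see what that default does to the rows, here is a small sketch that inspects the method's argument list and applies `na.omit` directly. The data frame `d2` is an assumption: it mirrors the second example in the thread, with a guessed `a` column.

```r
# Assumed data: the `a` column is a guess, `b` and `c` are from the post.
d2 <- data.frame(a = c(1, 1, 2, 2),
                 b = c(1, 2, NA, 3),
                 c = c(NA, 2, 3, NA))

# The default stored in the formula method's argument list
default_na <- formals(utils::getS3method("aggregate", "formula"))$na.action
print(default_na)

# na.omit() drops every row that has an NA in any column
kept <- na.omit(d2)
print(kept)
```

Because the rows are filtered before grouping, one NA in any variable used in the formula is enough to drop the whole row.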
#
On 2011-01-11 14:41, Dickison, Daniel wrote:
Try setting na.action = na.pass.
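
A short sketch of that suggestion, using an assumed data frame `d` (the `a` values are a guess that matches the output in the original post):

```r
# Assumed data: `a` is a guess, `b` is taken from the post.
d <- data.frame(a = c(1, 1, 2, 2),
                b = c(1, 2, NA, 3))

# na.pass keeps the NA rows, so the formula method now behaves like
# the data.frame method: group 2's mean is NA
res_pass <- aggregate(b ~ a, data = d, FUN = mean, na.action = na.pass)
print(res_pass)

# Extra arguments are forwarded to FUN, so na.rm = TRUE makes mean()
# skip NAs one column at a time instead of dropping whole rows
res_rm <- aggregate(b ~ a, data = d, FUN = mean,
                    na.action = na.pass, na.rm = TRUE)
print(res_rm)
```

With several LHS columns, the second form keeps a row's non-missing values in play even when another column in that row is NA.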

Peter Ehlers
#
Oh wow, that would be it. Not sure how I missed that. Thanks for the tip.

On Jan 11, 2011, at 18:56, "David Winsemius" <dwinsemius at comcast.net> wrote: