
Message-ID: <492484067.5300589.1459115133468.JavaMail.zimbra@arpa.veneto.it>
Date: 2016-03-27T21:45:33Z
From: Massimo Bressan
Subject: 'split-lapply' vs. 'aggregate'

This might be a trivial question (sorry if so!), but I really cannot figure out the problem here...

Please consider the following reproducible example: why do 'split-lapply' and 'aggregate' give different results?
I have also checked against other methods (e.g. data.table, dplyr), and the results were always consistent with 'split-lapply' but apparently not with 'aggregate'.

I must be doing something wrong!
Could someone point me in the right direction?

Thanks

## 

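# split by month, then take per-column means (NAs are dropped column by column)
s <- split(airquality, airquality$Month)
ls <- lapply(s, function(x) colMeans(x[c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
do.call(rbind, ls)

# slightly different results with aggregate (columns 4 and 6, Temp and Day, are dropped)
aggregate(. ~ Month, airquality[-c(4, 6)], mean, na.rm = TRUE)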

## 
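
A possible explanation, offered as a sketch rather than a definitive diagnosis: the formula method of 'aggregate' has the default na.action = na.omit, so every row with an NA in any of the variables is removed before 'mean' is called, whereas 'colMeans(..., na.rm = TRUE)' removes NAs one column at a time. The object names below (vars, by_col, agg_pass, agg_default, by_cc, cc) are made up for illustration; only base R and the 'airquality' data set are assumed.

```r
# Base R only; airquality ships with the datasets package.
vars <- c("Ozone", "Solar.R", "Wind")

# Per-column NA removal, as in the split-lapply version.
by_col <- do.call(rbind, lapply(split(airquality, airquality$Month),
                                function(x) colMeans(x[vars], na.rm = TRUE)))

# Formula method with NAs passed through, so mean() drops them per column.
agg_pass <- aggregate(. ~ Month, airquality[-c(4, 6)], mean,
                      na.rm = TRUE, na.action = na.pass)

# Default formula method: rows with any NA are dropped first (na.omit).
agg_default <- aggregate(. ~ Month, airquality[-c(4, 6)], mean, na.rm = TRUE)

# The same thing done by hand: keep complete rows, then split-lapply.
cc <- airquality[complete.cases(airquality[vars]), ]
by_cc <- do.call(rbind, lapply(split(cc, cc$Month),
                               function(x) colMeans(x[vars])))

# Compare values only (row and column names differ between the two shapes).
all.equal(unname(as.matrix(agg_pass[vars])), unname(by_col))
all.equal(unname(as.matrix(agg_default[vars])), unname(by_cc))
```

If this reading is right, the Wind column is the clearest symptom: it has no missing values itself, yet its monthly means change under the default because rows are discarded for NAs in Ozone or Solar.R.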
