Summing by index
On Jul 30, 2010, at 2:41 PM, steven mosher wrote:
# build a sample data frame illustrating the problem
ids<-c(rep(1234,5),rep(5436,3),rep(7864,4))
years<-c(seq(1990,1994,by=1),seq(1991,1993,by=1),seq(1990,1993,by=1))
data<-seq(14,25,by=1)
data[6]<-NA
DF<-data.frame(Id=ids,Year=years,Data=data)
DF
     Id Year Data
1  1234 1990   14
2  1234 1991   15
3  1234 1992   16
4  1234 1993   17
5  1234 1994   18
6  5436 1991   NA
7  5436 1992   20
8  5436 1993   21
9  7864 1990   22
10 7864 1991   23
11 7864 1992   24
12 7864 1993   25
# The result wanted is a sum of DF$Data, by DF$Id: collect the sum of $Data
# for each $Id
# the result would take the form
# Id, sum for each Id
# Try using BY
result<-by(DF$Data,INDICES=Data$Id,FUN=sum,na.rm=T)
Try instead: result<-by(DF$Data,INDICES=DF$Id,FUN=sum,na.rm=T)
David.

> Error in names(IND) <- deparse(substitute(INDICES))[1L] :
>   'names' attribute [1] must be the same length as the vector [0]
> idx<-as.list(Data$Id)
>
> idx2<-list(1234,1234,1234,1234,1234,5436,5436,5436,7864,7864,7864,7864)
> result<-by(DF$Data,INDICES=idx,FUN=sum,na.rm=T)
> result
> [1] 215
> result<-by(DF$Data,INDICES=idx2,FUN=sum,na.rm=T)
> Error in tapply(1L:12L, list(1234, 1234, 1234, 1234, 1234, 5436, 5436, :
>   arguments must have same length
>> idx
> list()
>> idx[1]
> [[1]]
> NULL
>
>> idx2
> [[1]]
> [1] 1234
>
> [[2]]
> [1] 1234
>
> [[3]]
> [1] 1234
>
> [[4]]
> [1] 1234
>
> [[5]]
> [1] 1234
>
> [[6]]
> [1] 5436
>
> [[7]]
> [1] 5436
>
> [[8]]
> [1] 5436
>
> [[9]]
> [1] 7864
>
> [[10]]
> [1] 7864
>
> [[11]]
> [1] 7864
>
> [[12]]
> [1] 7864
>
> aggregate(DF$Data, by=idx2,sum,na.rm=T)
> Error in aggregate.data.frame(as.data.frame(x), ...) :
>   arguments must have same length
>
> ################################
>
> The instruction that the INDICES must have the same length is confusing me.
> the number of indices will always be less than the number of rows because
> the indices are repeated, we want to sum over multiple instances of the
> indices to collect the Sum by index. I'm confused.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
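
On the "same length" question, a minimal sketch may help. It assumes the sample DF from the top of the thread; the names good, bad, sums, agg and res are made up for illustration and appear nowhere in the original post. The point it tries to show is that the grouping argument is read as a list of grouping vectors, each as long as the data, so list(DF$Id) (one component of length 12) is accepted, while a list of twelve single numbers like idx2 is taken as twelve separate grouping vectors of length 1 and rejected.

```r
# Rebuild the sample data frame from the original post.
ids   <- c(rep(1234, 5), rep(5436, 3), rep(7864, 4))
years <- c(1990:1994, 1991:1993, 1990:1993)
vals  <- seq(14, 25, by = 1)
vals[6] <- NA
DF <- data.frame(Id = ids, Year = years, Data = vals)

# A valid grouping argument: a list holding ONE vector of length nrow(DF).
good <- list(Id = DF$Id)
length(good)         # 1
length(good[[1]])    # 12

# The shape of idx2 in the post: TWELVE components, each of length 1.
bad <- as.list(DF$Id)
length(bad)          # 12

# Sum Data within each Id, dropping the NA in group 5436.
sums <- tapply(DF$Data, good, sum, na.rm = TRUE)
sums                 # 1234: 80, 5436: 41, 7864: 94

# The same sums as a data frame with columns Id and Data.
# The formula method drops rows with NA by default (na.action = na.omit).
agg <- aggregate(Data ~ Id, data = DF, FUN = sum)
agg

# The twelve-component list fails with "arguments must have same length".
res <- try(tapply(DF$Data, bad, sum, na.rm = TRUE), silent = TRUE)
inherits(res, "try-error")   # TRUE
```

tapply() is used here only because its return value is a plain named vector; the by() call with INDICES=DF$Id suggested above groups the rows in the same way.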