adding by categories in a data frame???

Hi,
dat1<- read.table(text="
name   number
a            2
a            3
b            5
b            7
c            9
c            1
",sep="",header=TRUE,stringsAsFactors=FALSE)


aggregate(number~name,data=dat1,sum)
#  name number
#1    a      5
#2    b     12
#3    c     10

#or
library(plyr)
ddply(dat1,.(name),summarize,Sum_Number=sum(number))
#  name Sum_Number
#1    a          5
#2    b         12
#3    c         10

#or
library(data.table)
dt1<- data.table(dat1)

dt1[,list(Sum_Number=sum(number)),by=name]
#   name Sum_Number
#1:    a          5
#2:    b         12
#3:    c         10
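
If you would rather not load any extra packages, base R has a couple of other ways to get the same sums. The following is only a sketch on the same toy data (rebuilt with data.frame() so the block runs on its own); the names sums_tapply and sums_rowsum are made up for illustration.

```r
# Same toy data as above, rebuilt here so this block is self-contained
dat1 <- data.frame(name = c("a", "a", "b", "b", "c", "c"),
                   number = c(2, 3, 5, 7, 9, 1),
                   stringsAsFactors = FALSE)

# tapply() splits 'number' by 'name' and applies sum() to each piece;
# the result is a named 1-d array
sums_tapply <- tapply(dat1$number, dat1$name, sum)
sums_tapply
#  a  b  c 
#  5 12 10 

# rowsum() does the same job and returns a one-column matrix
# whose row names are the group labels
sums_rowsum <- rowsum(dat1$number, dat1$name)
```

Both results are keyed by name rather than being data frames, so aggregate() is still the more convenient choice when a data frame is needed downstream.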


##Speed comparison:
set.seed(1254)
name<- sample(letters,1e6,replace=TRUE)
number<- sample(1:10,1e6,replace=TRUE)

datTest<- data.frame(name,number,stringsAsFactors=FALSE)

system.time(res1<-aggregate(number~name,data=datTest,sum))
#   user  system elapsed 
#  2.184   0.000   1.772 


system.time(res2<-ddply(datTest,.(name),summarize,Sum_Number=sum(number)))
#   user  system elapsed 
#  0.256   0.000   0.227 

dtTest<- data.table(datTest)

system.time(res3<- dtTest[,list(Sum_Number=sum(number)),by=name])
#   user  system elapsed 
#  0.084   0.000   0.066 
names(res1)[2]<- names(res2)[2]
identical(res1,res2)
#[1] TRUE
res3New<- res3[order(name),]
identical(res1,as.data.frame(res3New))
#[1] TRUE




#to get descriptive statistics
by(dat1[,2],dat1[,1],summary)

#or
library(psych)
describeBy(dat1[,2],dat1[,1],mat=TRUE)
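
If only a handful of statistics per group are needed, aggregate() can also return several at once. A minimal sketch on the toy data; the choice of statistics is an assumption on my side, and stats / stats_flat are illustrative names.

```r
# Same toy data as above, rebuilt here so this block is self-contained
dat1 <- data.frame(name = c("a", "a", "b", "b", "c", "c"),
                   number = c(2, 3, 5, 7, 9, 1),
                   stringsAsFactors = FALSE)

# FUN returns a named vector, so 'number' comes back as a matrix column
stats <- aggregate(number ~ name, data = dat1,
                   FUN = function(x) c(n = length(x), mean = mean(x),
                                       sd = sd(x), sum = sum(x)))

# Flatten the matrix column into ordinary columns
# (number.n, number.mean, number.sd, number.sum)
stats_flat <- do.call(data.frame, stats)
stats_flat
```

The do.call(data.frame, ...) step is only there because aggregate() stores the multi-value result as a single matrix column, which is awkward to index by name.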

A.K.


Hello. I have a big table and need to have descriptive statistics by sub-groups of data. 
For example: 
name   number
a            2
a            3
b            5
b            7
c            9
c            1
How can I get/print a table that shows the sum of numbers for each name?
a = 5 
b = 12 
c = 10 
Thank you!!!