
"By" function Frame Conversion (with Multiple Indices)

3 messages · Ray DiGiacomo, Jr., David Winsemius, arun

#
On Jan 3, 2013, at 9:00 PM, Ray DiGiacomo, Jr. wrote:

            
I named the dataframe "dat"

aggregate( dat[, c("weight", "height")],
           list( age=dat$age, gender=dat$gender ),
           FUN=mean, na.rm=TRUE )
   age gender weight height
1  12            98     66
2   5      f     40     40
3   6      f     42    NaN
4  13      f    100     67
5  22      m    180     72
6  50      m    255     60
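
A note on the blank gender in row 1 above: a minimal sketch, assuming the posted data are read with read.table(). The object names csv, dat_blank, dat_na, agg_blank and agg_na are made up for illustration. With the default na.strings, the empty gender on record 5 is read as the string "" and forms its own group. With na.strings="" it is read as NA, and aggregate() omits rows whose grouping values are missing.

```r
# Sample data copied from the original post.
csv <- "id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m"

# Default na.strings: the blank gender is kept as the empty string "".
dat_blank <- read.table(text = csv, sep = ",", header = TRUE,
                        stringsAsFactors = FALSE)

# na.strings = "": the blank gender becomes NA instead.
dat_na <- read.table(text = csv, sep = ",", header = TRUE,
                     stringsAsFactors = FALSE, na.strings = "")

# Hypothetical helper wrapping the aggregate() call shown above.
agg <- function(d) {
  aggregate(d[, c("weight", "height")],
            list(age = d$age, gender = d$gender),
            FUN = mean, na.rm = TRUE)
}

agg_blank <- agg(dat_blank)  # includes the age-12 group with gender ""
agg_na    <- agg(dat_na)     # the age-12 group is omitted (gender is NA)
```

Which behaviour is wanted depends on whether a missing gender should count as a group of its own.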
#
Hi,
You could try this:
dat1<-read.table(text="
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
",sep=",",header=TRUE,stringsAsFactors=FALSE,na.strings="")
list1<-by(dat1[c("weight","height")],dat1[c("age","gender")],colMeans,na.rm=TRUE,simplify=FALSE)
list2<-split(dat1,list(dat1$age,dat1$gender))
names(list1)<-names(list2)

res<-do.call(rbind,list1) 
res2<-cbind(read.table(text=row.names(res),sep=".",header=FALSE,stringsAsFactors=FALSE),res)
colnames(res2)[1:2]<-c("age","gender")
row.names(res2)<-1:nrow(res2)
res2
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  13      f    100     67
#4  22      m    180     72
#5  50      m    255     60
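
A possible variation on the approach above, offered only as a sketch: instead of parsing the row names, the age and gender values can be taken from the dimnames of the by() result. The names dat, b, keys, keep and res are made up for illustration. It assumes the blank gender is read as NA, so that group is dropped, as in the output above.

```r
# Sample data from the original post; the blank gender is read as NA.
dat <- read.table(text = "id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m", sep = ",", header = TRUE,
  stringsAsFactors = FALSE, na.strings = "")

# Same by() call as in the original question.
b <- by(dat[c("weight", "height")], dat[c("age", "gender")],
        colMeans, na.rm = TRUE)

# All age x gender combinations, in the same order as the cells of b
# (the first index varies fastest).
keys <- expand.grid(dimnames(b), stringsAsFactors = FALSE)

# Cells for combinations that never occur are NULL; flag the rest.
keep <- !vapply(b, is.null, logical(1))

# rbind() ignores the NULL cells, so its rows line up with keys[keep, ].
res <- cbind(keys[keep, , drop = FALSE], do.call(rbind, b))
res$age <- as.numeric(res$age)  # dimnames are character strings
rownames(res) <- NULL
res
```

This avoids splitting names on ".", which could misbehave if a grouping value itself contained a dot.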


library(plyr) 
ddply(dat1,.(age,gender),colwise(mean,c("weight","height")),na.rm=TRUE) 

#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  12   <NA>     98     66 #prints groups which are missing
#4  13      f    100     67
#5  22      m    180     72
#6  50      m    255     60
A.K.




----- Original Message -----
From: "Ray DiGiacomo, Jr." <rayd at liondatasystems.com>
To: R Help <r-help at r-project.org>
Cc: 
Sent: Friday, January 4, 2013 12:00 AM
Subject: [R] "By" function Frame Conversion (with Multiple Indices)

Hello,

I have the following dataset. Please note that there are missing values on
records 4 and 5:

id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m

I'm using the "By" function like this:

list1 <- by(dataset[c("weight", "height")],
            dataset[c("age", "gender")],
            colMeans,
            na.rm = TRUE)

I then convert the list above to a frame like this:

as.data.frame( do.call(rbind, list1) )

I get this output from the code above:

  weight height
1     40     40
2     42    NaN
3    100     67
4    180     72
5    255     60

I want to get the output above, but I also want two additional columns
named "age" and "gender" (with the age and gender values from the "By"
function output).

How would I do this?

Best Regards,

Ray DiGiacomo, Jr.
Healthcare Predictive Analytics Specialist
President, Lion Data Systems LLC
President, The Orange County R User Group
Board Member, TDWI
rayd at liondatasystems.com
(m) 408-425-7851
San Juan Capistrano, California USA
twitter.com/liondatasystems
linkedin.com/in/raydigiacomojr
youtube.com/user/liondatasystems/videos
liondatasystems.com/courses
