remove rows in data frame by average

Hi,

Maybe this helps:

dat1<- read.table(text="
Subject Block Trial Feature1 Feature2
1 1 1 48 40
1 1 2 62 18
1 2 1 34 43
1 2 2 51 34
1 3 1 64 14
",sep="",header=TRUE)

res1<-do.call(rbind,lapply(split(dat1,dat1$Block),function(x) data.frame(unique(x[,1:2]),t(colMeans(x[,-c(1:3)])))))
res1
#  Subject Block Feature1 Feature2
#1       1     1     55.0     29.0
#2       1     2     42.5     38.5
#3       1     3     64.0     14.0

#With multiple subjects:
dat2<- read.table(text="
Subject Block Trial Feature1 Feature2
1 1 1 48 40
1 1 2 62 18
1 2 1 34 43
1 2 2 51 34
1 3 1 64 14
2 1 1 48 35
2 1 2 54 15
2 2 1 49 50
2 2 2 64 40
2 3 1 38 28
",sep="",header=TRUE)

res2<-do.call(rbind,lapply(split(dat2,list(dat2$Subject,dat2$Block)),function(x) data.frame(unique(x[,1:2]),t(colMeans(x[,-c(1:3)])))))
res2<-do.call(rbind,split(res2,res2$Subject))
res2
#  Subject Block Feature1 Feature2
#1       1     1     55.0     29.0
#2       1     2     42.5     38.5
#3       1     3     64.0     14.0
#4       2     1     51.0     25.0
#5       2     2     56.5     45.0
#6       2     3     38.0     28.0

A.K.

----- Original Message -----
From: Johannes Brand <brandjohannes at gmx.de>
To: r-help at r-project.org
Cc: 
Sent: Thursday, February 21, 2013 12:02 PM
Subject: [R] remove rows in data frame by average

Dear all,

I have a data frame, which looks like this:

Subject | Block | Trial | Feature1 | Feature2 ....
1 | 1 | 1 | ... | ...
1 | 1 | 2 | ... | ...
1 | 2 | 1 | ... | ...
1 | 2 | 2 | ... | ...
1 | 3 | 1 | ... | ...
...| ...| ...| ... | ...

Can I remove the "Trial" column by averaging all the rows and without using
a "for loop"?

At the end my data frame should look like this:

Subject | Block | Feature1 | Feature2 ....
1 | 1 | ... | ...
1 | 2 | ... | ...
1 | 3 | ... | ...
...| ...| ... | ...

Thank you a lot for your help.

Best,
Johannes

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Many find the functions in the plyr package more convenient to use than the
do.call(rbind, lapply(split(...),...) business:

  > library(plyr)
  > ddply(dat1, .(Subject,Block),  summarize, MeanFeature1=mean(Feature1), MeanFeature2=mean(Feature2))
    Subject Block MeanFeature1 MeanFeature2
  1       1     1         55.0         29.0
  2       1     2         42.5         38.5
  3       1     3         64.0         14.0

Change the calls to 'mean' to calls to other summary functions like 'sum' or 'max' as you wish. 
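
As a small illustration of swapping the summary functions, the sketch below applies 'max' to Feature1 and 'sum' to Feature2 on the two-subject data (dat2 from earlier in the thread). The output column names MaxFeature1 and SumFeature2 are made up for this example, and the code assumes the plyr package is installed.

```r
library(plyr)  # assumption: plyr is installed

# Two-subject example data from earlier in the thread
dat2 <- read.table(text = "
Subject Block Trial Feature1 Feature2
1 1 1 48 40
1 1 2 62 18
1 2 1 34 43
1 2 2 51 34
1 3 1 64 14
2 1 1 48 35
2 1 2 54 15
2 2 1 49 50
2 2 2 64 40
2 3 1 38 28
", header = TRUE)

# One summary function per output column; any function that reduces
# a vector to a single value can be used in place of max or sum.
# MaxFeature1 and SumFeature2 are illustrative names only.
res <- ddply(dat2, .(Subject, Block), summarize,
             MaxFeature1 = max(Feature1),
             SumFeature2 = sum(Feature2))
res
```

Each Subject/Block combination ends up as one row, so the Trial column disappears just as it does with 'mean'.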

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of arun
Sent: Thursday, February 21, 2013 1:45 PM
To: Johannes Brand
Cc: R help
Subject: Re: [R] remove rows in data frame by average


Apropos something less complex than "the do.call( lapply( split...)) business":
`aggregate` provides the same sort of operation when the same function is to be applied to all columns:

aggregate(dat1[, c("Feature1", "Feature2")], dat1[, c("Subject", "Block")], FUN=mean)
  Subject Block Feature1 Feature2
1       1     1     55.0     29.0
2       1     2     42.5     38.5
3       1     3     64.0     14.0

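
When there are many feature columns, the formula interface of `aggregate` avoids listing them by name. A minimal sketch, assuming every column other than Subject, Block and Trial is numeric and should be averaged (the names `keep` and `res` are only for illustration):

```r
# Two-subject example data from earlier in the thread
dat2 <- read.table(text = "
Subject Block Trial Feature1 Feature2
1 1 1 48 40
1 1 2 62 18
1 2 1 34 43
1 2 2 51 34
1 3 1 64 14
2 1 1 48 35
2 1 2 54 15
2 2 1 49 50
2 2 2 64 40
2 3 1 38 28
", header = TRUE)

# Drop Trial so that "." on the left-hand side means "all feature columns"
keep <- setdiff(names(dat2), "Trial")

# Average every remaining column within each Subject/Block combination
res <- aggregate(. ~ Subject + Block, data = dat2[, keep], FUN = mean)

# Sort by Subject, then Block, to match the layout asked for
res <- res[order(res$Subject, res$Block), ]
res
```

Note that the formula method drops rows containing NA by default (na.action = na.omit), so data with missing values may need separate handling.
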
David

David Winsemius
Alameda, CA, USA