Message-ID: <971536df1001032014x705e5adjeabd2ee6b97cce43@mail.gmail.com>
Date: 2010-01-04T04:14:38Z
From: Gabor Grothendieck
Subject: function in aggregate applied to specific columns only
In-Reply-To: <5E722D67-76E1-49A7-AE5B-590090A37CD2@acad.umass.edu>

Here are 6 ways:

1. aggregate

> aggregate(basicSub["score"], basicSub["student"], mean)
  student score
1       1  55.0
2       2  60.0
3       3  67.5
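
If gender should be carried along in the result (assuming, as the question does, that it never varies within a student), one option is the formula interface of aggregate, grouping by both columns. This is only a minimal sketch; it rebuilds basicSub from the original post so that it runs on its own, and the name res is just an illustrative choice:

```r
# Rebuild the data frame from the original question
gender  <- factor(c("m", "m", "f", "f", "m"))
student <- c(1, 2, 3, 3, 1)
score   <- c(50, 60, 70, 65, 60)
basicSub <- data.frame(student, gender, score)

# Group by student and gender; only score is averaged.
# Assumption: gender is constant within each student, so this
# still yields one row per student.
res <- aggregate(score ~ student + gender, data = basicSub, FUN = mean)
print(res)
```

If gender did vary within a student, this would return one row per student/gender combination rather than one row per student.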

2. tapply

> with(basicSub, tapply(score, student, mean))
   1    2    3
55.0 60.0 67.5

3. summaryBy in doBy package

> library(doBy)
> summaryBy(. ~ student, basicSub)
  student score.mean
1       1       55.0
2       2       60.0
3       3       67.5

4. sqldf in sqldf package.  Uses SQL:

> library(sqldf)
> sqldf("select student, avg(score) from basicSub group by student")
  student avg(score)
1       1       55.0
2       2       60.0
3       3       67.5

5. summary.formula in Hmisc

> library(Hmisc)
> summary(score ~ student, basicSub)
score    N=5

+-------+-+-+-----+
|       | |N|score|
+-------+-+-+-----+
|student|1|2|55.0 |
|       |2|1|60.0 |
|       |3|2|67.5 |
+-------+-+-+-----+
|Overall| |5|61.0 |
+-------+-+-+-----+

6. plyr (see Dennis Murphy's solution in this thread)
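
For readers without the rest of the thread at hand, a plyr version might look roughly like the sketch below. It is not necessarily the same as Dennis Murphy's solution; it assumes the plyr package is installed, rebuilds basicSub from the original post, and uses byStudent as an illustrative name:

```r
library(plyr)  # assumption: plyr is installed

# Rebuild the data frame from the original question
gender  <- factor(c("m", "m", "f", "f", "m"))
student <- c(1, 2, 3, 3, 1)
score   <- c(50, 60, 70, 65, 60)
basicSub <- data.frame(student, gender, score)

# Split by student, compute the mean score for each piece, and
# combine the pieces back into a single data frame.
# Using .(student, gender) instead would keep gender in the result.
byStudent <- ddply(basicSub, .(student), summarise, score = mean(score))
print(byStudent)
```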


On Sun, Jan 3, 2010 at 10:46 PM, david hilton shanabrook
<dhshanab at acad.umass.edu> wrote:
> I want to use aggregate with the mean function on specific columns
>
> gender <- factor(c("m", "m", "f", "f", "m"))
> student <- c(0001, 0002, 0003, 0003, 0001)
> score <- c(50, 60, 70, 65, 60)
> basicSub <- data.frame(student, gender, score)
> basicSubMean <- aggregate(basicSub, by=list(basicSub$student), FUN=mean, na.rm=TRUE)
>
> This doesn't work, one cannot take the mean of a factor (gender). Is there any way of specifying which columns to use for the mean? I want to aggregate by student, obtaining mean scores, and assume any other factors are unchanging in a specific student, ie. gender.
>
> Thanks
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>