mean for subset
Here is the solution using sqldf, which can do it in one statement:
# read in data
Lines <- "OBS NAME SCORE
1 Tom 92
2 Tom 88
3 Tom 56
4 James 85
5 James 75
6 James 32
7 Dawn 56
8 Dawn 91
9 Clara 95
10 Clara 84"
DF <- read.table(textConnection(Lines), header = TRUE)
# run
library(sqldf)
sqldf("select NAME, avg(SCORE) from DF group by NAME having count(*) = 3")
   NAME avg(SCORE)
1 James   64.00000
2   Tom   78.66667

On Tue, Jan 5, 2010 at 2:03 PM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
Have a look at this post and the rest of that thread:
https://stat.ethz.ch/pipermail/r-help/2010-January/223420.html

On Tue, Jan 5, 2010 at 1:29 PM, Geoffrey Smith <gps at asu.edu> wrote:
Hello, does anyone know how to take the mean for a subset of observations? For example, suppose my data looks like this:

OBS  NAME   SCORE
1    Tom    92
2    Tom    88
3    Tom    56
4    James  85
5    James  75
6    James  32
7    Dawn   56
8    Dawn   91
9    Clara  95
10   Clara  84

Is there a way to get the mean of the SCORE variable by NAME, but only when the number of observations is equal to 3? In other words, is there a way to get the mean of the SCORE variable for Tom and James, but not for Dawn and Clara? Thank you.

--
Geoffrey Smith
Visiting Assistant Professor
Department of Finance
W. P. Carey School of Business
Arizona State University
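
For comparison, the question quoted above can probably also be answered in base R, without sqldf. What follows is only a sketch and not part of the original replies: it rebuilds the example data by hand with data.frame(), and the names counts, keep and res are made up for illustration. The idea is to count the rows per NAME with table(), keep the names that occur exactly 3 times, and then take the mean with aggregate().

```r
# Rebuild the example data so the sketch runs on its own
# (assumption: same values as in the question)
DF <- data.frame(
  OBS   = 1:10,
  NAME  = c("Tom", "Tom", "Tom", "James", "James", "James",
            "Dawn", "Dawn", "Clara", "Clara"),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

# Count observations per NAME and keep the names that occur exactly 3 times
counts <- table(DF$NAME)
keep   <- names(counts)[counts == 3]

# Mean of SCORE by NAME, restricted to the names kept above
res <- aggregate(SCORE ~ NAME, data = DF[DF$NAME %in% keep, ], FUN = mean)
res
```

On the example data this should give 64 for James and about 78.67 for Tom, the same as the sqldf query, while Dawn and Clara are dropped because they have only 2 observations each.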