How to calculate the stratified means in a data frame?

4 messages · Frank Duan, Brian Ripley, Peter Dalgaard +1 more

#
Dear R people,

I have a simple question. Suppose I have a data.frame with two
variables, one factor (x) and one numeric (y), and I want to calculate
the mean of y for each value of x. Although it's easy to do within a
for loop, I believe there may be a more concise way using some kind
of "apply" function. Could anyone tell me how to do that? Thank you.

Frank
#
On Thu, 18 Nov 2004, Frank Duan wrote:

            
tapply(y, x, mean)  # which _is_ in `An Introduction to R', BTW

?by
?aggregate

for more sophisticated packaging of such ideas.
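
A minimal sketch of the three suggestions side by side. The data frame `d` and its values are made up for illustration and do not come from the thread; only the column names x and y follow the question.

```r
# Hypothetical data: 'd' and its values are invented for this example.
d <- data.frame(
  x = factor(c("a", "a", "b", "b", "c")),
  y = c(1, 3, 10, 20, 7)
)

# tapply(): returns a named array with one mean per level of x
m_tapply <- tapply(d$y, d$x, mean)

# aggregate(): returns a data frame with one row per level of x
m_agg <- aggregate(y ~ x, data = d, FUN = mean)

# by(): returns an object of class "by", printed group by group
m_by <- by(d$y, d$x, mean)

print(m_tapply)
print(m_agg)
```

tapply() is the most compact when a plain vector of means is enough; aggregate() may be more convenient when the result should stay a data frame.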
#
Frank Duan <fhduan at gmail.com> writes:
tapply() will do that. (help(tapply), look at the "presidents" example).
#
On Thu, 2004-11-18 at 15:34 -0500, Frank Duan wrote:
One way is to use by(). Using the 'iris' dataset to get the means for
Sepal.Length by Species:

> with(iris, by(Sepal.Length, Species, mean))
INDICES: setosa
[1] 5.006
------------------------------------------------------ 
INDICES: versicolor
[1] 5.936
------------------------------------------------------ 
INDICES: virginica
[1] 6.588

See ?by, also ?tapply and ?aggregate.

Note also the use of with() as a wrapper, in lieu of attach() here.
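
As a sketch of that last point, the same group means can also be computed with tapply() inside with(). with() evaluates the expression with the columns of 'iris' visible, so the search path is left untouched, unlike with attach(). The variable names `sepal_means` and `sepal_means2` are only illustrative.

```r
# with() makes the columns of iris visible for this one expression,
# so no attach()/detach() pair is needed.
sepal_means <- with(iris, tapply(Sepal.Length, Species, mean))
print(sepal_means)

# Equivalent call without with(), naming the data frame each time
sepal_means2 <- tapply(iris$Sepal.Length, iris$Species, mean)
```

The values should match the by() output above (5.006, 5.936, 6.588), returned as a named array rather than a printed "by" object.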

HTH,

Marc Schwartz