perhaps 'aggregate()' (was: How to write efficient R code)

1 message · Tom Blackwell

Sebastian and Andy  -

Yes, Andy has read the question correctly.  A similar task that
I do quite often is to subtract the mean of a class from all of
the members of the class, and do this within every column of a
(numeric) data frame.  Kurt Hornik's function  aggregate()  is
the one to use.  Here's an example using the same data set that
he uses in the example on the help page.  (Only the commands are
shown here.  You'll have to try them to see the output, because
I cannot cut and paste easily into my email.)

data(state)
ls()
	#  This data set puts individual columns into your workspace,
	#  rather than making a data frame of them.

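example <- data.frame(state.abb, state.name, state.region, state.x77)
str(example)
	#  Columns 1-3 are labels; columns 4-11 are the numeric state.x77 data.
means   <- aggregate(example[ ,3+seq(8)], list(example[ ,3]), mean)
str(means)
	#  One row per region: the grouping column first, then the 8 means.
residuals <- example[ ,3+seq(8)] - means[as.numeric(example[ ,3]), -1]
	#  as.numeric() on the region factor picks each state's row of 'means'.
result  <- cbind(example[ ,seq(3)], residuals)
str(result)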

 -- Ah, I think this example would be easier to read if I had used
the columns from the workspace directly, rather than packaging them
into a data frame 'example' first, then using numeric subscripts on
the data frame.  But, at least this illustrates some common ways of
subscripting subsets of columns from a data frame.
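
For comparison, here is a rough sketch of what the version using the
workspace columns directly might look like.  It was not part of the
original message; the names 'idx' and 'centered' are made up for
illustration, and it assumes the objects loaded by data(state) have
not been changed.

```r
data(state)   # loads state.abb, state.name, state.region, state.x77, ...

#  Mean of every column of the numeric matrix state.x77, by region.
means <- aggregate(as.data.frame(state.x77),
                   list(region = state.region), mean)

#  Level code of each state's region; used to repeat the matching
#  row of 'means' once per state.
idx <- as.numeric(state.region)

#  Subtract each state's region mean, column by column.
#  The first column of 'means' holds the grouping labels, so drop it.
centered <- state.x77 - as.matrix(means[idx, -1])

result <- data.frame(state.abb, state.name, state.region, centered)
str(result)
```

Working on the matrix avoids the 3+seq(8) column arithmetic, at the
cost of one as.matrix() conversion.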

Again, see  help("aggregate"), help("Subscript")  to see what I am
doing here.
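
The subscripting step is the one most likely to look odd, so here is
a small toy illustration of it.  The objects 'f' and 'm' are invented
for this sketch and stand in for the region factor and the table of
means above.

```r
#  A factor with three levels, sorted alphabetically: "a", "b", "c".
f <- factor(c("b", "a", "b", "c"))

#  One row per level, in level order, as aggregate() would return.
m <- data.frame(Group.1 = levels(f), x = c(10, 20, 30))

as.numeric(f)            # level codes: 2 1 2 3

#  Using the codes as a row subscript repeats each level's row once
#  for every member of that level; -1 drops the label column.
m[as.numeric(f), -1]     # 20 10 20 30
```

This only works because the rows of 'm' are in the same order as the
levels of 'f', which is the order aggregate() produces when it is
grouped by that same factor.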

-  best  -  tom blackwell  -  u michigan medical school  -  ann arbor  -

(Ah, I see that Andy has just replied this morning as well.  I'll see
what his reply was as soon as I send off this one.)
On Tue, 17 Feb 2004, Sebastian Luque wrote: