Can the by() function return a single column?

4 messages · Vassilis, Gerrit Eichner, David Winsemius

#
I would like to de-mean the 'vector' column of the following dataframe by
factor:

set.seed(5444)
vector  <- rnorm(10)
factor  <- rep(1:2, 5)
test.df <- data.frame(factor, vector)

which is:

   factor     vector
1       1 -0.4963935
2       2 -2.0768182
3       1 -1.5822224
4       2  0.8025474
5       1  0.3504199
6       2  0.2358464
7       1 -0.3989443
8       2 -0.3692544
9       1 -0.3174586
10      2  1.4305431

Using the by() command, I get:
test.df$factor: 1
[1] -0.007473699 -1.093302612  0.839339673  0.089975488  0.171461151
-------------------------------------------------------------------------------------------------- 
test.df$factor: 2
[1] -2.0813911  0.7979745  0.2312735 -0.3738272  1.4259702
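
The call that produced this output is not shown above. A minimal sketch that gives output of the same shape, assuming by() was applied to the 'vector' column with an anonymous de-meaning function (the name demeaned is hypothetical):

```r
# Rebuild the example data from the post
set.seed(5444)
vector  <- rnorm(10)
factor  <- rep(1:2, 5)
test.df <- data.frame(factor, vector)

# Subtract the group mean within each level of 'factor';
# the result is a "by" object with one element per group,
# not a vector aligned with the rows of test.df
demeaned <- by(test.df$vector, test.df$factor, function(x) x - mean(x))
demeaned
```
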

My question is: is there a way to put this output back into the
dataframe? I.e., to make by(), or some other command, return a vector of
length 10 whose values x' correspond to x'_1 = x_1 - mean(x | factor1), x'_2
= x_2 - mean(x | factor2),...

Thanks in advance for the help, and apologies for the poor notation. 

Vassilis
#
Hello, Vassilis,

maybe ave() does what you want.
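
The exact call suggested here is not preserved in this copy of the thread; the follow-up messages indicate it used ave(). A minimal sketch of such a call, assuming a de-meaning function is passed as FUN and the result is stored in a new column (vector.dm is a hypothetical name):

```r
# Rebuild the example data from the original post
set.seed(5444)
vector  <- rnorm(10)
factor  <- rep(1:2, 5)
test.df <- data.frame(factor, vector)

# ave() applies FUN within each group and returns a vector as long as
# its input, in the original row order, so it can be assigned directly
test.df$vector.dm <- with(test.df,
                          ave(vector, factor, FUN = function(x) x - mean(x)))

# Each group's de-meaned values should average to (numerically) zero
tapply(test.df$vector.dm, test.df$factor, mean)
```

Unlike by(), the result here is already aligned with the rows of the data frame, which is what the question asks for.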


-- Gerrit
On Wed, 15 Dec 2010, Vassilis wrote:

            
#
Hi Gerrit, 

This does exactly what I want, thank you very much! 

What's more, I notice that ave() uses the split/unsplit functions under the
hood. These are very useful tools, as they make it possible to apply even
more complicated functions on a factor-by-factor basis.
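
A rough sketch of that split/apply/unsplit pattern done by hand, assuming the same example data (pieces, centered, and vector.dm are hypothetical names):

```r
# Rebuild the example data from the original post
set.seed(5444)
vector  <- rnorm(10)
factor  <- rep(1:2, 5)
test.df <- data.frame(factor, vector)

# 1. Split the values into a list with one element per factor level
pieces <- split(test.df$vector, test.df$factor)

# 2. Apply any function group by group; here, subtract the group mean
centered <- lapply(pieces, function(x) x - mean(x))

# 3. Reassemble the pieces in the original row order
test.df$vector.dm <- unsplit(centered, test.df$factor)
```

Step 2 can hold any function that returns one value per input element, which is where this pattern is more flexible than a single built-in summary.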

best,

Vassilis
#
On Dec 15, 2010, at 11:19 AM, Gerrit Eichner wrote:

            
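There is also the scale() function, which can be called with its
parameters set to center but not scale the results. Because ave() treats
every argument in ... as a grouping variable, scale=FALSE has to be set
inside a wrapper function rather than passed to ave() directly:

  # center=TRUE is the default
  with( test.df, ave( vector, factor, FUN=function(x) scale(x, scale=FALSE)))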