Grouping and Computing - R-help

Thu, Feb 7, 2002 6:03 AM #

Hi group,

To mention it in advance, I am an R newbie, and most likely my question is
really a mix of smaller, simpler tasks. Anyway, I got mixed up between by,
select, aggregate, lapply etc.
My problem is as follows:

I have read data in and transformed them into a matrix, for no special reason
so far. This matrix contains a column by which I would like to group, i.e.
each distinct value in that column defines one group. Neither the number of
occurrences nor the values themselves are known in advance, which seems to
be the major problem. For each group separately, I would then like to compute
an aggregate, namely the sum of the ratio of two columns. These
sums should be stored in another vector.

My two questions are then

- Which object type (matrix, dataframe, list) lends itself to such a
problem?
- Do I have to create different objects for the groups, or can I compute the
vector of sums directly? And how?
 
Thanks in advance

Alexander Hener


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Kaspar Pflugshaupt

Thu, Feb 7, 2002 7:30 AM #

On 7.2.2002 at 15:03, alexander.hener at gmx.de wrote:

It's probably easiest with a data frame, but you can also use a matrix.

Do it directly, by all means. You can use any of tapply(), by() or
aggregate():

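Example data (a grouping column a and a numeric column b):

   a           b
1  a  0.23158790
2  a -0.38852120
    [snip]
9  b -1.81645407
10 b -0.44034004

Result from tapply():

       a        b 
1.282057 1.511260 

Result from by():

INDICES: a
[1] 1.282057
------------------------------------------------------------
INDICES: b
[1] 1.51126

Result from aggregate():

  Group.1        x
1       a 1.282057
2       b 1.511260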


See the functions' help texts and examples for further information. For me,
tapply() does all I need.
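
For readers who want to reproduce this, here is a minimal sketch of the three
calls. The data frame dat and its columns grp and val are made-up names for
illustration, and sum is assumed as the aggregation function because that is
what the original question asked for; it is not necessarily the call used for
the output above.

```r
# Hypothetical example data: 'dat', 'grp' and 'val' are illustrative names,
# not taken from the original post.
dat <- data.frame(grp = c("a", "a", "a", "b", "b"),
                  val = c(1, 2, 3, 10, 20))

# tapply(): returns a named 1-d array, one element per group
res_tapply <- tapply(dat$val, dat$grp, sum)         # a = 6, b = 30

# by(): returns an object of class "by", printed one group at a time
res_by <- by(dat$val, dat$grp, sum)

# aggregate(): returns a data frame with columns Group.1 and x
res_agg <- aggregate(dat$val, list(dat$grp), sum)

print(res_tapply)
print(res_by)
print(res_agg)
```

All three compute the same per-group sums; they differ only in the shape of
the result, so the choice mostly depends on what you want to do with it next.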

Cheers

Kaspar Pflugshaupt

Kaspar Pflugshaupt
Geobotanisches Institut
Zuerichbergstr. 38
CH-8044 Zuerich

Tel. ++41 1 632 43 19
Fax  ++41 1 632 12 15

mailto:pflugshaupt at geobot.umnw.ethz.ch
privat:pflugshaupt at mails.ch
http://www.geobot.umnw.ethz.ch


Joerg Maeder

Thu, Feb 7, 2002 8:23 AM #

Hello Alexander,

the function you are looking for is tapply(), and it works with data frames
and matrices.

Here is a small example:
d <- data.frame(group=c(1,4,5,4,5,2,1),value=c(1,54,2,6,87,4,6))
tapply(d$value,d$group,mean)  # arguments: the data, the groups, the function

will produce
   1    2    4    5 
 3.5  4.0 30.0 44.5 

You can also use factors to define the groups. This has the advantage
that you can give them real names:
d <- data.frame(group=c('John','Jack','Joan','Jack','Jack','Joan','John'),
                value=c(1,54,2,6,87,4,6))
tapply(d$value,d$group,mean)

will produce
Jack Joan John 
49.0  3.0  3.5
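
Applied to the original question (the sum of the ratio of two columns within
each group), the same idea might look like the sketch below. The matrix m and
its column names grp, num and den are hypothetical; the point is only that
tapply() finds the groups itself, so neither their number nor their values
need to be known in advance.

```r
# Hypothetical data: the column names 'grp', 'num' and 'den' are
# illustrative, not taken from the original post.
m <- cbind(grp = c(1, 4, 5, 4, 5, 2, 1),
           num = c(2, 9, 4, 6, 8, 3, 5),
           den = c(1, 3, 2, 2, 4, 1, 5))

# Directly on the matrix: sum of num/den within each value of grp.
sums_m <- tapply(m[, "num"] / m[, "den"], m[, "grp"], sum)

# Same computation after converting the matrix to a data frame.
d <- as.data.frame(m)
sums_d <- tapply(d$num / d$den, d$grp, sum)

# tapply() returns a named 1-d array; as.vector() gives a plain numeric
# vector of the sums, with the group labels dropped.
sums_vec <- as.vector(sums_m)

print(sums_m)
```

No per-group objects are created here: the ratio is computed once for all
rows, and tapply() splits it by group before summing.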


Joerg Maeder    .:|:||:..:.||.::   maeder at atmos.umnw.ethz.ch
Tel: +41 1 633 36 25   .:|:||:..:.||.::   
http://www.iac.ethz.ch/staff/maeder
PhD student at INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE (IACETH)
ETH ZÜRICH Switzerland