Looking for a sort of tapply() to data frames

Hello again,
On 12/14/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
OK, slowly :-) I don't understand it.

- why df[,-1] and not df? Don't we lose the df$Day entries?

(by the way, why does typeof(df) show "list"? I thought that
read.table() returns a data frame?)
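
A minimal sketch of the typeof() question, assuming a made-up toy data frame (the construction below is hypothetical, not the real data). As far as I can tell, typeof() reports the internal storage type, while class() is what says "data frame":

```r
# Hypothetical toy data frame, built by hand for illustration
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = 1:5,
                 val2 = 3 * (1:5))

typeof(df)         # "list": the internal storage type
class(df)          # "data.frame": the class attribute
is.data.frame(df)  # TRUE
is.list(df)        # TRUE, a data frame is a list of equal-length columns
```

So both answers can hold at once: the object is a data frame that is stored as a list.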
Hmmmmm, I tried it and it did not work. That is, it works - but not as
intended :-).

Fake example:
  Day val1 val2
1 Tue    1    3
2 Tue    2    6
3 Tue    3    9
4 Wed    4   12
5 Wed    5   15
df$Day: Tue
val1 val2
   2    6
------------------------------------------------------------
df$Day: Wed
val1 val2
 4.5 13.5
NULL
NULL

In real data, instead of "days", I have around 6000 items, so I need
them to be in one column called "Days" (or whatever).  OK. So correct
me if I understand wrongly what is happening here:

by() divides df in data frame subsets and applies a function
(colMeans) to each of them.  The result of colMeans ... manual says
that colMeans returns the following:

     A numeric or complex array of suitable size, or a vector if the
     result is one-dimensional.  The 'dimnames' (or 'names' for a
     vector result) are taken from the original array.

...which doesn't tell me much.  typeof(colMeans(...)) tells me
"double" but I think it lies. OK, let's assume it is a vector (should
be, I assume the result is one-dimensional, as I can hardly imagine a
multidimensional result).
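
A quick way to check that assumption, sketched on a hypothetical toy data frame shaped like the fake example (the names `df` and `m` are only for illustration). typeof() gives the storage type of the elements, so "double" would be consistent with a plain named numeric vector:

```r
# Hypothetical toy data, same shape as the fake example
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = 1:5,
                 val2 = 3 * (1:5))

m <- colMeans(df[, -1])  # column means over all rows, Day column dropped

typeof(m)        # "double": storage type of the elements
is.vector(m)     # TRUE
is.null(dim(m))  # TRUE, so no dimensions, just a vector
names(m)         # "val1" "val2", taken from the column names
```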

So in the end I have a list with as many columns as I have days, and
in each column I have a vector with N named dimensions, where N is the
number of variables in the original data frame bar one.  But what I
would like to have is a data frame with exactly the same column names,
and rows being just a summary.  And no clue how to convert one into the
other :-)
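
One possible way to get there, as a sketch only and assuming toy data like the fake example (`df`, `res`, `mat`, `out` and `out2` are hypothetical names): bind the per-group vectors from by() together as rows, or let aggregate() build the data frame directly.

```r
# Hypothetical toy data, same shape as the fake example
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = 1:5,
                 val2 = 3 * (1:5))

# Option 1: stack the by() results row by row, then restore the Day column
res <- by(df[, -1], df$Day, colMeans)
mat <- do.call(rbind, res)  # one row per day, one column per variable
out <- data.frame(Day = rownames(mat), mat, row.names = NULL)

# Option 2: aggregate() returns a data frame in one step
out2 <- aggregate(df[, -1], by = list(Day = df$Day), FUN = mean)

out
out2
```

Both should give one row per day with the columns Day, val1 and val2. With around 6000 items, aggregate() might be the less fiddly of the two, though I have not timed either.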
Huh? Why is it df[,1] now? I think I'm completely lost.
Probably, yes.  As soon as I figure out how to use it, that is :-) (an
hour later: OK, I got it! yuppie!)  However what I really needed was
something like this:

ddf <- by(df[,-1], df$Day, function(z) { return(cor(z$val1,z$val2)) ; } )

(but I still don't know how to convert it to a friendly data frame...)
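
A sketch of one possible conversion, assuming toy data like the fake example (the data and the name `out` are hypothetical). When the function returns a single number per group, by() seems to give back a one-dimensional array, so its names and values can be pulled apart:

```r
# Hypothetical toy data, same shape as the fake example
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = 1:5,
                 val2 = 3 * (1:5))

ddf <- by(df[, -1], df$Day, function(z) cor(z$val1, z$val2))

# names() gives the group labels, as.numeric() drops the "by" attributes
out <- data.frame(Day = names(ddf), cor = as.numeric(ddf))
out
```

In this toy data val2 is an exact multiple of val1, so every correlation comes out as 1; real data would of course differ.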

Thanks for the answers!

January

--
------------ January Weiner 3  ---------------------+---------------
Division of Bioinformatics, University of Muenster  |  Schloßplatz 4
(+49)(251)8321634                                   |  D48149 Münster
http://www.uni-muenster.de/Biologie.Botanik/ebb/    |  Germany