group bunch of lines in a data.frame, an additional requirement
Thanks Gabor, that is much faster than using a loop!

I've got a last question: can you think of a fast way of keeping track of the number of observations collapsed for each entry? I.e., I'd like to end up with:

A 2.0 400 ID1 3 (3 obs in the first matrix)
B 0.7  35 ID2 2 (2 obs in the first matrix)
C 5.0  70 ID1 1 (1 obs in the first matrix)

Or is it required to use a temporary matrix that is merged later (as exemplified by Mark in a previous email)?

Thanks a lot for your help,

Emmanuel
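One possible way to get the per-group counts asked about above is a second call to aggregate() with length, merged back onto the means. This is only a sketch: it assumes the data frame is called DF with columns V1 to V4 as in the quoted reply below, and the names means, counts, res and the count column n are made up for illustration.

```r
# Example data reconstructed from the thread. The names DF and V1..V4 are
# assumptions taken from the quoted reply; they are not confirmed by the poster.
DF <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C"),
  V2 = c(1.0, 3.0, 2.0, 0.5, 0.9, 5.0),
  V3 = c(200, 800, 200, 20, 50, 70),
  V4 = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID1"),
  stringsAsFactors = FALSE
)

# Group means, as in the quoted reply.
means <- aggregate(DF[2:3], DF[c(1, 4)], mean)

# Number of rows collapsed into each group (hypothetical column name "n").
counts <- aggregate(list(n = DF$V2), DF[c(1, 4)], length)

# Join the two summaries on the grouping columns.
res <- merge(means, counts, by = c("V1", "V4"))
print(res)
```

This still builds a temporary object, but both aggregate() calls are vectorised over the groups, so no explicit loop is needed.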
On 9/13/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
See below.

On 9/13/06, Emmanuel Levy <emmanuel.levy at gmail.com> wrote:
Thanks for pointing me to "aggregate", that works fine!

There is one complication though: I have mixed types (numerical and character), so the matrix is of the form:

A 1.0 200 ID1
A 3.0 800 ID1
A 2.0 200 ID1
B 0.5  20 ID2
B 0.9  50 ID2
C 5.0  70 ID1

One letter always has the same ID, but one ID can be shared by many letters (like ID1). I just want to keep track of the ID and get a matrix like:

A 2.0 400 ID1
B 0.7  35 ID2
C 5.0  70 ID1

Any idea on how to do that without a loop?
If V4 is a function of V1 then you can aggregate by it too and it will appear but have no effect on the classification:
aggregate(DF[2:3], DF[c(1,4)], mean)
  V1  V4  V2  V3
1  A ID1 2.0 400
2  C ID1 5.0  70
3  B ID2 0.7  35
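For anyone wanting to reproduce the call above, here is a minimal self-contained sketch. The thread never shows how DF was created, so the construction below (and the column names V1 to V4, which are what read.table() would assign by default) is an assumption.

```r
# Example data reconstructed from the thread; how the original DF was
# built is not shown, so this construction is an assumption.
DF <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C"),
  V2 = c(1.0, 3.0, 2.0, 0.5, 0.9, 5.0),
  V3 = c(200, 800, 200, 20, 50, 70),
  V4 = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID1"),
  stringsAsFactors = FALSE
)

# Average the numeric columns (V2, V3) within each (V1, V4) combination.
# Because V4 is determined by V1, adding it to the grouping does not
# create extra groups; it is simply carried along into the result.
res <- aggregate(DF[2:3], DF[c(1, 4)], mean)
print(res)
```

If some letter had more than one ID, grouping by both columns would split it into several rows, so this shortcut only applies when the ID really is a function of the letter.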
Many thanks,
Emmanuel
On 9/12/06, Emmanuel Levy <emmanuel.levy at gmail.com> wrote:
Hello,

I'd like to group the lines of a matrix so that:

A 1.0 200
A 3.0 800
A 2.0 200
B 0.5  20
B 0.9  50
C 5.0  70

would give:

A 2.0 400
B 0.7  35
C 5.0  70

So all lines corresponding to a letter (level) become a single line in which the values of each column are averaged.

I've done that with a loop, but it doesn't seem right (it is very slow). I imagine there is a sort of "apply" shortcut, but I can't figure it out.

Please note that it is not exactly a matrix I'm using: the function "typeof" tells me it's a list, but I access it as if it were a matrix.

Could someone help me with the right function to use, a help topic, or a piece of code?

Thanks,

Emmanuel
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.