Group by in R
Assuming DF is your data frame, try this: ftable(DF)
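As a rough sketch of what that gives, here is the sample data from the quoted message below typed in by hand (the object names DF, ft and counts are only for illustration). By default ftable() uses the last column as the column variable and all the others as row variables, so triplets that never occur still show up with zero counts:

```r
# Sample data copied from the question quoted below
DF <- data.frame(
  X1 = c(1, 1, 1, 2, 1, 2, 1, 2, 1, 1),
  X2 = c(2, 1, 1, 2, 1, 2, 1, 2, 2, 1),
  X3 = c(2, 2, 2, 1, 2, 1, 2, 1, 1, 2),
  X4 = c(1, 2, 2, 2, 2, 2, 1, 2, 1, 2)
)

# Flat contingency table: rows are the X1:X3 triplets, columns the levels of X4
ft <- ftable(DF)
print(ft)

# Drop the ftable attributes to get a plain 8 x 2 matrix of counts;
# rows run 1 1 1, 1 1 2, 1 2 1, ..., 2 2 2
counts <- matrix(as.vector(ft), nrow = nrow(ft), ncol = ncol(ft))
print(counts)
```

If only some of the columns should go into the rows, ftable() also accepts row.vars and col.vars arguments.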
In SQL you can get close with:
sqldf("select X1, X2, X3, sum(X4 == 1) `X4=1`, sum(X4 == 2) `X4=2`
from DF group by X1, X2, X3 order by X1, X2, X3")
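For comparison, roughly the same conditional counts can be had in base R with aggregate(). This is only a sketch on the sample data from the quoted message; the column names n1 and n2 and the object name res are made up here. Like the SQL, it only returns triplets that actually occur in the data, which is why it gets close rather than all the way:

```r
# Sample data copied from the question quoted below
DF <- data.frame(
  X1 = c(1, 1, 1, 2, 1, 2, 1, 2, 1, 1),
  X2 = c(2, 1, 1, 2, 1, 2, 1, 2, 2, 1),
  X3 = c(2, 2, 2, 1, 2, 1, 2, 1, 1, 2),
  X4 = c(1, 2, 2, 2, 2, 2, 1, 2, 1, 2)
)

# Sum the indicators X4 == 1 and X4 == 2 within each observed (X1, X2, X3) triplet
res <- aggregate(
  data.frame(n1 = DF$X4 == 1, n2 = DF$X4 == 2),
  by = list(X1 = DF$X1, X2 = DF$X2, X3 = DF$X3),
  FUN = sum
)
print(res)
```

Only the four triplets present in DF come back, so the all-zero rows of the desired table are missing.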
On Mon, Apr 13, 2009 at 9:56 AM, Nick Angelou <nikolay12 at yahoo.com> wrote:
Gabor Grothendieck wrote:
SQL has the order by clause.
Gabor, thanks for the suggestion. I thought about this but ORDER BY cannot create the tabular structure that I need. Here is more detail about my setting: f1, f2, f3 have unique triplets (each repeating a different number of times). Each of these triplets falls into one of the two categories of f4. Here is a sample:
data
   X1 X2 X3 X4
1   1  2  2  1
2   1  1  2  2
3   1  1  2  2
4   2  2  1  2
5   1  1  2  2
6   2  2  1  2
7   1  1  2  1
8   2  2  1  2
9   1  2  1  1
10  1  1  2  2
sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4
ORDER BY X4, X1, X2, X3")
  X1 X2 X3 X4 CNT
1  1  1  2  1   1
2  1  2  1  1   1
3  1  2  2  1   1
4  1  1  2  2   4
5  2  2  1  2   3
The counts are fine, though it's not exactly what I need. I need a kind of
contingency table:
                         | levels of X4 |
                         ----------------
unique triplets of X1:X3 |  1   |   2   |
-----------------------------------------
          1 1 1          |  0       0
          1 1 2          |  1       4
          1 2 1          |  1       0
          1 2 2          |  1       0
          2 1 1          |  0       0
          2 1 2          |  0       0
          2 2 1          |  0       3
          2 2 2          |  0       0
So the final result should be a table structure like:
0 0
1 4
1 0
1 0
0 0
0 0
0 3
0 0
I guess I could probably do this in SQL with a combination of OUTER JOINs,
but I thought that R might have a more elegant solution based on "factor"
and "table".
Thanks,
Nick