aggregating data
oops last reply was only half the solution:
library(plyr)
df <- data.frame(gene  = c('A', 'A', 'E', 'A', 'F', 'F'),
                 probe = c(1, 2, 3, 4, 5, 6),
                 exp   = c(0.34, 0.21, 0.11, 0.21, 0.56, 0.81))
# per gene: number of probes (V1) and median expression (V2)
ddply(df, .(gene), function(d) c(length(d$gene), median(d$exp)))
  gene V1    V2
1    A  3 0.210
2    E  1 0.110
3    F  2 0.685
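Note that the ddply call above reports the median and leaves the columns named V1 and V2, whereas the question asked for the mean. In case it helps, here is a minimal sketch in base R (no plyr or reshape needed) that gives the count and the mean with named columns, plus the probe IDs for each gene collapsed into one string. It assumes the same toy data frame as above; the names res, n_probes, mean_exp and probes are just illustrative choices, and I have not tried it on the full 180,000 rows.

```r
# Same toy data as above (assumption: the real data has the same three columns)
df <- data.frame(gene  = c("A", "A", "E", "A", "F", "F"),
                 probe = c(1, 2, 3, 4, 5, 6),
                 exp   = c(0.34, 0.21, 0.11, 0.21, 0.56, 0.81))

# Number of probes per gene
n_probes <- aggregate(probe ~ gene, data = df, FUN = length)

# Mean expression per gene
mean_exp <- aggregate(exp ~ gene, data = df, FUN = mean)

# Combine both summaries into one table, one row per gene
res <- merge(n_probes, mean_exp, by = "gene")
names(res) <- c("gene", "n_probes", "mean_exp")

# Probe IDs per gene, collapsed into a comma-separated string
probes <- aggregate(probe ~ gene, data = df, FUN = paste, collapse = ",")

print(res)
print(probes)
```

Swapping mean for median in the second aggregate call should reproduce the V2 column from the ddply output.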
best
iain
--- On Thu, 30/6/11, Max Mariasegaram <max.mariasegaram at qut.edu.au> wrote:
From: Max Mariasegaram <max.mariasegaram at qut.edu.au>
Subject: [R] aggregating data
To: "r-help at r-project.org" <r-help at r-project.org>
Date: Thursday, 30 June, 2011, 8:28

Hi,

I am interested in using the cast function in R to perform some aggregation. I did once manage to get it working, but have now forgotten how I did this. So here is my dilemma. I have several thousand probes (about 180,000), each corresponding to a gene; what I'd like to obtain is a frequency count of the probes for each gene. The data would look something like this:

Gene    ProbeID    Expression_Level
A       1          0.34
A       2          0.21
E       3          0.11
A       4          0.21
F       5          0.56
F       6          0.87
. . . (180000 data points)

In each case, the probeID is unique. The output I am looking for is something like this:

Gene    No.ofprobes    Mean_expression
A       3              0.25

Is there an easy way to do this using "cast" or "melt"? Ideally, I would also like to see the unique probes corresponding to each gene in the wide format.

Thanks in advance
Max

Maxy Mariasegaram | Research Fellow | Australian Prostate Cancer Research Centre | Level 1, Building 33 | Princess Alexandra Hospital | 199 Ipswich Road, Brisbane QLD 4102 Australia | t: 07 3176 3073 | f: 07 3176 7440 | e: mariaseg at qut.edu.au