Does SQL GROUP BY have a heavy-duty equivalent in R?
I converted the whole data frame to character using as.matrix,
and then followed a posting that explained how to get the names
back (which had been lost when converting to matrix).
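A conversion that avoids losing the names in the first place can be sketched as below. It converts column by column instead of going through as.matrix. The `toy` data frame and its values are invented for illustration and are not the poster's data.

```r
# Hypothetical stand-in for BigDF; values are invented for illustration.
toy <- data.frame(SAMPLE_ID = factor(c("s1", "s2")),
                  ASSAY_ID  = c(10L, 20L),
                  stringsAsFactors = FALSE)

# Assigning into toy[] keeps the data frame structure and column names
# while replacing every column with its character version.
toy[] <- lapply(toy, as.character)

str(toy)
```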
melt insisted on including anything I did not list with the ids
among the measured variables. In other words, it would not let me
drop columns, despite this call:
melted<-melt(BigDF, id=c("SAMPLE_ID","ASSAY_ID"),
measured=c("GENOTYPE_ID","DESCRIPTION"))
unique(melted$variable)
[1] CUSTOMER PROJECT PLATE EXPERIMENT CHIP
WELL_POSITION GENOTYPE_ID DESCRIPTION ENTRY_OPERATOR
[10] INTERACT PLATEc
Levels: CUSTOMER PROJECT PLATE EXPERIMENT CHIP WELL_POSITION GENOTYPE_ID
DESCRIPTION ENTRY_OPERATOR INTERACT PLATEc
I should have got only GENOTYPE_ID and DESCRIPTION.
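One possible explanation, stated as an assumption rather than a confirmed diagnosis: if melt's formal argument is named measure.var and the function also accepts `...`, then `measured` is not a prefix of that name, so R would pass it silently into `...` and treat every non-id column as measured. The base-R sketch below shows that matching behaviour with a hypothetical function, `toy_melt_args`, which is not part of reshape and only mimics the assumed argument names.

```r
# Hypothetical function that only mimics the assumed argument names of melt;
# it reports which columns would be treated as measured.
toy_melt_args <- function(data, id.var, measure.var, ...) {
  if (missing(measure.var)) {
    # Fallback assumed here: everything that is not an id is measured.
    measure.var <- setdiff(names(data), id.var)
  }
  measure.var
}

toy <- data.frame(SAMPLE_ID = "s1", ASSAY_ID = "a1", PLATE = "p1",
                  GENOTYPE_ID = "g1", DESCRIPTION = "d1")

# "id" is a prefix of "id.var", so it matches; "measured" is not a prefix
# of "measure.var", so it is swallowed by ... and ignored.
wrong <- toy_melt_args(toy, id = c("SAMPLE_ID", "ASSAY_ID"),
                       measured = c("GENOTYPE_ID", "DESCRIPTION"))

# Spelling the name as a true prefix restricts the measured columns.
right <- toy_melt_args(toy, id = c("SAMPLE_ID", "ASSAY_ID"),
                       measure = c("GENOTYPE_ID", "DESCRIPTION"))

print(wrong)
print(right)
```

If the assumption holds, spelling the argument out in full in the real call would be the thing to try.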
"hadley wickham" <h.wickham at gmail.com> wrote in message
news:f8e6ff050612310758p11f96c0dl256ac5b15d11dc2c at mail.gmail.com...
nr.attempts <- aggregate(RawSeq$GENOTYPE_ID,
list(sample=RawSeq$SAMPLE_ID, assay=RawSeq$ASSAY_ID), length)
This was simply to figure out how many times the same piece of information had been obtained. I ran out of patience. It took beyond forever and tapply did not perform much better. The reshape package did not help - it implied one was out of luck if the data was not numeric. All of my data is character or factor.
The reshape package will work if all your data is numeric, or all of it is character - it doesn't work with a mix. I will try and make this more clear in the documentation. However, depending on the size and structure of your data it may not be any faster than tapply or aggregate.

Hadley
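For the counting task in the quoted aggregate call, one base-R alternative worth trying is to tabulate the two keys directly. Whether it is fast enough depends on the data. This is a minimal sketch on a small invented stand-in for RawSeq; only the column names come from the thread.

```r
# Invented stand-in for RawSeq; only the column names come from the thread.
RawSeq <- data.frame(
  SAMPLE_ID   = c("s1", "s1", "s2", "s2", "s2"),
  ASSAY_ID    = c("a1", "a1", "a1", "a2", "a2"),
  GENOTYPE_ID = c("AA", "AG", "GG", "AA", "AA"),
  stringsAsFactors = FALSE
)

# Count rows per (sample, assay) pair, analogous to
# SELECT SAMPLE_ID, ASSAY_ID, COUNT(*) ... GROUP BY SAMPLE_ID, ASSAY_ID.
counts <- as.data.frame(table(sample = RawSeq$SAMPLE_ID,
                              assay  = RawSeq$ASSAY_ID),
                        responseName = "attempts")

# table() also reports combinations that never occur; drop those.
nr.attempts <- counts[counts$attempts > 0, ]
print(nr.attempts)
```

Note that table() builds the full sample-by-assay cross-table, so with very many distinct values on both keys the intermediate result can get large.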
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.