r-genetics and galaxy

2 messages · Ross, Thomas Lumley

Hi Brad

Work on the rgenetics tools in R has more or less come to a halt. I'm
not aware of anyone working on them, and I don't think many people are
using them, although we tried hard! I think David Clayton's snpMatrix
package can handle large datasets, but it's something of a struggle
to fit billions of genotypes efficiently into the R memory model.

For me personally, a bigger problem is enabling the Faculty and
biologists I support to work reproducibly without having to worry
about the technical problems of hundreds of GB of data. Galaxy really
helps with those goals so I've shifted my effort to that framework.
The Galaxy tools do use R and BioC under the hood where they're the
best solution, but for SNP QC, PLINK seems more appropriate because
it has the specialized reporting we needed.

On Sat, Sep 11, 2010 at 6:00 AM, <r-sig-genetics-request at r-project.org> wrote:

            
1 day later
We have some tools that I think have been put into an R package but not yet submitted to CRAN. They implement the methods in http://onlinelibrary.wiley.com/resolve/doi/pdf?DOI=10.1002/gepi.20516
The code has been used for data sets of up to 16,000 people x 1M SNPs.

I will check the current status of the code -- the plan is to refactor it and combine it with our analysis code into a Bioconductor package.

     -thomas
On Sat, 11 Sep 2010, Ross wrote:

            
Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle