R-sig-genetics Digest, Vol 8, Issue 2

Hi, Fernando,

I think there are some QC tools in David Clayton's snpMatrix package,
but there's no single R package to do all the reports you need AFAIK.
For comprehensive reporting, if you don't mind not using R, one option
is to try the SNP/WGA tools in Galaxy - they do use R for graphics but
you don't need to install anything as it all works through an ordinary
web browser.

Essentially, if you have your genotype and pedigree data in Plink
style linkage format (separate map and ped files), the steps are
something like this:

1. Make yourself a new user account at the main Galaxy server
(http://usegalaxy.org) so your histories are preserved between logins.

2. From the analysis window, left (tool) pane, click the Get Data tool
group header to expand the group, then click the 'upload file' tool.
A form will appear in the center pane of your browser.

3. Change the file format (first field on the form) from Auto to
"lped" format, as autodetect won't work for these multi-part datatypes.

4. Make the 'ped' and 'map' file upload fields point to the right map
and ped files on your local machine, set the 'build' to hg18 and
change the name to something informative about your data, then
click execute.

5. After the data are uploaded (should only take a minute or two for a
small file) to your history, you can select the SNP/WGA QC LD Plots
tool submenu in the tool pane and then click the QC tool. Another form
will open in the center panel. Your new dataset should be the only one
available in the drop-down list of files to process. Change the QC job
name to something meaningful, then click 'execute'. For a small dataset, the
whole process should run for a few minutes but you can safely log out
and log back in later - your work will all be preserved.

6. The QC tool output (in the right side history pane) has an 'eye'
icon which you can click to open up the report in the center panel -
you should see HWE/missingness/Mendel and all sorts of other useful
plots and there are some tabular files containing summary details by
marker and by sample.
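
Before uploading, it can save a round trip to check that the ped and map
files agree with each other. The following is a minimal Python sketch, not
part of Galaxy or snpMatrix, and the function names are made up for
illustration. It assumes the standard Plink layout: one marker per line in
the map file (3 or 4 whitespace-separated columns), and in the ped file six
leading columns (family, individual, father, mother, sex, phenotype)
followed by two allele columns per marker.

```python
def count_map_markers(map_lines):
    """Count markers in a Plink .map file, given its lines.

    Assumes one marker per non-blank line with 3 or 4 columns
    (chromosome, marker id, optional genetic distance, bp position).
    """
    n_markers = 0
    for lineno, line in enumerate(map_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) not in (3, 4):
            raise ValueError(
                f"map line {lineno}: expected 3 or 4 columns, got {len(fields)}"
            )
        n_markers += 1
    return n_markers


def find_bad_ped_rows(ped_lines, n_markers):
    """Return (line_number, n_fields) for ped rows of unexpected width.

    Each row should have 6 leading columns plus 2 alleles per marker.
    """
    expected = 6 + 2 * n_markers
    bad = []
    for lineno, line in enumerate(ped_lines, start=1):
        fields = line.split()
        if fields and len(fields) != expected:
            bad.append((lineno, len(fields)))
    return bad


# Tiny made-up example: two markers, third subject is missing one allele.
map_lines = ["1 rs0001 0 1000", "1 rs0002 0 2000"]
ped_lines = [
    "FAM1 1 0 0 1 1 A A G G",
    "FAM1 2 0 0 2 1 A C G G",
    "FAM1 3 1 2 1 2 A C G",
]
n = count_map_markers(map_lines)
print(n, find_bad_ped_rows(ped_lines, n))  # prints: 2 [(3, 9)]
```

For real files, pass open file handles instead of the lists, e.g.
count_map_markers(open("mydata.map")), where "mydata.map" is a placeholder
for your own file name.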

I'm happy to answer any questions you might have - I hope this helps
get you started.

There's a 'clean' tool you can use to remove markers and subjects that
fall below specific thresholds for QC measures and there's a TDT tool
you can use for analysis of family data.
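
To give a feel for the kind of threshold filtering involved, here is a
rough Python sketch of one QC measure, the missing genotype rate per
subject and per marker. It is an illustration only and not the Galaxy
tool's actual method or parameters: the function name and the 0.5 cutoff
are invented for the example, and it assumes the ped rows all have the same
width and use "0" as the missing allele code (the Plink default).

```python
def missing_rates(ped_lines, missing_code="0"):
    """Return (per_subject, per_marker) missing genotype fractions.

    per_subject maps (family_id, individual_id) to a fraction;
    per_marker is a list in map-file order. A genotype counts as
    missing if either of its two alleles equals missing_code.
    """
    per_subject = {}
    marker_missing = None
    n_subjects = 0
    for lineno, line in enumerate(ped_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        alleles = fields[6:]  # skip the six leading pedigree columns
        calls = list(zip(alleles[0::2], alleles[1::2]))
        if marker_missing is None:
            marker_missing = [0] * len(calls)
        elif len(calls) != len(marker_missing):
            raise ValueError(f"ped line {lineno}: inconsistent marker count")
        flags = [a == missing_code or b == missing_code for a, b in calls]
        for i, is_missing in enumerate(flags):
            marker_missing[i] += is_missing
        per_subject[(fields[0], fields[1])] = (
            sum(flags) / len(calls) if calls else 0.0
        )
        n_subjects += 1
    per_marker = [count / n_subjects for count in (marker_missing or [])]
    return per_subject, per_marker


# Made-up data: three subjects, three markers.
ped_lines = [
    "FAM1 1 0 0 1 1 A A G G 0 0",
    "FAM1 2 0 0 2 1 A C 0 0 0 0",
    "FAM1 3 1 2 1 2 A C G G C C",
]
per_subject, per_marker = missing_rates(ped_lines)

MAX_MISSING = 0.5  # hypothetical cutoff, chosen only for this example
keep_markers = [i for i, r in enumerate(per_marker) if r <= MAX_MISSING]
keep_subjects = [s for s, r in per_subject.items() if r <= MAX_MISSING]
print(keep_markers)   # prints: [0, 1]
print(keep_subjects)  # prints: [('FAM1', '1'), ('FAM1', '3')]
```

In practice you would pick cutoffs appropriate to your study rather than
the value used here.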
On Fri, Sep 10, 2010 at 6:00 AM, <r-sig-genetics-request at r-project.org> wrote: