FDR analyses: minimum number of features
Have you considered Monte Carlo? From previous work, you could estimate a distribution for the differences to be detected and use that as input to a Monte Carlo, computing thereby a distribution for FDR as a function of distribution of differences and the number of features. From this, you could estimate probabilities for obtaining results that were bogus vs. marginal, barely useful vs. highly accurate and plot them vs. alternative budgets, etc. I hope this comment makes more sense than my earlier nonsense. spencer graves
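A minimal sketch of the Monte Carlo idea above, in Python with only the standard library. Everything specific here is an assumption made for illustration, not something from the thread: the fraction of non-null features (`pi1`), the normal distribution of effect sizes (`effect_mean`, `effect_sd`), the use of two-sided z-tests, and the use of the Benjamini-Hochberg step-up rule in place of the Storey-Tibshirani q-value procedure. It simulates many studies with `m` features and summarises how the realised false discovery proportion varies from study to study.

```python
import math
import random
import statistics


def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up rule: return indices of rejected hypotheses."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank
    return order[:k]


def simulate_fdp(m, pi1=0.1, effect_mean=3.0, effect_sd=0.5, q=0.05, rng=random):
    """Simulate one study with m features; return its false discovery proportion.

    pi1, effect_mean and effect_sd are hypothetical inputs; in practice they
    would be estimated from previous work.
    """
    n_alt = round(pi1 * m)
    is_null = [i >= n_alt for i in range(m)]
    pvals = []
    for null in is_null:
        shift = 0.0 if null else rng.gauss(effect_mean, effect_sd)
        z = rng.gauss(shift, 1.0)
        # Two-sided p-value for a standard normal test statistic.
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))
    rejected = bh_reject(pvals, q)
    if not rejected:
        return 0.0  # convention: no rejections means no false discoveries
    return sum(is_null[i] for i in rejected) / len(rejected)


def fdp_summary(m, n_sim=200, seed=1, **kwargs):
    """Mean and standard deviation of the false discovery proportion over n_sim studies."""
    rng = random.Random(seed)
    fdps = [simulate_fdp(m, rng=rng, **kwargs) for _ in range(n_sim)]
    return statistics.mean(fdps), statistics.pstdev(fdps)


if __name__ == "__main__":
    for m in (50, 200, 1000, 2000):
        mean_fdp, sd_fdp = fdp_summary(m)
        print(f"m={m:5d}  mean FDP={mean_fdp:.3f}  sd FDP={sd_fdp:.3f}")
```

Running `fdp_summary` over a grid of `m` values (and of alternative effect-size assumptions) gives the kind of table or plot that could be set against alternative budgets: the spread of the false discovery proportion, not just its mean, is what indicates how reliable a single study's FDR is likely to be.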
Dupont, William wrote:
I agree. What is unclear to me is the optimal way of justifying sample size and SNP selection in grant applications that use the FDR approach.

-----Original Message-----
From: Kjetil Brinchmann Halvorsen [mailto:kjetil at acelerate.com]
Sent: Wednesday, September 21, 2005 9:45 PM
To: Spencer Graves
Cc: Dupont, William; r-help at stat.math.ethz.ch
Subject: Re: [R] FDR analyses: minimum number of features

Spencer Graves wrote:
Two thoughts on this:
1. Your FDR (Not Franklin Delano Roosevelt) sounds like another name for Type I error rate.
It is certainly not the same as the type I error rate. The type I error rate is the proportion of true nulls which are rejected, while the FDR is the proportion of rejected null hypotheses which really are true nulls! To me FDR seems like a more promising avenue to multiple testing than the old "familywise error rate". Who knows what is a family?
Kjetil
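To make the distinction concrete, here is a small Python sketch that computes both quantities from the same table of test outcomes. The counts are hypothetical, chosen only for illustration, and the function name `error_rates` is not from any library. Strictly speaking, the FDR is the expected value of the second proportion over repeated studies; a single table gives one realised value.

```python
def error_rates(true_nulls_rejected, true_nulls_kept, alts_rejected, alts_kept):
    """Return (type I error rate, false discovery proportion) for one table of outcomes."""
    n_true_nulls = true_nulls_rejected + true_nulls_kept
    n_rejected = true_nulls_rejected + alts_rejected
    # Type I error rate: rejected true nulls as a share of ALL true nulls.
    type1 = true_nulls_rejected / n_true_nulls if n_true_nulls else 0.0
    # False discovery proportion: rejected true nulls as a share of ALL rejections.
    fdp = true_nulls_rejected / n_rejected if n_rejected else 0.0
    return type1, fdp


# Hypothetical study: 1000 features, 900 true nulls, 100 real effects.
type1, fdp = error_rates(
    true_nulls_rejected=45, true_nulls_kept=855, alts_rejected=80, alts_kept=20
)
print(f"type I error rate = {type1:.2f}, false discovery proportion = {fdp:.2f}")
```

With these made-up counts the same 45 false rejections look small against the 900 true nulls but make up more than a third of the 125 rejections, which is why the two rates answer different questions.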
The definition of "reasonably reliable FDRs" would seem to relate to the status of the literature on this issue among researchers in genotyping. As more reports of FDRs in genotyping are published, I would expect that methodology for estimation and the standard for accuracy would similarly evolve.
2. Have you tried the Bioconductor (www.bioconductor.org/) listserv? They might be able to say something more useful than a general list like this.
spencer graves

Dupont, William wrote:
Dear List,
We are planning a genotyping study to be analyzed using false discovery rates (FDRs) (see Storey and Tibshirani, PNAS 2003; 100:9440-5). I am interested in learning if there is any consensus as to how many features (i.e., how many P values) need to be studied before reasonably reliable FDRs can be derived. Does anyone know of a citation where this is discussed?
Bill Dupont

William D. Dupont
phone: 615-343-4100
URL http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/WilliamDupont
Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915