Hi all,

I'm wondering if someone could put me on the right path to using the "qvalue" package correctly.

I have an original p value from an analysis, and I've done 1,000 randomisations of the data set. So I now have an original p value and 1,000 random p values. I want to work out the false discovery rate (FDR) (Q; as described by Storey and Tibshirani in 2003) for my original p value, defined as the number of expected false positives over the number of significant results for my original p value.

So, for my original p value, I want one Q value that has been calculated as described above based on the 1,000 random p values.

I wrote this code:

pvals <- c(list_of_p_values_obtained_from_randomisations)
qobj <- qvalue(p = pvals)
r_output1 <- qobj$pvalue
r_output2 <- qobj$qvalue

r_output1 is the list of 1,000 p values that I put in, and r_output2 is a q value for each of those p values (i.e. there are 1,000 q values).

The problem is I don't want there to be 1,000 Q values (i.e. one for each random p value). The Q value should be the false discovery rate (FDR) (Q), defined as the number of expected false positives over the number of significant results. So I want one Q value for my original p value, and to calculate that one Q value using the 1,000 random p values I have generated.

Could someone please tell me where I'm going wrong?

Thanks
Tom
Qvalue package: I am getting back 1,000 q values when I only want 1 q value.
4 messages · Jim Lemon, Thomas Ryan, Jay Tanzman
Hi Tom,
From a quick scan of the docs, I think you are looking for qobj$pi0.
The vector qobj$qvalue seems to be the local false discovery rate for each of your randomizations. Note that the manual implies that the p values are those of multiple comparisons within a data set, not randomizations of the data, so I'm not sure that your usage is valid for the function.

Jim
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim,
Thanks for the reply. Yes I'm just playing around with the data at the
minute, but regardless of where the p values actually come from, I can't
seem to get a Q value that makes sense.
For example, in one case, I have an actual P value of 0.05. I have a list
of 1,000 randomised p values: range of these randomised p values is 0.002
to 0.795, average of the randomised p values is 0.399 and the median of the
randomised p values is 0.45.
So I thought it would be reasonable to expect the FDR Q value (i.e. the
number of expected false positives over the number of significant results) to
be above 0.05, given that 869 of the randomised p values are > 0.05?
When I run the code:
library(qvalue)
list1 <- scan("ListOfPValues")
qobj <- qvalue(p = list1)
qobj$pi0
The answer is 0.0062. That's why I thought qobj$pi0 isn't the right
variable to be looking at? So my problem (or my misunderstanding) is that
I have an actual p value of 0.05, but then a Q value that is lower, 0.006?
Thanks again for your help,
Tom
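The counts quoted in this message can be tallied directly. The sketch below uses only the numbers stated above (1,000 randomisations, 869 of them with p > 0.05) and is not a q-value calculation: it shows the share of randomised p values at or below the observed 0.05, plus the common permutation-test estimate (b + 1) / (m + 1), which is a different quantity brought in here purely for comparison. The variable names are illustrative.

```r
# Counts taken from the message above. The empirical p value computed here
# is NOT a q-value; it is a separate quantity shown only for comparison.
n_rand  <- 1000                      # number of randomisations
n_above <- 869                       # randomised p values greater than 0.05
n_at_or_below <- n_rand - n_above    # randomised p values at or below 0.05

# Share of randomised p values at or below the observed 0.05
prop_at_or_below <- n_at_or_below / n_rand

# Common permutation-test estimate, counting the observed value itself
empirical_p <- (n_at_or_below + 1) / (n_rand + 1)

print(c(prop = prop_at_or_below, empirical_p = empirical_p))
```

Under these assumptions, roughly 13% of the randomised p values are at least as small as the observed one, which is the kind of figure the randomisations can support on their own.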
3 days later
What you're doing makes no sense. Given p-values p_i, i = 1...n, resulting from hypothesis tests t_i, i = 1...n, the q-value of p_i is the expected proportion of false positives among the tests called significant if the significance level of each test is α = p_i. Thus a q-value is only defined for an observed p-value.

Assuming that you have stored n observed p-values in an R vector P, and the ith p-value P[i] == .05, then the R syntax to obtain the q-value for P[i] is

qvalue(P)$qvalues[i]

If, instead (as I suspect), that .05 is not among your observed p-values, but you want to know what the FDR would be, given your sequence of p-values, if the significance level of every test were .05, then the R syntax would be

max(qvalue(P)$qvalues[P <= .05])
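The two cases described in the last reply can be tried without the qvalue package. The sketch below is an assumption-laden stand-in: the p values are simulated (not Tom's data), and Benjamini-Hochberg adjusted p values from base R's p.adjust() replace qvalue(P)$qvalues. BH-adjusted values correspond to q-values under the conservative choice pi0 = 1, so they will generally be somewhat larger than what qvalue() reports. The names P, q, i and alpha are illustrative.

```r
set.seed(42)

# Simulated p values for illustration only: 900 null tests (uniform)
# and 100 non-null tests (concentrated near zero).
P <- c(runif(900), rbeta(100, 0.5, 20))

# BH-adjusted p values, a conservative stand-in for q-values (pi0 = 1).
# With the qvalue package installed, qvalue(P)$qvalues would be used instead.
q <- p.adjust(P, method = "BH")

# Case 1: the q-value attached to one observed p value, here the smallest.
i <- which.min(P)
q_i <- q[i]

# Case 2: the estimated FDR if every test with p <= alpha is called
# significant, even when alpha itself is not one of the observed p values.
alpha <- 0.05
fdr_at_alpha <- if (any(P <= alpha)) max(q[P <= alpha]) else NA_real_

print(c(q_i = q_i, fdr_at_alpha = fdr_at_alpha))
```

In both cases the result is read off a vector with one entry per observed p value, which is why the function returns as many q-values as p values it was given.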