
BH correction with p.adjust

4 messages · David Winsemius, Scott Robinson, Peter Dalgaard

#
Dear List,

I have been trying to use p.adjust() to do BH multiple-test correction and have gotten some unexpected results. I thought the formula for this was:

pBH = p * n / i

where p is the original p-value, n is the number of tests, and i is the rank of the p-value. However, when I try to recreate the corrected p for my most significant value, it does not match the one computed by p.adjust():
cg27433479 
0.05030589
cg27433479 
0.09269194 
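For reference, the single-value formula and p.adjust() can be compared side by side; here is a minimal sketch with made-up p-values (not the methylation data above):

```r
# Made-up p-values (not the real data); n tests, i = rank of each p-value.
p <- c(0.001, 0.01, 0.02, 0.03, 0.5)
n <- length(p)
i <- rank(p)
naive <- p * n / i                 # the pBH = p*n/i formula
bh <- p.adjust(p, method = "BH")
cbind(p, naive, bh)
# Here naive and bh agree because p*n/i happens to be increasing with i;
# when it is not, p.adjust enforces monotonicity and the two differ.
```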

I tried to recreate this in a small example with a vector of 5 p-values, but everything worked as expected there. I was wondering whether there is some subtle difference in how p.adjust() operates. Is there something more complicated about how 'n' or 'i' is calculated - perhaps because identical p-values are assigned the same rank, or something similar? Does anyone have an idea what might be going on here?

Many thanks,

Scott
#
On Jul 20, 2013, at 10:37 AM, Scott Robinson wrote:

Looking at the code for `p.adjust`, you see that the method is picked in a switch() call:

lp <- length(p)
...
switch(method,
    ...
    BH = {
        i <- lp:1L
        o <- order(p, decreasing = TRUE)
        ro <- order(o)
        pmin(1, cummin(n/i * p[o]))[ro]
    },
    ...)

(Here `n` is the `n` argument of p.adjust(), which defaults to length(p).)
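A quick trace of the BH branch on a tiny vector may make the bookkeeping clearer (a sketch with made-up values; `n` is taken as length(p), its default):

```r
p <- c(0.01, 0.04, 0.03, 0.002)
lp <- length(p)
n <- lp                             # default n argument of p.adjust()
i <- lp:1L                          # 4 3 2 1
o <- order(p, decreasing = TRUE)    # 2 3 1 4: largest p-value first
ro <- order(o)                      # undoes the sort, back to input order
adj <- pmin(1, cummin(n/i * p[o]))[ro]
adj                                 # 0.020 0.040 0.040 0.008
stopifnot(all.equal(adj, p.adjust(p, method = "BH")))
```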

You may not have sorted the p-values in pList.
No data was provided, so I am unable to pursue this further.

#
My understanding was that the vector is ranked, the adjusted p vector is calculated, and then the vector is returned in the original order - I came across a Stack Overflow answer saying this:

http://stackoverflow.com/questions/10323817/r-unexpected-results-from-p-adjust-fdr

Although the code there does not appear to be the same as what I see when I type "p.adjust" at the command line. The order shouldn't matter anyway, since my data was ordered by p.
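That the input order doesn't matter is easy to check directly (a quick sketch with random made-up data, not the real dataset):

```r
set.seed(1)                       # arbitrary seed, made-up data
p <- runif(20)
s <- sample(length(p))            # a random permutation of the indices
a1 <- p.adjust(p, method = "BH")[s]
a2 <- p.adjust(p[s], method = "BH")
all.equal(a1, a2)                 # TRUE: permuting the input permutes the output
```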

Yesterday I tried a short example of 5 numbers and it seemed to work out. Today I tried another short example, to demonstrate that the order of the input p vector doesn't matter, but didn't quite get a working example this time. Maybe it is due to rounding to the first significant figure or something?
first second   last 
 2e-02  5e-01  3e-04
[1] 0.015
[1] 0.5
[1] 3e-04
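One way to reproduce those numbers (an assumed reconstruction, since the call itself isn't shown): if the input were p = c(first = 0.01, second = 0.5, last = 1e-04) and the method argument were left at its default, p.adjust() would use Holm rather than BH, and the result would be exactly 0.02, 0.5, 3e-04:

```r
# Assumed inputs, reverse-engineered from the printed output above.
p <- c(first = 0.01, second = 0.5, last = 1e-04)
p.adjust(p)                  # default method = "holm": 0.02 0.5 3e-04
p.adjust(p, method = "BH")   # 0.015 0.5 3e-04 -- matches p*n/i here
```

If that reconstruction is right, the mismatch in this small example would come from comparing Holm output against hand-computed BH values, not from rounding.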

In any case, I reconstructed a large example, which can be run without real data, where the figure is way off and definitely not the result of a rounding error:
[1] 9999991
[1] 1e-07
[1] 0.1
[1] 0.9999991

Any help with this would be very much appreciated. It seems like it ought to be a simple and commonly used method, and yet I am struggling and not sure what to do about it.

Thanks,

Scott
#
On Jul 21, 2013, at 15:02 , Scott Robinson wrote:

            
Comparing the same method might help!
[1] 0.0150 0.5000 0.0003
You have the source code, how about reading it?

    }, BH = {
        i <- lp:1L
        o <- order(p, decreasing = TRUE)
        ro <- order(o)
        pmin(1, cummin(n/i * p[o]))[ro]
    }

Notice the cumulative minimum. The first element of n/i * p[o] is going to be p[n] == 0.1, since your p is in ascending order: p[o][1] is the largest p-value and n/i equals 1 there. So no element of the result is going to be bigger than 0.1. (I presume this is because p-value adjustments must be order-preserving.)
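The capping Peter describes can be reproduced at a smaller scale (a sketch with assumed data: one tiny p-value among many ties at 0.1, mirroring the large example's 1e-07 vs 0.1 vs 0.9999991 figures):

```r
# Assumed shape of the data: one small p-value among many ties at 0.1.
p <- c(0.002, rep(0.1, 100))      # n = 101
n <- length(p)
p[1] * n / 1                      # naive p*n/i for the smallest p: 0.202
p.adjust(p, method = "BH")[1]     # 0.1: the cummin caps it at max(p)
# The same mechanism at n = 9999991 with p[1] = 1e-07 gives 0.9999991 vs 0.1.
```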