
How to compute p-Values

8 messages · Andreas Klein, gregor rolshausen, Yinghai Deng +1 more

#
Hello.


How can I compute the bootstrap p-value for a one- and a two-sided test when I have a bootstrap sample of a statistic of size 1000, for example?

My hypotheses are, for example:

1. Two-sided: H0: mean = 0 vs. H1: mean != 0
2. One-sided: H0: mean >= 0 vs. H1: mean < 0



I hope you can help me


Thanks in advance


Regards,
Andreas
#
Andreas Klein wrote:
hi,
do you want to test your original t.test against t.tests of bootstrapped 
samples from your data?

if so, you can just write a function creating a vector with the 
statistics (t) of the single t.tests (in your case 1000 t.tests, each 
with a bootstrapped sample of your original data -> 1000 simulated 
t-values).
you extract them with:

 > tvalue <- t.test(a ~ factor)$statistic

then just calculate the proportion of t-values from your bootstrapped 
tests that are bigger than your original t-value:

 > p <- sum(simulated_tvalue > original_tvalue)/1000
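As a sketch of that recipe end to end (the one-sample setting, sample size, and variable names below are assumptions for illustration, not details from the original post):

```r
# Sketch of the bootstrap-t recipe above; one-sample setting assumed.
set.seed(1)
x <- rnorm(100)                         # stand-in for the original data
original_tvalue <- t.test(x)$statistic  # t-value of the original test

B <- 1000
simulated_tvalue <- numeric(B)
for (b in 1:B) {
  xb <- sample(x, length(x), replace = TRUE)   # bootstrap resample
  simulated_tvalue[b] <- t.test(xb)$statistic  # its t-value
}

# proportion of bootstrap t-values bigger than the original one
p <- sum(simulated_tvalue > original_tvalue) / B
```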


(or did I get the question wrong?)

cheers,
gregor
#
Hello.

What I wanted was:

I have a sample of 100 realizations of a random variable, and I want a p-value for the hypothesis that the mean of the sample equals zero (H0) or not (H1). That is the two-sided test.
The same question holds for a one-sided version, where I want to know whether the mean is greater than zero (H0) or less than or equal to zero (H1).

Therefore I draw a bootstrap sample with replacement from the original sample and compute the mean of that bootstrap sample. I repeat this 1000 times and obtain 1000 means.

Now: How can I compute the p-value for a one-sided and a two-sided test as described above?



Regards,
Andreas


--- gregor rolshausen <gregor.rolshausen at biologie.uni-freiburg.de> schrieb am Mi, 14.1.2009:
#
I read the problem a bit differently than Andreas. I thought you were  
trying to create a *substitute* for the parametric t-test.

A p-value is not a statement about a group of tests. It is a statement  
about one sample of data in comparison with the theoretical distribution (in  
the case of the parametric test) or, in your case, with the bootstrap  
distribution. You want to construct a CDF of your distribution of  
means/s.d. values and package it up in a form that would allow you to  
return the *proportion* of values (the "p-value") above one particular  
new sample value.

?ecdf  # information on how to turn 1000 realizations into a function; it's really pretty simple.

If your sample of potentially (but not necessarily) t-like statistics  
is tt, then ttCDF <- ecdf(tt) will print nothing but will result in ttCDF  
becoming a function. Then, with a sample value mean_a to test, you get  
useful results with:

ttCDF(mean_a)

Turning this into a "test" requires a bit more packaging, but I think  
the road ahead is clear.
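A minimal illustration of that ecdf mechanism (tt and mean_a here are stand-ins, not values from the thread):

```r
# ecdf() turns a sample into its empirical CDF, returned as a function.
set.seed(1)
tt <- rnorm(1000)   # stand-in for 1000 bootstrap statistics
ttCDF <- ecdf(tt)   # ttCDF is now a step function

mean_a <- 0.5                 # a hypothetical value to test
below <- ttCDF(mean_a)        # proportion of tt values <= mean_a
above <- 1 - ttCDF(mean_a)    # proportion above mean_a (an upper-tail proportion)
```

Note that ecdf() gives the proportion at or below a value, so the proportion above it is 1 minus that.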
#
I think what you need is just a count. For example, if you want to know the
p-value for the mean being bigger than 0 and you have 5 such cases in your
draws, then the p-value is 5/1000 = 0.005, right?
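That counting rule, written out (the draws below are hypothetical, constructed so that exactly 5 of 1000 fall in the tail):

```r
# Counting rule: the one-sided p-value is the tail proportion of draws.
mean_boot <- c(rep(-0.01, 5), rep(0.02, 995))  # hypothetical: 5 of 1000 below 0
p_one <- sum(mean_boot < 0) / length(mean_boot)
p_one  # 5/1000 = 0.005
```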

HTH
YHDENG

#
I think we are at the stage where it is your responsibility to provide  
some code to set up the problem.
#
Ok... I have set up the problem with some code, as requested:

Example:

x <- rnorm(100)

mean_x <- mean(x)

mean_boot <- numeric(1000)

for (i in 1:1000) {
  mean_boot[i] <- mean(sample(x, 100, replace = TRUE))
}


How can I compute the p-value from mean_boot for the following tests:

1. H0: mean_x = 0 vs. H1: mean_x != 0

2. H0: mean_x >= 0 vs. H1: mean_x < 0



Is there a possibility to construct such p-values, or did I get something wrong?

Someone told me that the p-value for the first (two-sided) test is p = 2 * min(sum(mean_boot >= 0)/1000, sum(mean_boot < 0)/1000), but I didn't get the idea behind it. Maybe someone can explain it, if it is the solution to the problem.
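For what it's worth, that formula can be run directly against the setup above (a sketch; the intuition behind it is that it takes the smaller of the two tail proportions around 0 and doubles it, so departures from H0 in either direction count):

```r
# The suggested two-sided rule, applied to the mean_boot setup above.
set.seed(1)
x <- rnorm(100)
mean_boot <- replicate(1000, mean(sample(x, 100, replace = TRUE)))

# smaller tail proportion around 0, doubled to cover both directions
p_two <- 2 * min(sum(mean_boot >= 0) / 1000, sum(mean_boot < 0) / 1000)
```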

Regards,
Andreas.



--- David Winsemius <dwinsemius at comcast.net> schrieb am Mi, 14.1.2009:
#
On Jan 14, 2009, at 4:27 PM, Andreas Klein wrote:

You should realize that you are conflating p-values and the Neyman-Pearson  
hypothesis-testing formalism. It's perfectly possible that your  
textbook is doing the same sort of conflation.
That does not sound correct. For one thing, no mention is made of  
comparing the mean of a particular sample (of the same size as the  
bootstrap samples) against the distribution of bootstrap means. For  
another, you want to know whether the sample mean is either less than  
the 0.025 quantile of the mean_boot distribution or greater than  
the 0.975 quantile. Perhaps your informant meant to construct the test  
as 2 * min(sum(mean_boot < mean_x)/1000, sum(mean_boot > mean_x)/1000).

In the HA: mean_x < 0 directed one-sided case, you are only interested  
in whether the mean is below the 0.05 quantile. For HA: mean_x > 0,  
you want to know whether mean_x is above the 0.95 quantile.

In my realization of mean_boot I get:

quantile(mean_boot, probs = c(0.025, 0.05, 0.95, 0.975))
       2.5%         5%        95%      97.5%
 -0.2057778 -0.1643545  0.1562328  0.1825198

Those are going to be your critical points for alpha = 0.05  
Neyman-Pearson tests of sorts 1 and 2. The outer numbers are for the  
two-sided alternative.

For calculation of the p-values, I still think you need ecdf, and  
probably Hmisc::inverseFunction as well. For p-values you need to go  
from the observed value back to the proportion that exceeds that value.
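Putting the pieces together, one way to get both p-values via ecdf (a sketch under the same simulated setup as earlier in the thread, not a definitive recipe):

```r
# Bootstrap p-values via the empirical CDF of the bootstrap means.
set.seed(1)
x <- rnorm(100)
mean_boot <- replicate(1000, mean(sample(x, 100, replace = TRUE)))
Fhat <- ecdf(mean_boot)   # empirical CDF as a function

# two-sided H0: mean = 0 -- twice the smaller tail proportion at 0
p_two <- 2 * min(Fhat(0), 1 - Fhat(0))

# one-sided H0: mean >= 0 vs. H1: mean < 0 -- upper tail at 0
p_one <- 1 - Fhat(0)
```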