p-val issue for ranked two-group test

On Oct 20, 2011, at 23:48 , Joshua Wiley wrote:

Hi,

It looks like you are trying to manually bootstrap.
Nope. It's a manually performed approximate Wilcoxon test. Which is fair enough if the object is to learn something. (Notice, however, that the ExactRankTests package eats this sort of problem for breakfast.)
You are correct, Peter.  Apologies, Laurel, for a hastie read and
reply to your email.
As for the actual code, the problem seems mainly to be unclear thinking. The most glaring problem is that "exptdiff" is never actually calculated, but even if it were, it wouldn't be ?what the comment says that it is. If it was, it would logically be zero since the "expected mean rank" is the same in both groups (under the null, but what else could be meant?). More likely the intention is to calculate the _observed_ difference in the mean rank, as is done for pdiff (what does "p" stand for??) inside the loop, but based on both.ranks rather than on x. With clarified thinking, I'd expect things to fall into place rather easily.

Take a look at:

require(boot)
?boot

as an added advantage of using boot instead of trying to do it
manually, you can easily parallelize. ?In fact, if you are using one
of the pre-release versions of 2.14.0, the new parallel package is
included by default and you do not even have to go venturing out into
the wide world of CRAN to look. ?That said, there are several aspects
of your code that could be readily vectorized. ?More specific details
supplied if a less homework-like example is provided.

Cheers,

Josh

On Thu, Oct 20, 2011 at 10:17 AM, Laurel Klein Serieys
<laurelklein at ucla.edu> wrote:
Hi-
I'm wondering if anyone can help me with my code. ?I'm coming up dry when I
try to get a p-value from the following code. ?If I make a histogram of my
resampled distribution, I find the difference between by groups to be
significant. ?I've ranked the data since I have outliers in one of my
groups.

mange= c(35, ?60, ?81, 158, 89, 130, ?90, ?38, 119, 137, ?52, ?30, ?27, 115,
123, ?31, 124, ?91)

healthy= c(46, 50, 30, 58, 32, 42, 42, 33, 19, 42, 30, 26, 38, 23, 16, 28,
42, 42, 33, 35, 51, 31, 39, 40 , 42, 38, 36, 39, 38)

l.mange<-length(mange)
l.healthy<-length(healthy)

exptdiff <- mean.mange - mean.healthy #the expected difference between
between the mean of the ranked groups

both.chemistry<-c(mange, healthy) #concatenate two vectors into one in
preparation for resampling the data

both.ranks<-rank(both.chemistry) #rank combined data in the case that there
are outlying values in the data or the dataset is small

reps=1000

z<-rep(NA,reps) # z will the the simulated storage value for the resampling
efforts

for(i in 1:reps){ #create the loop

x<- sample(both.ranks, length(both.ranks),replace=FALSE) #instructions for
how to resample where sample the entire combined data without replacment

p.mange<-mean(x[(1:l.mange)]) ?#create a simulate mean value for the
resampled mange values
p.healthy<-mean(x[(l.mange+1):(l.mange+l.healthy)]) ?#create a simulated
mean value for the resampled healthy values

pdiff<- p.mange-p.healthy #the simulated difference between groups

z[i]<- pdiff ?#the stored list of simulated differences
}
p=mean(z>=exptdiff)*2 #2-tailed test multiply by two
p

hist(z, xlab="Resample Values", main="Distribution for Two-Group BUN Test")
confints=quantile(z, c(0.025,0.975))
abline(v=confints, col="blue") #draw a line for each cutoffs
abline(v=exptdiff, col="red")

Thanks!
L.Serieys

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk ?Priv: PDalgd at gmail.com

p-val issue for ranked two-group test

Thread (4 messages)