p-val issue for ranked two-group test
On Thu, Oct 20, 2011 at 3:42 PM, peter dalgaard <pdalgd at gmail.com> wrote:
On Oct 20, 2011, at 23:48 , Joshua Wiley wrote:
Hi, It looks like you are trying to manually bootstrap.
Nope. It's a manually performed approximate Wilcoxon test. Which is fair enough if the object is to learn something. (Notice, however, that the ExactRankTests package eats this sort of problem for breakfast.)
You are correct, Peter. Apologies, Laurel, for a hastie read and reply to your email.
As for the actual code, the problem seems mainly to be unclear thinking. The most glaring problem is that "exptdiff" is never actually calculated, but even if it were, it wouldn't be ?what the comment says that it is. If it was, it would logically be zero since the "expected mean rank" is the same in both groups (under the null, but what else could be meant?). More likely the intention is to calculate the _observed_ difference in the mean rank, as is done for pdiff (what does "p" stand for??) inside the loop, but based on both.ranks rather than on x. With clarified thinking, I'd expect things to fall into place rather easily.
Take a look at: require(boot) ?boot as an added advantage of using boot instead of trying to do it manually, you can easily parallelize. ?In fact, if you are using one of the pre-release versions of 2.14.0, the new parallel package is included by default and you do not even have to go venturing out into the wide world of CRAN to look. ?That said, there are several aspects of your code that could be readily vectorized. ?More specific details supplied if a less homework-like example is provided. Cheers, Josh On Thu, Oct 20, 2011 at 10:17 AM, Laurel Klein Serieys <laurelklein at ucla.edu> wrote:
Hi-
I'm wondering if anyone can help me with my code. ?I'm coming up dry when I
try to get a p-value from the following code. ?If I make a histogram of my
resampled distribution, I find the difference between by groups to be
significant. ?I've ranked the data since I have outliers in one of my
groups.
mange= c(35, ?60, ?81, 158, 89, 130, ?90, ?38, 119, 137, ?52, ?30, ?27, 115,
123, ?31, 124, ?91)
healthy= c(46, 50, 30, 58, 32, 42, 42, 33, 19, 42, 30, 26, 38, 23, 16, 28,
42, 42, 33, 35, 51, 31, 39, 40 , 42, 38, 36, 39, 38)
l.mange<-length(mange)
l.healthy<-length(healthy)
exptdiff <- mean.mange - mean.healthy #the expected difference between
between the mean of the ranked groups
both.chemistry<-c(mange, healthy) #concatenate two vectors into one in
preparation for resampling the data
both.ranks<-rank(both.chemistry) #rank combined data in the case that there
are outlying values in the data or the dataset is small
reps=1000
z<-rep(NA,reps) # z will the the simulated storage value for the resampling
efforts
for(i in 1:reps){ #create the loop
x<- sample(both.ranks, length(both.ranks),replace=FALSE) #instructions for
how to resample where sample the entire combined data without replacment
p.mange<-mean(x[(1:l.mange)]) ?#create a simulate mean value for the
resampled mange values
p.healthy<-mean(x[(l.mange+1):(l.mange+l.healthy)]) ?#create a simulated
mean value for the resampled healthy values
pdiff<- p.mange-p.healthy #the simulated difference between groups
z[i]<- pdiff ?#the stored list of simulated differences
}
p=mean(z>=exptdiff)*2 #2-tailed test multiply by two
p
hist(z, xlab="Resample Values", main="Distribution for Two-Group BUN Test")
confints=quantile(z, c(0.025,0.975))
abline(v=confints, col="blue") #draw a line for each cutoffs
abline(v=exptdiff, col="red")
Thanks!
L.Serieys
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk ?Priv: PDalgd at gmail.com