
help on permutation/randomization test

6 messages · Meyners, Michael, Wenjin Mao, Greg Snow

#
Hi,

I have two groups of data of different size:
   group A: x1, x2, ...., x_n;
   group B: y1, y2, ...., y_m; (m is not equal to n)

The two groups are independent but observations within each group are
not independent,
 i.e., x1, x2, ..., x_n are not independent; but x's are independent from y's

I wonder whether a randomization test is still applicable in this case.
Does R have a function that can do this test for large m and n? I notice
that "permtest" can only handle small samples (m+n<22).

Thank you very much,
Wenjin
#
I suspect you need to give more information/background on the data (though this is not primarily an R-related question; you might want to try other resources instead). Unless I'm missing something here, I cannot think of ANY reasonable test: a permutation of the individual observations (using permtest or anything else) would destroy the correlation structure and hence give invalid results, and the assumptions of parametric tests are violated as well. Basically, you only have two independent observations, one for each group; with some goodwill you might consider the values within a group as repeated measurements, but still on the same subject. Hence there is no way to distinguish a subject effect from a treatment effect. There is not enough data to permute or to base a statistical test on. So unless you can get rid of the dependency within groups (or at least reasonably assume the observations to be independent), I'm not very optimistic...
HTH, Michael
#
If the x's that don't enter at the same time can be considered independent of each other, and only clusters that enter at the same time are dependent, then you can still do a permutation test by creating clusters with dependent values within each cluster, but independent between clusters, then permute the clusters rather than the individual data points.  This maintains the dependency.

I don't know of any existing functions that will do the whole thing for you, but this would only be a few lines of R code to do this type of permutation test.  The split function can help with separating the clusters, sample can do the permutations, and unlist or sapply can be used in calculating the statistic of interest.
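
The approach described above can be sketched roughly as follows. This is only a minimal illustration, not a tested package function: the name cluster_perm_test, its arguments, the choice of a difference in means as the statistic, and the simulated data are all assumptions made for the example. It also assumes that a cluster id (for instance, the entry time) is known for every observation, and it draws random permutations instead of enumerating all of them, so the sample size is not a limitation.

```r
# Cluster-level permutation test for a difference in group means (sketch).
# value:   numeric observations
# group:   group label ("A" or "B") for each observation
# cluster: cluster id for each observation; every cluster lies within one group
cluster_perm_test <- function(value, group, cluster, n_perm = 2000) {
  by_cluster <- split(value, cluster)   # list with one element per cluster
  # group label of each cluster, in the same order as by_cluster
  cl_group <- sapply(split(as.character(group), cluster), `[`, 1)

  # difference in means for a given assignment of clusters to groups
  stat <- function(g) {
    mean(unlist(by_cluster[g == "A"])) - mean(unlist(by_cluster[g == "B"]))
  }

  observed <- stat(cl_group)
  # shuffle the group labels across whole clusters, never within a cluster
  permuted <- replicate(n_perm, stat(sample(cl_group)))
  # two-sided Monte Carlo p-value, counting the observed statistic itself
  p_value <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
  list(observed = observed, p_value = p_value)
}

# Simulated example: 40 clusters of varying size, 20 per group,
# with a shared random effect inside each cluster (hypothetical data).
set.seed(1)
n_cl    <- 40
size    <- sample(5:15, n_cl, replace = TRUE)
cluster <- rep(seq_len(n_cl), times = size)
group   <- rep(rep(c("A", "B"), each = n_cl / 2), times = size)
value   <- rnorm(n_cl)[cluster] + rnorm(length(cluster))
res     <- cluster_perm_test(value, group, cluster)
```

Because whole clusters are relabelled, the dependence within each cluster is left intact in every permuted data set. The result is only as good as the assumption that different clusters are independent of each other.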

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Wenjin Mao
Sent: Tuesday, May 24, 2011 11:22 AM
To: Meyners, Michael
Cc: r-help at r-project.org
Subject: Re: [R] help on permutation/randomization test

Thank you, Michael.

I don't think those data for the same group can be treated as repeated
measurements. Let's say I have 1000 observations from group 1 and 1500 obs
from group 2. Some of the 1000 objects of group 1 entered the system at the
same time and may affect each other; the same holds for the other group. It's
hard to measure the strength of the dependency.

Even if the correlation can be reduced by some transformation, the R
function "permtest" cannot handle such a large sample size. Is there any
other R function I can use?

Thanks,
Wenjin
On Tue, May 24, 2011 at 1:37 AM, Meyners, Michael <meyners.m at pg.com> wrote:

            