
advice on joining replicates

On Fri, 2010-01-22 at 11:40 +0100, romunov wrote:
You have samples. Your replicates /are/ samples. The problem comes when
you try to assess the fitted "model" using permutation tests or
parametric theory (if any theory applies). Your data are replicated; you
don't have as many independent samples as the naive permutation test or
theory would presume. The answer is to use a model or a permutation test
that can take account of the dependences between your observations.

In your case, you have 10 stations and 5 replicates within each station.
It is reasonable to assume that the 5 replicate samples within each
station are correlated with one another. So in a permutation test of a
CCA, for example, on these data, we should not permute all the data
freely (i.e. exchange samples between stations) because under the null
hypothesis, for your data, the samples aren't freely exchangeable (i.e.
unrelated).

What we can do is condition the permutations on Station, so that we can
freely permute the samples within the stations but not exchange samples
between stations. This would be appropriate if you were testing the
effect of a covariate measured at the replicate (sample) level. This is
TEST1.
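
To make the idea of conditioning on Station concrete, here is a minimal base R sketch of a single permutation restricted within stations. It is not vegan's internal code; the helper name permute_within is made up for illustration, and the layout (10 stations, 5 replicates each, rows in Station order) is assumed from the description of the data.

```r
# Assumed layout: 10 stations x 5 replicates, rows in Station order
Station <- gl(10, 5, labels = paste("Station", 1:10))

# Hypothetical helper: return one permutation of the row indices that
# shuffles samples only within their own station, never between stations
permute_within <- function(strata) {
  idx <- seq_along(strata)
  # ave() applies FUN to the indices of each stratum and puts the
  # results back in the positions that stratum occupies
  ave(idx, strata, FUN = function(i) i[sample.int(length(i))])
}

set.seed(1)
perm <- permute_within(Station)

# Every permuted row still belongs to its original station
all(Station[perm] == Station)
```

Applying perm to the rows of the explanatory data (while leaving the species matrix alone) gives one realisation of the null hypothesis for a sample-level covariate.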

If you want to test the effect of a variable at the station level, then
we keep the replicates within stations fixed (i.e. we don't permute
those), but we do shuffle the stations. This is only possible if you
have equal numbers of replicates (samples) within stations. This is
TEST2.
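
As a rough illustration of shuffling whole stations while keeping the replicates inside each one fixed, the following base R sketch builds one such permutation. Again this is only a sketch under assumptions: permute_blocks is a hypothetical helper, not a vegan function, and it stops with an error if the stations do not all have the same number of replicates, which is the balance requirement mentioned above.

```r
# Assumed layout: 10 stations x 5 replicates
Station <- gl(10, 5, labels = paste("Station", 1:10))

# Hypothetical helper: permute stations as intact blocks
permute_blocks <- function(strata) {
  # Row indices of each station, in their original within-station order
  rows <- split(seq_along(strata), strata)
  # Blocks can only be swapped if all stations have equal numbers of rows
  stopifnot(length(unique(lengths(rows))) == 1)
  # Shuffle the order of the stations, not the rows inside them
  shuffled <- rows[sample.int(length(rows))]
  # Put the rows of each shuffled station into the positions of an
  # original station, preserving the within-station order
  perm <- integer(length(strata))
  perm[unlist(rows, use.names = FALSE)] <- unlist(shuffled, use.names = FALSE)
  perm
}

set.seed(1)
perm <- permute_blocks(Station)

# One row per station position: each row is an intact block of 5 samples
matrix(perm, ncol = 5, byrow = TRUE)
```

Because the replicates travel together, the number of distinct permutations is governed by the number of stations (10! here), not by the number of samples.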

At the moment in package vegan, we allow for free permutation of all
samples (which would be wrong for your data) and for free permutation
within the levels of 'strata' for a CCA/RDA or related model. If you are
unsure, check if the function takes a 'strata' argument for the
permutations. Permutation within 'strata' would be suitable for you in
the case of TEST1 above. Currently, we don't support TEST2, although we
are working on it
and are quite close now to having all the code in place that we need to
do some quite complex permutation designs, along the lines of those
available in Canoco.

So, if TEST1 is applicable to your science question (testing the effect
of a variable at the sample level, i.e. something that is not constant
at the station level), then we could do the following. I use Y as a
species matrix, and x as my explanatory variable (in data frame foo).
Station is a factor variable indicating which station each sample
belongs to. I further assume (when generating Station) that the samples
in Y and foo are in the same order and that they are in Station order. If
they aren't in Station order, then you have to generate the Station
factor some other way:

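library(vegan)
## 10 stations x 5 replicates, samples in Station order
Station <- gl(10, 5, labels = paste("Station", 1:10))
mod <- cca(Y ~ x, data = foo)
## permute samples within stations only (TEST1)
permutest(mod, strata = Station)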
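
If the samples are not in Station order, one way to get the factor is to build it from a per-sample label instead of gl(). The sketch below assumes you have such a label for every row; the vector name station_id and its simulated contents are hypothetical, standing in for whatever column records the station in your own data.

```r
# Hypothetical per-sample station labels, deliberately not in Station order
set.seed(42)
station_id <- sample(rep(paste("Station", 1:10), each = 5))

# Build the factor from the labels; set the levels explicitly so the
# order is Station 1..10 and not alphabetical ("Station 10" before "Station 2")
Station <- factor(station_id, levels = paste("Station", 1:10))

# Check the design: each station should appear 5 times
table(Station)
```

The only requirement is that element i of Station describes row i of Y and of foo; the rows themselves can be in any order.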

If you need TEST2 or your samples within Stations are not freely
exchangeable (i.e. the replicates form a time series within each station
or are from a transect within the station) then currently you can't do
this in vegan without hacking the code, but this will be available in
the near future.

HTH

G