Small sample Size; repeated measurements binomial glmer

3 messages · Quentin Schorpp, Ben Bolker

#
Hello,

I searched a lot on the internet, but I didn't find sufficient information.
I believe I've got a very simple study design; however, some of its
characteristics push it to the limit of what is feasible.

I have two sampling campaigns, autumn year 1 and autumn year 2.
I sampled five agricultural fields of different ages, but each age_class
has only 3 replicates.
My response is the proportion of fungal-feeding species.

I am interested in the effect of age class (an increase over time) and in
whether this effect also shows up between the two sampling times, since I
should then be able to observe an increase over the 1-year interval.

My Model is:
glmer(response ~ age_class * autumn + (1 | field), family = binomial,
      weights = total_individuals,  # total number of individuals per sample
      data = data)

However, I have the following problems:

1 - My N = 30, but my N(group) = 3
2 - I don't know the power of my analysis
3 - I'm not able to drop outliers from the data (or am I?)
4 - My random factor has only 2 levels, so N(random) = 2

I think Bolker et al. (2008) and Zuur et al. (2009) say something to the
effect that there is no need to use a random factor when N(random) = 2.

Since I am quite confused about my options for handling patterns in the
residuals of the above model, I'm asking for your opinions. Have I
chosen the right model formulation?

I think I'd feel more confident with a non-parametric test, something
like a rank-based estimation of mixed-effects nested models (rlme
package), but I did not find a single example of how to use it with
repeated measures (the same goes for PERMANOVA), or something else.
At the least, I need to report an ANOVA table and pairwise comparisons.

yours sincerely,
Quentin
#
On Thu, Nov 5, 2015 at 7:16 AM, Quentin Schorpp
<quentin.schorpp at ti.bund.de> wrote:
Each field has a different, i.e. unique, age (e.g. field 1 = 1.5,
field 2 = 2.3, field 3 = 3, field 4 = 7, field 5 = 10)? 3 samples each
in 2 autumns? (This would be 5 x 3 x 2 = 30, but I'm not sure if
that's the actual experimental design ...)

That doesn't really matter.

To run a power analysis you need to decide what effect sizes you're
expecting.  There aren't simple canned power  analyses for mixed models
like ?power.t.test in base R, but

library("sos"); findFn("lme4 power analysis")

finds the hamlet, longpower, odprism, multiRR, pamm ... packages ...
or look at https://rpubs.com/bbolker/11703 ...
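
A minimal simulation sketch of such a power analysis, not taken from any of
the packages above. Everything numeric here is an assumption to be replaced
by your own expectations: 15 fields (5 age classes x 3 fields) sampled in 2
autumns, 50 individuals per sample, a baseline of -1 on the logit scale, a
per-class increase `slope`, and a between-field standard deviation
`field_sd`. For simplicity it tests only the additive age_class effect with
a likelihood-ratio test (which can be anti-conservative at this sample
size), and it uses the cbind(successes, failures) form of the binomial
response instead of a proportion with weights.

```r
# Simulate one data set under the assumed design and effect size.
simulate_design <- function(n_class = 5, n_rep = 3, n_ind = 50,
                            slope = 0.3, field_sd = 0.5) {
  n_field <- n_class * n_rep
  age_num <- rep(seq_len(n_class), each = n_rep)   # age class of each field
  u <- rnorm(n_field, mean = 0, sd = field_sd)     # field-level random intercepts
  d <- data.frame(
    field     = factor(rep(seq_len(n_field), times = 2)),
    age_class = factor(rep(age_num, times = 2)),
    autumn    = factor(rep(1:2, each = n_field))
  )
  # Linear predictor on the logit scale (hypothetical intercept of -1).
  eta <- -1 + slope * (rep(age_num, times = 2) - 1) + rep(u, times = 2)
  d$total  <- n_ind
  d$fungal <- rbinom(nrow(d), size = d$total, prob = plogis(eta))
  d
}

# Fraction of simulated data sets in which the age_class effect is detected.
estimate_power <- function(nsim = 200, alpha = 0.05,
                           slope = 0.3, field_sd = 0.5) {
  p_values <- numeric(nsim)
  for (i in seq_len(nsim)) {
    d <- simulate_design(slope = slope, field_sd = field_sd)
    full <- suppressMessages(suppressWarnings(lme4::glmer(
      cbind(fungal, total - fungal) ~ age_class + autumn + (1 | field),
      data = d, family = binomial)))
    null <- suppressMessages(suppressWarnings(lme4::glmer(
      cbind(fungal, total - fungal) ~ autumn + (1 | field),
      data = d, family = binomial)))
    # Likelihood-ratio test of the full model against the null model.
    lrt <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(null)))
    df  <- attr(logLik(full), "df") - attr(logLik(null), "df")
    p_values[i] <- pchisq(lrt, df = df, lower.tail = FALSE)
  }
  mean(p_values < alpha)
}
```

Calling estimate_power() for a range of `slope` values gives a rough power
curve; a few hundred simulations per value is usually enough for a first
impression.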

Why not?

I'm confused. You have 'field' as your grouping variable above.
I thought you said you had 5 fields?
Indeed, if you have fewer than about 5 groups, random-effect estimates
are going to be low-power/unreliable (unless you do something fancy like
impose a Bayesian prior on the variance).
#
Hello,

Thank you very much for your answer,
I'm so sorry, what I wrote was misleading: I did not sample five
agricultural fields, but 15 agricultural fields of five different
age classes, hence each age_class was replicated 3 times.

And yes, my total number of samples is 30: each of the 15 fields was
sampled twice. Therefore each field (i.e. subject) appears twice in the
data.

I thought that, since I compare five different populations (the
age classes), the sample size would be the number of real replicates,
i.e. 3. If I dropped an outlier, I would then reduce the number of
replicates (i.e. the sample size) to only 2.

But if the total sample size is what matters, then it would be 30 and I'd
have one problem fewer, because I would be able to drop some outliers.

Things become even more complicated, since one category was not sampled
repeatedly over time: the three fields of category five were replaced in
the second year by three different fields, so this group actually has
6 replicates. I think this is called partly nested.
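
One way to picture this layout is the following sketch. The field numbers
are made up purely for illustration: fields 1-12 stand for age classes 1-4
and are sampled in both autumns, while fields 13-15 (autumn 1) and 16-18
(autumn 2) stand for the switched fields of age class 5.

```r
# Age classes 1-4: three fields each, every field sampled in both autumns.
repeated <- expand.grid(field = 1:12, autumn = 1:2)
repeated$age_class <- (repeated$field - 1) %/% 3 + 1

# Age class 5: fields 13-15 in autumn 1, replaced by fields 16-18 in autumn 2.
switched <- data.frame(field = 13:18,
                       autumn = rep(1:2, each = 3),
                       age_class = 5)

design <- rbind(repeated, switched)
design$field     <- factor(design$field)
design$autumn    <- factor(design$autumn)
design$age_class <- factor(design$age_class)

nrow(design)                            # number of samples
nlevels(design$field)                   # number of distinct fields
table(table(design$field))              # fields sampled once vs. twice
with(design, table(age_class, autumn))  # samples per age class and autumn
```

Under these assumptions there are still 30 samples, but 18 distinct fields,
six of which contribute only a single observation to the (1 | field) term.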

I'm going to read the literature you recommended. I am very curious
what degree of reliability I can expect from this kind of
experimental design.

What's your opinion: can I use the model the way it is, or should I
analyse each time point separately, or take the mean of both?
Do you know a tutorial or a book about including Bayesian priors
in binomial GLMMs that you could recommend?

Again, thank you very much for taking the time to answer my
questions. If I may, I'd like to say that, by writing comments on
Stack Overflow, you have contributed more than anybody else to a deeper
understanding of GLMMs for so many people. That is incredible;
thank you very much for that.

Quentin