Dear Quentin,
Please keep the mailing list in cc.
Dropping non significant terms from an ordered factor is not ok. That
would change the interpretation of the factor. You wouldn't drop non
significant levels of an unordered factor either.
Ben's solution is about multiple comparisons with random (and fixed)
effects. You're only dealing with multiple comparisons with fixed
effects. So glht() will do the trick.
Try plotting the predicted values for all relevant combinations of the
fixed effects. I find that easier to interpret than just a bunch of
coefficients.
Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
2015-04-27 9:30 GMT+02:00 Quentin Schorpp <quentin.schorpp at ti.bund.de
<mailto:quentin.schorpp at ti.bund.de>>:
Dear Thierry,
thank you very much for your advice, i didn't hear about the
offset() argument, yet. I'll try this immidiately.
My first intention was, that the large error bars are due to the
high variability and the low number of replicates (3), which stays
true, i think. The point was, that i just couldn't explain, why
the output was so weird, despite that the model stated
significance and 0 was not within the confidence intervals.
Now, it makes me confident to have the advice of using glht from
you. I was kind of unsecure, regarding the glht procedure,
because i read this post on stack overflow by Ben Bolker:
http://stackoverflow.com/questions/25949701/post-hoc-test-in-generalised-linear-mixed-models-how-to-do
Next I'll give up treating age_class as an ordered factor,
although i found the idea quite interesting.
Do you know where to get information about using ordered factors
in a glmm?
The output was not significant for the cubic term in my case, and
i asked myself if i could/should skip it from the model.
Probably i could use samplingCampaign as ordered factor.
One thing I'm still interested in, is a tutorial that shows how to
present results from more than one univariate analysis in a way to
have it ready for publication.
In my opinion eyerthing, even in the books, is about modeling
procedure, validation, cryptic names of important coefficients. I
know cooking with an 0815 recipe is dangerous, but that's not what
I'm looking for.
Thank you again for your help!!
Quentin
Am 26.04.2015 um 20:37 schrieb Thierry Onkelinx:
Dear Quentin,
- You better use an offset if you want to express the model in
terms of m?. Just add offset(log(0.25)) to the model.
- I'd rather treat samplingCamping as a factor.
- You can get post hoc comparisons with the multcomp package. See
the example below.
library(glmmADMB)
Owls$Interaction <- interaction(Owls$FoodTreatment, Owls$SexParent)
om <- glmmadmb(SiblingNegotiation~
Interaction+(1|Nest)+offset(log(BroodSize)),zeroInflation=TRUE,family="nbinom",data=Owls)
library(multcomp)
pairwise <- glht(om, mcp(Interaction = "Tukey"))
pairwise.ci <http://pairwise.ci> <- confint(pairwise)
library(ggplot2)
ggplot(pairwise.ci <http://pairwise.ci>, aes(y = lhs, x =
exp(estimate), xmin = exp(lwr), xmax = exp(upr))) +
geom_errorbarh() + geom_point() + geom_vline(xintercept = 1)
Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for
Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality
Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be
no more than asking him to perform a post-mortem examination: he
may be able to say what the experiment died of. ~ Sir Ronald
Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer
does not ensure that a reasonable answer can be extracted from a
given body of data. ~ John Tukey
2015-04-24 15:08 GMT+02:00 Quentin Schorpp
<quentin.schorpp at ti.bund.de <mailto:quentin.schorpp at ti.bund.de>>:
Hello Everyone,
I'm using glmmadmb() to model abundance data (counts) of soil
organisms
(e.g. earthworms). My design covers agricultural fields of five
different age classes . Every age-class has three field
replicates.
Additionally every field was sampled 3times during the
investigation
(atumn, spring, autumn -> sampling campaign).
with 4 samples taken randomly at each field (1 Sample = 0.25 m?).
Several environmental parameters were assessed for each field
but never
for one of the four samples explicitly.
Hence the environmental data is often redundant for the samples,
especially when climatic measurements were even similar for
more ththan
one field.
My hypothesis is that abundance increases with the age_class
My model is:
model <- glmmadmb(abundance ~ age_class + samplingCampaign +
environmental1 + env2 + env3 + (1|field), data, family="poisson")
age_class = ordered factor
sampling campaign = continous, difference to the first
sampling in days
(first sampling always 0)
env(1-n) = continous
total number of samples = 180
Dispersion factor is 1.45
I do model validation with
1. plot pearson residuals against fitted values
2. plot pearson residuals against each covariate in the model
3. make a histogram of the residuals
In my opinion everything looks ok.
Now I have the really really big problem: *I just don't know
how to
present the results.*
I'd like to do a barplot with mean abundances for age_class
and standard
errors and do Post Hoc tukey test to look at differences
between the
factor levels. But i just don't know how to to these Post-Hoc
tests.
I've got one approach for extracting predictions and standard
errors for
predictions using a test dataset with mean environmental
variables:
test.data=expand.grid(age_class=levels(data$age_class),
samplingCampaign = data$samcam),
env1 = mean(data$env1),
env2 = max(data$env2))
pred.abundance <- cbind(test.data,
predict(model, test.data,
type="link", se.fit=TRUE),
abundance.response =
predict(model, test.data, type="response"))
pred.anc <- within(pred.abundance, {
anc <- 4*exp(fit)
LL <- 4*exp(fit -
1.96 * se.fit)
UL <- 4*exp(fit +
1.96 *
se.fit) })
Then I make a plot and get INCREDIBLY large standard errors
and in
contrast to the boxplot of the predicitons
(plot(data$age_class,predict(model, type="response")), the
abundance is
not increasing with the age_class. I multiply by 4 since i
want to
present the results per m?
Do you know where the mistake is?
I would appreciate if you could help me with this analysis,
since I'm
trying to learn GLMM for more than a year and i can't ask a
real person
here at this Institution. Thanks in advance,
Quentin
--
Quentin Schorpp, M.Sc.
Th?nen Institute of Biodiversity
Bundesallee 50
38116 Braunschweig (Germany)
Tel: +49 531 596-2524 <tel:%2B49%20531%20596-2524>
Fax: +49 531 596-2599 <tel:%2B49%20531%20596-2599>
Mail: quentin.schorpp at ti.bund.de
<mailto:quentin.schorpp at ti.bund.de>
Web: http://www.ti.bund.de
The Johann Heinrich von Th?nen Institute, Federal Research
Institute for Rural Areas, Forestry and Fisheries ? Th?nen
Institute in brief ?
consists of 15 specialized institutes that carry out research
and provide policy advice in the fields of economy, ecology
and technology.
[[alternative HTML version deleted]]