R-sig-mixed-models Digest, Vol 136, Issue 41
On 27 April 2018 at 22:33, Nicolas Flaibani <n.flaiba at hotmail.com> wrote:
Hi everyone! I would like to join the discussion of the second topic (Specifying Multiple Random Effects in NLME by Dan) but in my case I?m going to talk about the lme4 package, instead of the nmle. I think that the question is still relevant. I?ve had a doubt for a long time about how to investigate the interactions between random and fixed effects. I?ve read a lot of forums, papers and help?s packages and I?ve always concluded that the correct form of testing the interaction between a random and a fixed variable was: Model1 <- lmer (Y ~ X1 + X2 + (X1 | Random Variable) However, I found in some forums and personal communications from several researchers that there is another way to investigate the interaction between random and fixed variables and has the following syntax: Model2 <- lmer (Y ~ X1 + X2 + (1 | Random Variable: X1)
The 'classical' (think DOE and 'variance-components') interaction is Model2 if X1 is categorical/factor and Model1 if X1 is continuous (then Model1 is sometimes called the 'random coefficient model'). If X1 is continuous Model2 doesn't make sense - on the other hand Model1 can be fitted if X1 is a factor. In that case Model1 is rather complex and the number of parameters grows rapidly with the number of levels of X1 - in comparison the random term in Model2 uses just one (variance) parameter. While some people often favour 'Model1 with factor X1'- construction, I often think it is difficult to explain what this model actually means; it is also very often overfitting since it requires a lot of data and special data-generating mechanism to support models of that form.
I understand that this syntax (1|Random Variable/X1) = (1|Random Variable)+(1|Random Variable:X1) specify a nested structure between the variables and this is not the case of interest.
This construction serves two purposes: one is when the random-effect variables have a 'truly' nested structure such as pupils in classes (in schools, in districts, etc). The other purpose often applies to designed experiments where you might have machines (fixed) and operators (random). The main effects model is then Model3 <- lmer(Y ~ machines + (1 | operators)) and a model that includes the interaction reads Model3b <- lmer(Y ~ machines + (1 | operators) + (1 | machines:operators)) Technically the 'machines:operators' combinations are nested in 'operators' but we usually don't think of it that way. The point here is that we need to also consider Model3b as an alternative to Model1 and Model2. Of these models, Model3 _and_ Model2 are the simplest while Model1 is the most complex with Model3b in the middle often serving as an appropriate compromise.
My particular question is whether the syntax of the Model 2 is correct to test interactions between random and fixed variables. If this model is correct, which are the differences with the syntax of Model 1, since the resulting models are clearly different? Besides, in coincidence with the question of Dan (?Are there cases where one vs. the other formulation should absolutely be used? My understanding that for continuous variables, e.g., multiple measurements across multiple days, Days|Object would be the correct syntax. But here we're talking about a factor variable?), I ask if one type of syntax should be used if the fixed variables are continuous or there are factors.
If I compare the summary from a model with the latter syntax (model 2), with the summary of the same analysis made with a statistic program (like Statistica), the results are very similar. That?s not the case with the model 1.
For example, if I analyze a morphological trait with the syntax
M2 <- lmer (Wing ~ Temperature * Sex + Temperature + Sex + (1 | Line) + (1 | Line:Sex:Temperature) + (1 | Line:Sex) + (1 | Line:Temperature))
the summary is the following:
Random effects:
Groups Name Variance Std.Dev.
Line:Sex:Temperature (Intercept) 14.6231 3.8240
Line:Temperature (Intercept) 154.7685 12.4406
Line:Sex (Intercept) 0.6947 0.8335
Line (Intercept) 72.5945 8.5202
Residual 180.0664 13.4189
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 501.141 2.268 96.940 221.009 < 2e-16 ***
Temperature25 -57.960 2.699 54.800 -21.473 < 2e-16 ***
SexM -53.639 1.001 96.260 -53.584 < 2e-16 ***
Temperature25:SexM -6.488 1.391 48.300 -4.663 2.49e-05 ***
I found that the function rand() from the lmerTest package gives me the p values of the random effects if I write the model like this:
rand(M2)
Analysis of Random effects Table:
Chi.sq Chi.DF p.value
Line 4.6152 1 0.03 *
Line:Sex:Temperature 30.8130 1 3e-08 ***
Line:Sex 0.0391 1 0.84
Line:Temperature 112.1539 1 <2e-16 ***
I don?t know if this function is reliable because it is not mentioned for testing the significance of the random effects in the page https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
Having written rand (=ranova) my views may be biased but I should say that it _is_ reliable. That is, I haven't seen cases where it's not doing what was intended ;-) rand() and ranova() simply compares two models using anova(m1, m2, refit=FALSE) where m2 differs from m1 by having one of its random-effect terms reduced or removed.
The summary of the same analysis made with the Statistica is: Effect SS Degr. of MS Den.Syn. Den.Syn. F p Intercept Fixed 764579151 1 764579151 48,001 12655,91 60412,83 0,000000 Line Random 608181 48 12670 47,800 6489,64 1,95 0,011254 Sex Fixed 3138661 1 3138661 48,038 495,62 6332,81 0,000000 Temperature Fixed 3686660 1 3686660 48,003 6459,62 570,72 0,000000 Line*Sex Random 23808 48 496 48,000 473,30 1,05 0,435866 Line*Temperature Random 310413 48 6467 48,000 473,30 13,66 0,000000 Sex*Temperature Fixed 10075 1 10075 48,040 472,94 21,30 0,000029 Line*Sex*Temperature Random 22718 48 473 3696,000 167,33 2,83 0,000000 Error 618467 3696 167 But if I write the model with the other syntax: M1 <- lmer(Wing ~ Temperature * Sex + (Temperature * Sex | Line))
Writing the model as M1 <- lmer(Wing ~ Temperature * Sex + (0 + Temperature:Sex | Line)) is often preferable as it makes the random-effect variance-covariance matrix easier to interpret.
the summary is the following:
REML criterion at convergence: 31440.9
Random effects:
Groups Name Variance Std.Dev. Corr
Line (Intercept) 266.78 16.333
Temperature25 398.27 19.957 -0.60
SexM 41.54 6.446 -0.56 0.46
Temperature25:SexM 61.34 7.832 0.56 -0.61 -0.80
Residual 167.33 12.936
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 501.603 2.371 48.046 211.586 < 2e-16 ***
Temperature25 -58.423 2.911 48.027 -20.070 < 2e-16 ***
SexM -53.659 1.095 47.964 -49.023 < 2e-16 ***
Temperature 25:SexM -6.470 1.393 48.278 -4.644 2.66e-05 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
In addition, if I apply the ?rand function? for this syntax (M1), it doesn?t retourn the whole p-values of the random effects (Do not give me p value for line and line*Temperature*Sex)
Use lmerTest::ranova(M1, reduce.terms=FALSE) to achieve a test for the entire random-effect term (and make sure you have lmerTest version >= 3.0-0 installed).
Analysis of Random effects Table:
Chi.sq Chi.DF p.value
Temperatura:L?nea 0.00e+00 0 1
Sexo:L?nea 1.46e-10 0 <2e-16 ***
I really appreciate your time and dedication for answering this questions. Thank you for trying to help us understand a little more about the syntax of these complex models and thus better understand their correct approach.
You are not alone if you think this is complicated. My students are for the most part challenged in simply writing up the mathematical formula that correctly represents models such as Model1 - transitioning from scalar random-effect terms (e.g. Model3b) to vector-valued random-effect terms (e.g. Model1) often takes time and dedication. Cheers Rune
Thank you very much for your time everyone.
Greetings,
Nicolas
----------------------------------------------------------------------------------------
Message: 2
Date: Wed, 25 Apr 2018 16:11:38 -0400
From: Dan <ieshan at gmail.com>
To: "R-SIG-Mixed-Models at R-project.org"
<R-sig-mixed-models at r-project.org>, Ben Bolker <bbolker at gmail.com>
Subject: [R-sig-ME] Specifying Multiple Random Effects in NLME
Message-ID:
<CAET4i1f-SH5zA6xcCxNXc091HCk+snMv+rFm0tf995yYukiCOw at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi all:
I am curating an off-list thread about specifying multiple random effects
in NLME.
1. If it's (1|Object) + (1|Object:Coating) that you want then
you should be able to use a nested specification (which nlme *can*
handle relatively easily), i.e. something like
random=a+b+c~1|Object/Coating
Although (Coating|Object) and (1|Object:Coating) both in some sense
represent "interactions" the latter is *much* simpler/more parsimonious.
If you're OK with 1|Object:Coating rather than Coating|Object it
should be *much* faster. If you don't understand the distinction (which
would be absolutely fine and understandable) can we resume the
discussion on r-sig-mixed-models ... ?
-----------
So:
random=a+b+c~Coating|Object
does not fit.
But:
random=a+b+c~Object/Coating
fits.
Can you better explain the distinction here? I have sometimes used the
1|Object:Coating + 1|Object syntax and sometimes the Coating|Object syntax
in other models. My experience/understanding is that the former syntax with
multiple "within subject" variables produces exactly matching output to the
standard "repeated measures ANOVA" with the lmer assumption of compound
symmetry.
Are there cases where one vs. the other formulation should absolutely be
used? My understanding that for continuous variables, e.g., multiple
measurements across multiple days, Days|Object would be the correct syntax.
But here we're talking about a factor variable.
2. I'm trying to read the "random" section for nlme right now but it's
kind of making my head explode (and I think there's a typo: " the same
as the order of the order of the elements in the list"). It *sounds*
like (1) explicitly creating an interaction
ObjCoating=interaction(Object,Coating) and (2) using something like
list(ObjCoating=a~1,Object=b~1,Object=c~1)
should work (grouping factors as names, then [right-hand-side variable
name]~[random effects model], but I'm worried about the phrase "The
order of nesting will be assumed the same as the order of the elements
in the list": what nesting?
-----------
I think that formulation is explicitly in order. I replaced your first
ObjCoating with simply Object, just to test what would happen:
Random effects:
Formula: a ~ 1 | Object
a.(Intercept)
StdDev: 1.305816
Formula: b ~ 1 | Object %in% Object
b.(Intercept)
StdDev: 0.01576521
Formula: c ~ 1 | Object %in% Object %in% Object
c.(Intercept) Residual
StdDev: 2.677883 2.219676
[[alternative HTML version deleted]]
*****
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models