Skip to content

R-sig-mixed-models Digest, Vol 136, Issue 41

2 messages · Nicolas Flaibani, Rune Haubo

#
Hi everyone!


I would like to join the discussion of the second topic (Specifying Multiple Random Effects in NLME by Dan) but in my case I?m going to talk about the lme4 package, instead of the nmle. I think that the question is still relevant.

I?ve had a doubt for a long time about how to investigate the interactions between random and fixed effects. I?ve read a lot of forums, papers and help?s packages and I?ve always concluded that the correct form of testing the interaction between a random and a fixed variable was:


Model1 <- lmer (Y ~ X1 + X2 + (X1 | Random Variable)


However, I found in some forums and personal communications from several researchers that there is another way to investigate the interaction between random and fixed variables and has the following syntax:


Model2 <- lmer (Y ~ X1 + X2 + (1 | Random Variable: X1)


I understand that this syntax


(1|Random Variable/X1) = (1|Random Variable)+(1|Random Variable:X1)


specify a nested structure between the variables and this is not the case of interest.

My particular question is whether the syntax of the Model 2 is correct to test interactions between random and fixed variables. If this model is correct, which are the differences with the syntax of Model 1, since the resulting models are clearly different? Besides, in coincidence with the question of Dan (?Are there cases where one vs. the other formulation should absolutely be used? My understanding that for continuous variables, e.g., multiple measurements across multiple days, Days|Object would be the correct syntax. But here we're talking about a factor variable?), I ask if one type of syntax should be used if the fixed variables are continuous or there are factors.

If I compare the summary from a model with the latter syntax (model 2), with the summary of the same analysis made with a statistic program (like Statistica), the results are very similar. That?s not the case with the model 1.

For example, if I analyze a morphological trait with the syntax


M2 <- lmer (Wing ~ Temperature * Sex + Temperature + Sex + (1 | Line) +   (1 | Line:Sex:Temperature) + (1 | Line:Sex) + (1 | Line:Temperature))


the summary is the following:


Random effects:

 Groups                               Name        Variance Std.Dev.

 Line:Sex:Temperature  (Intercept)  14.6231  3.8240

 Line:Temperature          (Intercept) 154.7685 12.4406

 Line:Sex                         (Intercept)   0.6947  0.8335

 Line                                 (Intercept)  72.5945  8.5202

 Residual                                             180.0664 13.4189


Fixed effects:

                    Estimate Std. Error      df t value                         Pr(>|t|)

(Intercept)                  501.141      2.268  96.940 221.009       < 2e-16 ***

Temperature25        -57.960      2.699  54.800 -21.473         < 2e-16 ***

SexM                         -53.639      1.001  96.260 -53.584         < 2e-16 ***

Temperature25:SexM   -6.488      1.391  48.300          -4.663 2.49e-05 ***


I found that the function rand() from the lmerTest package gives me the p values of the random effects if I write the model like this:
Analysis of Random effects Table:


                         Chi.sq Chi.DF p.value

Line                    4.6152      1    0.03 *

Line:Sex:Temperature  30.8130      1   3e-08 ***

Line:Sex                             0.0391      1    0.84

Line:Temperature          112.1539      1  <2e-16 ***


I don?t know if this function is reliable because it is not mentioned for testing the significance of the random effects in the page https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html

The summary of the same analysis made with the Statistica is:




Effect

SS

Degr. of

MS

Den.Syn.

Den.Syn.

F

p

Intercept

Fixed

764579151

1

764579151

48,001

12655,91

60412,83

0,000000

Line

Random

608181

48

12670

47,800

6489,64

1,95

0,011254

Sex

Fixed

3138661

1

3138661

48,038

495,62

6332,81

0,000000

Temperature

Fixed

3686660

1

3686660

48,003

6459,62

570,72

0,000000

Line*Sex

Random

23808

48

496

48,000

473,30

1,05

0,435866

Line*Temperature

Random

310413

48

6467

48,000

473,30

13,66

0,000000

Sex*Temperature

Fixed

10075

1

10075

48,040

472,94

21,30

0,000029

Line*Sex*Temperature

Random

22718

48

473

3696,000

167,33

2,83

0,000000

Error

618467

3696

167




But if I write the model with the other syntax:

M1 <- lmer(Wing ~ Temperature * Sex + (Temperature * Sex | Line))



the summary is the following:



REML criterion at convergence: 31440.9



Random effects:

 Groups   Name                Variance Std.Dev. Corr

 Line    (Intercept)           266.78   16.333

  Temperature25             398.27   19.957   -0.60

   SexM                                41.54    6.446   -0.56  0.46

  Temperature25:SexM  61.34    7.832    0.56 -0.61 -0.80

 Residual                             167.33   12.936



Fixed effects:

                                             Estimate Std. Error         df t value             Pr(>|t|)

(Intercept)                          501.603      2.371            48.046 211.586  < 2e-16 ***

Temperature25               -58.423      2.911               48.027 -20.070  < 2e-16 ***

SexM                                 -53.659      1.095              47.964 -49.023  < 2e-16 ***

Temperature                    25:SexM   -6.470               1.393  48.278  -4.644 2.66e-05 ***

---

Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1



In addition, if I apply the ?rand function? for this syntax (M1), it doesn?t retourn the whole p-values of the random effects (Do not give me p value for line and line*Temperature*Sex)

Analysis of Random effects Table:

                                             Chi.sq Chi.DF p.value

Temperatura:L?nea 0.00e+00      0       1

Sexo:L?nea             1.46e-10      0  <2e-16 ***


I really appreciate your time and dedication for answering this questions. Thank you for trying to help us understand a little more about the syntax of these complex models and thus better understand their correct approach.

Thank you very much for your time everyone.



Greetings,

Nicolas


----------------------------------------------------------------------------------------
Message: 2
Date: Wed, 25 Apr 2018 16:11:38 -0400
From: Dan <ieshan at gmail.com>
To: "R-SIG-Mixed-Models at R-project.org"
        <R-sig-mixed-models at r-project.org>, Ben Bolker <bbolker at gmail.com>
Subject: [R-sig-ME] Specifying Multiple Random Effects in NLME
Message-ID:
        <CAET4i1f-SH5zA6xcCxNXc091HCk+snMv+rFm0tf995yYukiCOw at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all:

I am curating an off-list thread about specifying multiple random effects
in NLME.

1.  If it's (1|Object) + (1|Object:Coating) that you want then
you should be able to use a nested specification (which nlme *can*
handle relatively easily), i.e. something like

random=a+b+c~1|Object/Coating


Although (Coating|Object) and (1|Object:Coating) both in some sense
represent "interactions" the latter is *much* simpler/more parsimonious.

If you're OK with 1|Object:Coating rather than Coating|Object it
should be *much* faster.  If you don't understand the distinction (which
would be absolutely fine and understandable) can we resume the
discussion on r-sig-mixed-models ... ?
-----------

So:
random=a+b+c~Coating|Object
does not fit.

But:
random=a+b+c~Object/Coating
fits.

Can you better explain the distinction here? I have sometimes used the
1|Object:Coating + 1|Object syntax and sometimes the Coating|Object syntax
in other models. My experience/understanding is that the former syntax with
multiple "within subject" variables produces exactly matching output to the
standard "repeated measures ANOVA" with the lmer assumption of compound
symmetry.

Are there cases where one vs. the other formulation should absolutely be
used? My understanding that for continuous variables, e.g., multiple
measurements across multiple days, Days|Object would be the correct syntax.
But here we're talking about a factor variable.


2.   I'm trying to read the "random" section for nlme right now but it's
kind of making my head explode (and I think there's a typo: " the same
as the order of the order of the elements in the list").  It *sounds*
like (1) explicitly creating an interaction
ObjCoating=interaction(Object,Coating) and (2) using something like

  list(ObjCoating=a~1,Object=b~1,Object=c~1)

should work (grouping factors as names, then [right-hand-side variable
name]~[random effects model], but I'm worried about the phrase "The
order of nesting will be assumed the same as the order of the elements
in the list": what nesting?
-----------

I think that formulation is explicitly in order. I replaced your first
ObjCoating with simply Object, just to test what would happen:

Random effects:
 Formula: a ~ 1 | Object
        a.(Intercept)
StdDev:      1.305816

 Formula: b ~ 1 | Object %in% Object
        b.(Intercept)
StdDev:    0.01576521

 Formula: c ~ 1 | Object %in% Object %in% Object
        c.(Intercept) Residual
StdDev:      2.677883 2.219676





*****
#
On 27 April 2018 at 22:33, Nicolas Flaibani <n.flaiba at hotmail.com> wrote:
The 'classical' (think DOE and 'variance-components') interaction is
Model2 if X1 is categorical/factor and Model1 if X1 is continuous
(then Model1 is sometimes called the 'random coefficient model').

If X1 is continuous Model2 doesn't make sense - on the other hand
Model1 can be fitted if X1 is a factor. In that case Model1 is rather
complex and the number of parameters grows rapidly with the number of
levels of X1 - in comparison the random term in Model2 uses just one
(variance) parameter. While some people often favour 'Model1 with
factor X1'- construction, I often think it is difficult to explain
what this model actually means; it is also very often overfitting
since it requires a lot of data and special data-generating mechanism
to support models of that form.
This construction serves two purposes: one is when the random-effect
variables have a 'truly' nested structure such as pupils in classes
(in schools, in districts, etc). The other purpose often applies to
designed experiments where you might have machines (fixed) and
operators (random). The main effects model is then

Model3 <- lmer(Y ~ machines + (1 | operators))

and a model that includes the interaction reads

Model3b <- lmer(Y ~ machines + (1 | operators) + (1 | machines:operators))

Technically the 'machines:operators' combinations are nested in
'operators' but we usually don't think of it that way.

The point here is that we need to also consider Model3b as an
alternative to Model1 and Model2. Of these models, Model3 _and_ Model2
are the simplest while Model1 is the most complex with Model3b in the
middle often serving as an appropriate compromise.
Having written rand (=ranova) my views may be biased but I should say
that it _is_ reliable. That is, I haven't seen cases where it's not
doing what was intended ;-) rand() and ranova() simply compares two
models using anova(m1, m2, refit=FALSE) where m2 differs from m1 by
having one of its random-effect terms reduced or removed.
Writing the model as
M1 <- lmer(Wing ~ Temperature * Sex + (0 + Temperature:Sex | Line))
is often preferable as it makes the random-effect variance-covariance
matrix easier to interpret.
Use lmerTest::ranova(M1, reduce.terms=FALSE) to achieve a test for the
entire random-effect term (and make sure you have lmerTest version >=
3.0-0 installed).
You are not alone if you think this is complicated. My students are
for the most part challenged in simply writing up the mathematical
formula that correctly represents models such as Model1 -
transitioning from scalar random-effect terms (e.g. Model3b) to
vector-valued random-effect terms (e.g. Model1) often takes time and
dedication.

Cheers
Rune