
R-sig-mixed-models Digest, Vol 136, Issue 41

Hi, Rune,
Thank you very much for answering my question so quickly (R-sig-mixed-models Digest, Vol 136, Issue 41). These days I've been reading the mails you have sent about how to investigate the interaction between random and fixed effects, and how to estimate the variance components in these models.
To put the question and examples in context, let me briefly describe the research subject of my laboratory. The line of work developed here studies different characters (morphological, physiological and behavioural) in several species of Drosophila. Because it is easy to maintain, Drosophila is ideal for generating isolines and obtaining accurate estimates of genetic components. For example, one line of work involves the study of thorax or wing length (as estimators of body size) in flies raised at different temperatures across diverse genetic lines. One of our main interests is the interaction between the fixed-effect variable (i.e. Temperature) and the random-effect variable (Line), to study the genotype-by-environment interaction. In this sense, to further compare different populations, we are interested in estimating the variance components (qualitatively, as percentages of variance explained by Line and by the Line*Temperature interaction).

As you said, "Model1 can be fitted if X1 is a factor. In that case Model1 is rather complex and the number of parameters grows rapidly with the number of levels of X1 - in comparison the random term in Model2 uses just one (variance) parameter", so first I am going to simplify the model to try to understand it better.
So, as a first option, I decided to try a model like Model 1:
mt <- lmer(Torax ~ Temperature + Sex + (Temperature | Line), data)
Random effects:
 Groups   Name          Variance Std.Dev. Corr
 Line     (Intercept)   49.27    7.019
          Temperature25 52.80    7.266    -0.55
 Residual               19.21    4.383
Number of obs: 780, groups: Line, 49
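(To make Rune's point about parameter growth concrete for myself: a random term (f | g), where f is a factor with k levels, estimates a k x k symmetric covariance matrix, i.e. k*(k+1)/2 variance-covariance parameters. A small Python illustration of my own, just to do the counting:)

```python
# Number of variance-covariance parameters in a random term (f | g)
# where the random-effects vector per group has k components:
# a k x k symmetric covariance matrix has k*(k+1)/2 free parameters.
def n_cov_params(k):
    return k * (k + 1) // 2

# (Temperature | Line) with two temperatures -> intercept + one slope,
# a 2x2 covariance: 2 variances + 1 correlation = 3 parameters,
# matching the mt output above.
print(n_cov_params(2))   # 3

# With, say, 5 temperature levels the count already jumps to 15:
print(n_cov_params(5))   # 15
```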

In addition, to restate my interest in knowing how much variance is explained by the Line*Temperature interaction and by Line, I arrived at these models:
mtb <- lmer(Torax ~ Temperature + Sex + (0 + Temperature | Line), data)   # AIC = 4814.43
Random effects:
 Groups   Name          Variance Std.Dev. Corr
 Line     Temperature17 49.27    7.019
          Temperature25 46.38    6.811    0.45
 Residual               19.21    4.383
Number of obs: 780, groups: Line, 49

mtb2 <- lmer(Torax ~ Temperature + Sex + (Temperature || Line), data)   # AIC = 4816.43
Random effects:
 Groups   Name          Variance Std.Dev. Corr
 Line     (Intercept)   22.41    4.734
 Line.1   Temperature17 26.86    5.182
          Temperature25 23.97    4.896    -0.04
 Residual               19.21    4.383
Number of obs: 780, groups: Line, 49
      df      AIC
mt     7 4814.428
mtb    7 4814.428
mtb2   8 4816.428
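(I think I can see numerically why mt and mtb have identical df and AIC: (Temperature | Line) stores (intercept at 17 °C, offset to 25 °C) while (0 + Temperature | Line) stores (effect at 17 °C, effect at 25 °C), and the two are related by a fixed linear transform, so they are exact reparameterizations. A small numpy sketch of my own using the printed mt values; the small gap to mtb's 46.38 comes from the correlation being printed to only two decimals:)

```python
import numpy as np

# mt parameterization: b = (intercept at T17, offset T25 - T17)
sd = np.array([7.019, 7.266])
corr = -0.55
Sigma_b = np.outer(sd, sd) * np.array([[1.0, corr],
                                       [corr, 1.0]])

# mtb parameterization: (effect at T17, effect at T25) = T @ b
T = np.array([[1.0, 0.0],
              [1.0, 1.0]])
Sigma_u = T @ Sigma_b @ T.T

# Effect-scale variances implied by mt's estimates:
print(np.round(Sigma_u.diagonal(), 2))  # [49.27 45.96], vs 49.27 and 46.38 in mtb
```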

They have the same fixed effects; the difference resides in how they express the variance components. So, if I'm interested in obtaining the whole set of variance components (Line and Line*Temperature), do I need to run the mtb2 model? Are the other models not useful? Is the difference between the three models only in how we interpret and how they display the variance?
On one hand, is the difference between mtb and mtb2 the existence of one extra variance parameter related to Line? (If I want to estimate the variance for each component, would it be Line = 22.41 and Line:Temperature = 26.86 + 23.97, or does this last sum not make sense?)
It is very difficult for me to understand how to interpret the variance components in the models mt and mtb. In addition, in your first answer to Jung you said: "First, '(recipe || replicate)' is the same as/expands to '(1 | replicate) + (0 + recipe | replicate)' which is just an over-parameterized version of '(0 + recipe | replicate)', which again is a re-parameterized version of '(recipe | replicate)'. These are all representing the same model (all on 10 df though lmer() is misled and thinks that m_zcp has 11 df)". I don't understand why you said that the model (recipe || replicate) is over-parameterized if it has the same parameters as (0 + recipe | replicate). Even though I notice different amounts of estimated variance among the variance components, I do not see this worsening the model through an excess of estimated parameters. I do not know if I am explaining myself.
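(Trying to convince myself about the over-parameterization with a small numpy sketch of my own, with made-up numbers: the expansion (1 | g) + (0 + f | g) implies a between-level covariance Sigma = s0^2 * J + Sigma_f, with J the all-ones matrix. Since Sigma_f is already a fully general covariance matrix, the extra s0^2 adds a parameter without enlarging the set of covariances the model can represent — two different parameter sets give the same Sigma:)

```python
import numpy as np

# (1 | g) + (0 + f | g) with a 2-level factor f implies
#   Sigma = s0^2 * J + Sigma_f,  J = all-ones 2x2 matrix.
J = np.ones((2, 2))

# One parameter set: shared intercept variance plus a general
# slope covariance (values invented for illustration).
s0sq = 4.0
Sigma_f = np.array([[10.0, 3.0],
                    [3.0, 8.0]])
Sigma = s0sq * J + Sigma_f

# A second parameter set with s0^2 = 0: fold the intercept
# variance into the slope covariance instead.
Sigma_alt = 0.0 * J + (Sigma_f + s0sq * J)

# Same implied covariance from different parameters:
# the extra parameter is redundant, i.e. over-parameterization.
print(np.allclose(Sigma, Sigma_alt))  # True
```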
On the other hand, going back to your initial advice, "The point here is that we need to also consider Model3b as an alternative to Model1 and Model2. Of these models, Model3 _and_ Model2 are the simplest while Model1 is the most complex with Model3b in the middle often serving as an appropriate compromise", I ran a model as:
mt3 <- lmer(Torax ~ Sex + Temperature + (1 | Line:Temperature), data)   # AIC = 4820.112
Random effects:
 Groups           Name        Variance Std.Dev.
 Line:Temperature (Intercept) 47.81    6.915
 Residual                     19.21    4.383
Number of obs: 780, groups: Line:Temperature, 98

However, this syntax does not let me see all the variance components (it does not show Line), so I decided to try another syntax:
mt3b <- lmer(Torax ~ Sex + Temperature + (1 | Line) + (1 | Temperature:Line), data)
Random effects:
 Groups           Name        Variance Std.Dev.
 Temperature:Line (Intercept) 26.40    5.138
 Line             (Intercept) 21.42    4.628
 Residual                     19.21    4.383
Number of obs: 780, groups: Temperature:Line, 98; Line, 49

If I compare all models:
      df      AIC
mt     7 4814.428
mtb    7 4814.428
mtb2   8 4816.428
mt3    5 4820.112
mt3b   6 4812.476

and focus only on those that show me all the variance components (Line and Line*Temperature, i.e. mtb2 and mt3b), can I choose the best model as a function of AIC?
Going back to the general objective of estimating the variance components and evaluating the interaction involving the random variable, my question is: the difference between the last two models is that mtb2 estimates two more parameters, which correspond, on one hand, to splitting the Line*Temperature interaction variance into two components (Line*Temperature17 and Line*Temperature25) and, on the other hand, to the correlation between those components. If this is correct, then according to my objective, is mt3b more parsimonious? In addition, shouldn't the sum of the variance explained by Line*Temperature17 and Line*Temperature25 (if that sum makes any sense) be similar or identical to the variance explained by Temperature:Line? Because when I compare those values, I don't see that happening.
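(While writing this, I tried to reconcile the numbers myself, assuming mt3b's compound-symmetry structure is an adequate constraint on mtb. If it is, then Var(Line) should correspond to the covariance between the two temperature effects in mtb, and Var(Line:Temperature) to the average variance minus that covariance — not to the sum of the two variances, which is perhaps why my comparison failed. Checking the arithmetic in Python with the printed mtb values:)

```python
import math

# From mtb: (0 + Temperature | Line)
var17, var25, corr = 49.27, 46.38, 0.45
cov = corr * math.sqrt(var17) * math.sqrt(var25)

# Under compound symmetry (mt3b), approximately:
#   Var(Line)             = covariance between the two temperature effects
#   Var(Line:Temperature) = average variance minus that covariance
line_var = cov
int_var = (var17 + var25) / 2 - cov

print(round(line_var, 2))  # 21.51, vs 21.42 reported for Line in mt3b
print(round(int_var, 2))   # 26.31, vs 26.40 reported for Temperature:Line
```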

Finally, in your last mail you enumerated different models that can be compared. I don't understand the difference between model fm5 and model fm6c; I'd appreciate it if you could clarify it a bit.

Thank you so much again for your patience and dedication.
Cheers,
Nicolás.
Message-ID: <BN6PR1701MB173204741094A070E933AEBB8E9A0@BN6PR1701MB1732.namprd17.prod.outlook.com>
In-Reply-To: <CAG_uk934grrjf-oD8dW1hgSFFgakooULVaSrOWUm7z9WYd2W1g@mail.gmail.com>