correlation between random effects - R-SIG-mixed-models

Tue, Feb 13, 2018 2:43 AM #

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7
observations per species, which is a bit small for a random slope
model. What worries me is that the summary of the data indicates
several species with > 20 observation. Hence you will have lot of
species with only 1 or 2 observations. A species with only 2
observations, a small difference in dB1 and a large difference in MC
will likely result in a large random slope for dB1. You'll need to
investigate which species have a strong random slope and why. Most of
the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the
predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy? : mardi 13 f?vrier 2018 10:26
? : Jana Dlouha <jana.dlouha at inra.fr>
Cc : r-sig-mixed-models at r-project.org
Objet : Re: [R-sig-ME] correlation between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a lot
in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Jana Dlouha

Tue, Feb 13, 2018 7:29 AM #

Dear Thierry,

You are right, some species are represented only by one or two specimens, actually we did not want to use the mixed-effect models but the reviewers of our paper asked us to do that - but if I understand well from what you say, it is maybe not very smart considering the structure of our sample?
I am struggling to know which species is behaving differently - is there any efficient method to visualize that? I have plotted the random effects using plot_model function but not able to change the y_axis in order to be able to read it, with 600 Species everything is overlapped...
Thanks in advance

Jana

-----Message d'origine-----
De?: Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy??: mardi 13 f?vrier 2018 11:44
??: Jana Dlouha <jana.dlouha at inra.fr>
Cc?: r-sig-mixed-models at r-project.org
Objet?: Re: [R-sig-ME] correlation between random effects

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7 observations per species, which is a bit small for a random slope model. What worries me is that the summary of the data indicates several species with > 20 observation. Hence you will have lot of species with only 1 or 2 observations. A species with only 2 observations, a small difference in dB1 and a large difference in MC will likely result in a large random slope for dB1. You'll need to investigate which species have a strong random slope and why. Most of the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? : mardi 
13 f?vrier 2018 10:26 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc : 
r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation 
between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN 
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie 
& Kwaliteitszorg / Team Biometrics & Quality Assurance 
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

//////////////////////////////////////////////////////////////////////
///////////////////// To call in the statistician after the experiment 
is done may be no more than asking him to perform a post-mortem 
examination: he may be able to say what the experiment died of. ~ Sir 
Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger 
Brinner The combination of some data and an aching desire for an 
answer does not ensure that a reasonable answer can be extracted from 
a given body of data. ~ John Tukey 
//////////////////////////////////////////////////////////////////////
/////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a lot 
in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker

Tue, Feb 13, 2018 9:34 AM #

you can use as.data.frame(ranef(fitted_model)) to extract the random
effects as a data frame, then do anything you want to look at the most
extreme species ...

On Tue, Feb 13, 2018 at 10:29 AM, Jana Dlouha <jana.dlouha at inra.fr> wrote:

Dear Thierry,

You are right, some species are represented only by one or two specimens, actually we did not want to use the mixed-effect models but the reviewers of our paper asked us to do that - but if I understand well from what you say, it is maybe not very smart considering the structure of our sample?
I am struggling to know which species is behaving differently - is there any efficient method to visualize that? I have plotted the random effects using plot_model function but not able to change the y_axis in order to be able to read it, with 600 Species everything is overlapped...
Thanks in advance

Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy? : mardi 13 f?vrier 2018 11:44
? : Jana Dlouha <jana.dlouha at inra.fr>
Cc : r-sig-mixed-models at r-project.org
Objet : Re: [R-sig-ME] correlation between random effects

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7 observations per species, which is a bit small for a random slope model. What worries me is that the summary of the data indicates several species with > 20 observation. Hence you will have lot of species with only 1 or 2 observations. A species with only 2 observations, a small difference in dB1 and a large difference in MC will likely result in a large random slope for dB1. You'll need to investigate which species have a strong random slope and why. Most of the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? : mardi
13 f?vrier 2018 10:26 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc :
r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation
between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie
& Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

//////////////////////////////////////////////////////////////////////
///////////////////// To call in the statistician after the experiment
is done may be no more than asking him to perform a post-mortem
examination: he may be able to say what the experiment died of. ~ Sir
Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger
Brinner The combination of some data and an aching desire for an
answer does not ensure that a reasonable answer can be extracted from
a given body of data. ~ John Tukey
//////////////////////////////////////////////////////////////////////
/////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a lot
in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Thierry Onkelinx

Wed, Feb 14, 2018 12:46 AM #

I agree with Ben. Look at 20 species with the most extreme random
effects. Plot those, one at a time. See what is happening for those
and decide what would be a sensible way to deal with it. That could be
to alter the model or to remove data which doesn't contain the
information needed by the model. E.g. all species with less than x
observations.

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////




2018-02-13 18:34 GMT+01:00 Ben Bolker <bbolker at gmail.com>:

  you can use as.data.frame(ranef(fitted_model)) to extract the random
effects as a data frame, then do anything you want to look at the most
extreme species ...

On Tue, Feb 13, 2018 at 10:29 AM, Jana Dlouha <jana.dlouha at inra.fr> wrote:

Dear Thierry,

You are right, some species are represented only by one or two specimens, actually we did not want to use the mixed-effect models but the reviewers of our paper asked us to do that - but if I understand well from what you say, it is maybe not very smart considering the structure of our sample?
I am struggling to know which species is behaving differently - is there any efficient method to visualize that? I have plotted the random effects using plot_model function but not able to change the y_axis in order to be able to read it, with 600 Species everything is overlapped...
Thanks in advance

Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy? : mardi 13 f?vrier 2018 11:44
? : Jana Dlouha <jana.dlouha at inra.fr>
Cc : r-sig-mixed-models at r-project.org
Objet : Re: [R-sig-ME] correlation between random effects

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7 observations per species, which is a bit small for a random slope model. What worries me is that the summary of the data indicates several species with > 20 observation. Hence you will have lot of species with only 1 or 2 observations. A species with only 2 observations, a small difference in dB1 and a large difference in MC will likely result in a large random slope for dB1. You'll need to investigate which species have a strong random slope and why. Most of the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? : mardi
13 f?vrier 2018 10:26 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc :
r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation
between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie
& Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

//////////////////////////////////////////////////////////////////////
///////////////////// To call in the statistician after the experiment
is done may be no more than asking him to perform a post-mortem
examination: he may be able to say what the experiment died of. ~ Sir
Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger
Brinner The combination of some data and an aching desire for an
answer does not ensure that a reasonable answer can be extracted from
a given body of data. ~ John Tukey
//////////////////////////////////////////////////////////////////////
/////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a lot
in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Jana Dlouha

Thu, Feb 15, 2018 4:48 AM #

Hi all again,

Thanks to Thierry and Ben for their advices and sorry to answer so late, I was out of the lab yesterday.

I looked at the 20 species with the most extreme random effects and unfortunately, it is not systematically the least represented species, one has 13 repetitions so I think it is not possible to exclude them based on the quantitative criterion. There are probably some measurement problems but not related specifically to some species or families so...is there any way to handle it? What did you mean by altering the model?

Best regards
Jana

-----Message d'origine-----
De?: Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy??: mercredi 14 f?vrier 2018 09:47
??: Ben Bolker <bbolker at gmail.com>
Cc?: Jana Dlouha <jana.dlouha at inra.fr>; r-sig-mixed-models at r-project.org
Objet?: Re: [R-sig-ME] correlation between random effects

I agree with Ben. Look at 20 species with the most extreme random effects. Plot those, one at a time. See what is happening for those and decide what would be a sensible way to deal with it. That could be to alter the model or to remove data which doesn't contain the information needed by the model. E.g. all species with less than x observations.

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 18:34 GMT+01:00 Ben Bolker <bbolker at gmail.com>:

  you can use as.data.frame(ranef(fitted_model)) to extract the random 
effects as a data frame, then do anything you want to look at the most 
extreme species ...

On Tue, Feb 13, 2018 at 10:29 AM, Jana Dlouha <jana.dlouha at inra.fr> wrote:

Dear Thierry,

You are right, some species are represented only by one or two specimens, actually we did not want to use the mixed-effect models but the reviewers of our paper asked us to do that - but if I understand well from what you say, it is maybe not very smart considering the structure of our sample?
I am struggling to know which species is behaving differently - is there any efficient method to visualize that? I have plotted the random effects using plot_model function but not able to change the y_axis in order to be able to read it, with 600 Species everything is overlapped...
Thanks in advance

Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? :
mardi 13 f?vrier 2018 11:44 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc
: r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation
between random effects

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7 observations per species, which is a bit small for a random slope model. What worries me is that the summary of the data indicates several species with > 20 observation. Hence you will have lot of species with only 1 or 2 observations. A species with only 2 observations, a small difference in dB1 and a large difference in MC will likely result in a large random slope for dB1. You'll need to investigate which species have a strong random slope and why. Most of the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team
Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

/////////////////////////////////////////////////////////////////////
////////////////////// To call in the statistician after the
experiment is done may be no more than asking him to perform a
post-mortem examination: he may be able to say what the experiment
died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not
data. ~ Roger Brinner The combination of some data and an aching
desire for an answer does not ensure that a reasonable answer can be
extracted from a given body of data. ~ John Tukey
/////////////////////////////////////////////////////////////////////
//////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? : 
mardi
13 f?vrier 2018 10:26 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc :
r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation 
between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN 
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team 
Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance 
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel 
www.inbo.be

////////////////////////////////////////////////////////////////////
// ///////////////////// To call in the statistician after the 
experiment is done may be no more than asking him to perform a 
post-mortem
examination: he may be able to say what the experiment died of. ~ 
Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger 
Brinner The combination of some data and an aching desire for an 
answer does not ensure that a reasonable answer can be extracted 
from a given body of data. ~ John Tukey 
////////////////////////////////////////////////////////////////////
//
/////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a 
lot in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Thierry Onkelinx

Thu, Feb 15, 2018 5:38 AM #

Dear Jana,

Can you provide the data? Without the data we can only hypothesise on
what might be wrong. It will be a lot easier when we can look at the
data ourselves. It is OK to anonymize the data. Or send it privately.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////




2018-02-15 13:48 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all again,

Thanks to Thierry and Ben for their advices and sorry to answer so late, I was out of the lab yesterday.

I looked at the 20 species with the most extreme random effects and unfortunately, it is not systematically the least represented species, one has 13 repetitions so I think it is not possible to exclude them based on the quantitative criterion. There are probably some measurement problems but not related specifically to some species or families so...is there any way to handle it? What did you mean by altering the model?

Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be]
Envoy? : mercredi 14 f?vrier 2018 09:47
? : Ben Bolker <bbolker at gmail.com>
Cc : Jana Dlouha <jana.dlouha at inra.fr>; r-sig-mixed-models at r-project.org
Objet : Re: [R-sig-ME] correlation between random effects

I agree with Ben. Look at 20 species with the most extreme random effects. Plot those, one at a time. See what is happening for those and decide what would be a sensible way to deal with it. That could be to alter the model or to remove data which doesn't contain the information needed by the model. E.g. all species with less than x observations.

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ///////////////////////////////////////////////////////////////////////////////////////////

2018-02-13 18:34 GMT+01:00 Ben Bolker <bbolker at gmail.com>:

  you can use as.data.frame(ranef(fitted_model)) to extract the random
effects as a data frame, then do anything you want to look at the most
extreme species ...

On Tue, Feb 13, 2018 at 10:29 AM, Jana Dlouha <jana.dlouha at inra.fr> wrote:

Dear Thierry,

You are right, some species are represented only by one or two specimens, actually we did not want to use the mixed-effect models but the reviewers of our paper asked us to do that - but if I understand well from what you say, it is maybe not very smart considering the structure of our sample?
I am struggling to know which species is behaving differently - is there any efficient method to visualize that? I have plotted the random effects using plot_model function but not able to change the y_axis in order to be able to read it, with 600 Species everything is overlapped...
Thanks in advance

Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? :
mardi 13 f?vrier 2018 11:44 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc
: r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation
between random effects

Dear Jana,

Please keep the mailing list in cc.

I meant both centering and scaling.

Based on the summary of the model, you have on average 3.7 observations per species, which is a bit small for a random slope model. What worries me is that the summary of the data indicates several species with > 20 observation. Hence you will have lot of species with only 1 or 2 observations. A species with only 2 observations, a small difference in dB1 and a large difference in MC will likely result in a large random slope for dB1. You'll need to investigate which species have a strong random slope and why. Most of the time that is obvious once you plotted the data for that species.
Tip: plot the observations, the fitted values of the model and the predictions using only the fixed effects.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team
Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

/////////////////////////////////////////////////////////////////////
////////////////////// To call in the statistician after the
experiment is done may be no more than asking him to perform a
post-mortem examination: he may be able to say what the experiment
died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not
data. ~ Roger Brinner The combination of some data and an aching
desire for an answer does not ensure that a reasonable answer can be
extracted from a given body of data. ~ John Tukey
/////////////////////////////////////////////////////////////////////
//////////////////////

2018-02-13 11:23 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Dear Thierry,

Thanks a lot for your reply. Yes, I have used dB1c also in the random effect.
I have just tried to scale dB1 but I have still the same problem. However, it is possible that I am not doing things well, I am not a statistician and moreover I am just discovering R...
You say that I should provide more information so here is the summary of my data for the two columns I am using in this model:
Species         MCs             dB1
327     :43     Min.    40.05   Min.    :1.050
135     :35     1st Qu. 72.53   1st Qu. :1.400
307     :24     Median  89.11   Median  :1.560
146     :23     Mean    99.56   Mean    :1.671
328     :23     3rd Qu. 116.23  3rd Qu. :1.840
341     :22     Max.    351.49  Max.    :4.220
(Other):2051

How I centered and scaled dB1:
dB1c<-scale(data$dB1,center=TRUE)
dB1s<-scale(data$dB1,center=FALSE, scale=TRUE)

summary of the model without centering or scaling dB1:
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1 + (1 + dB1 | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4200  -151.2   <2e-16 ***
dB1          98.0060     0.2800 271.1900   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
    (Intr)
dB1 -0.987

summary of the centered model m4c:

summary(m4c)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1c + (1 + dB1c | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 1.109    1.053
          dB1c        1.763    1.328    0.94
 Residual             6.190    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  99.54109    0.08466 290.67000    1176   <2e-16 ***
dB1c         38.78838    0.11081 271.18000     350   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1c 0.575

and summary of the scaled model m4s:

summary(m4s)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: MCs ~ dB1s + (1 + dB1s | Species)
   Data: data

     AIC      BIC   logLik deviance df.resid
 10720.4  10754.7  -5354.2  10708.4     2215

Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.7305 -0.3491  0.0651  0.4508  6.6068

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1s        33.22    5.763    -1.00
 Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)
(Intercept) -64.2618     0.4249 273.4100  -151.2   <2e-16 ***
dB1s        168.3687     0.4810 271.1800   350.0   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
     (Intr)
dB1s -0.987

Thanks in advance for your help!
Best regards
Jana

-----Message d'origine-----
De : Thierry Onkelinx [mailto:thierry.onkelinx at inbo.be] Envoy? :
mardi
13 f?vrier 2018 10:26 ? : Jana Dlouha <jana.dlouha at inra.fr> Cc :
r-sig-mixed-models at r-project.org Objet : Re: [R-sig-ME] correlation
between random effects

Dear Jana,

I assume that you uses the centered dB1c both in the random and the fixed effects? Another thing you can try is to scale dB1c. Using sensible units is often sufficient. Don't use large units (e.g.
kilometers) when you are measuring small things (e.g. millimeters).

You'll need to provide more information when you need more feedback.
At least the summary of the data and the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team
Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

////////////////////////////////////////////////////////////////////
// ///////////////////// To call in the statistician after the
experiment is done may be no more than asking him to perform a
post-mortem
examination: he may be able to say what the experiment died of. ~
Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger
Brinner The combination of some data and an aching desire for an
answer does not ensure that a reasonable answer can be extracted
from a given body of data. ~ John Tukey
////////////////////////////////////////////////////////////////////
//
/////////////////////




2018-02-13 10:00 GMT+01:00 Jana Dlouha <jana.dlouha at inra.fr>:

Hi all,

I have a problem with a correlation between random effects. I have tested several models on my data:
m0<-lm(MCs~ dB1, data)
m1<- lmer(MCs~ dB1 + (1|Species), data, REML=FALSE)
m2 <- lmer(MCs~ dB1 + (-1+dB1|Species), data, REML=FALSE)
m3<- lmer(MCs~ dB1 + (1|Species)+(0+dB1|Species), data, REML=FALSE)
m4<- lmer(MCs ~ dB1 + (1+dB1 |Species), data,REML=FALSE)

and when I compare the AIC criterion, the lowest one is for the model m4:
     m0              m1           m2            m3            m4
11086.51 10948.72 10828.75 10830.75 10720.43

However, in the summary I see that there is a strong correlation between random effects and associated variances are huge:
Random effects:
Groups   Name        Variance Std.Dev. Corr
 Species  (Intercept) 21.48    4.635
          dB1         11.25    3.355    -1.00
Residual              6.19    2.488
Number of obs: 2221, groups:  Species, 598

For m3, random effect associated with  intercept has very low variance and residual variance is only  a bit higher:
Random effects:
Groups    Name        Variance  Std.Dev.
 Species   (Intercept) 3.419e-14 1.849e-07
Species.1 dB1         7.968e-01 8.927e-01
Residual              6.327e+00 2.515e+00
Number of obs: 2221, groups:  Species, 598

I am tempted to take into account only the randon effect associated with the slope however I don't know if i can do this considering that the AIC is not the lowest one for this model and how to justify it in my paper?
By the way, I don't really understand why the variances associated with the random effects change so much.
I have tried to center the regressor dB1 which removed the correlation between fixed effects and changed the sign of correlation but random effects remain strongly correlated and variances large:

Random effects:
Groups   Name        Variance Std.Dev. Corr
Species  (Intercept)  1.109   1.053
          dB1c        11.255   3.355    0.94
Residual              6.190   2.488
Number of obs: 2221, groups:  Species, 598

Could you please give me some hint to solve my problem? Thanks a
lot in advance

Jana



        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models