
glmer and nAGQ

6 messages · Peter R Law, Ben Bolker

#
I am re-sending this query that I originally emailed June 13'th. I did not receive a copy of the sent email as I have with previous postings and my query doesn't appear in the archive of postings so does not seem to have been received.

Peter R Law

Sent with [ProtonMail](https://protonmail.com/) Secure Email.

------- Original Message -------
On Monday, June 14th, 2021 at 10:48 PM, Peter R Law <prldb at protonmail.com> wrote:

#
This is kind of hard to read, but I'll try.

   First of all, log-likelihoods with nAGQ>1 (adaptive Gauss-Hermite 
quadrature) are **not currently commensurate** with log-likelihoods with 
nAGQ==1 (Laplace approximation). This is hinted at in the "Deviance and 
log-likelihood of GLMMs" section of the lme4 documentation.

   In any case, the log-likelihoods computed under different numbers of 
quadrature points shouldn't be compared as though they came from different 
statistical models (via AIC, likelihood ratio tests, etc.). Rather, they are 
based on **different approximations** to the same model, and we know 
that larger numbers of quadrature points (increasing nAGQ) give *more 
accurate* (although slower) approximations.  The main thing is to see 
whether the *results* (parameter estimates etc.) change as nAGQ increases 
by a magnitude that matters for the current analysis ... if they do, 
then you should use an nAGQ that is large enough that it approximates 
"infinity", i.e. increasing nAGQ further doesn't change the answers by 
an important amount.
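To make the convergence point concrete, here is a small base-R sketch (toy parameter values, not from this thread) that approximates the marginal likelihood of a single Gamma(log-link) observation with a Gaussian random intercept via plain (non-adaptive) Gauss-Hermite quadrature. glmer's nAGQ uses the *adaptive* version (nodes centered and scaled at the conditional mode), but the intuition that more nodes give a better approximation is the same.

```r
## Gauss-Hermite nodes/weights via the Golub-Welsch algorithm (base R only).
## All parameter values below are made-up illustration numbers.
gauss_hermite <- function(n) {
  J <- matrix(0, n, n)                 # Jacobi matrix for Hermite polynomials
  if (n > 1) {
    b <- sqrt(seq_len(n - 1) / 2)      # off-diagonal recurrence coefficients
    J[cbind(1:(n - 1), 2:n)] <- b
    J[cbind(2:n, 1:(n - 1))] <- b
  }
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

## Marginal likelihood of one Gamma(log link) observation y with a
## N(0, sigma^2) random intercept u: L(y) = integral of
## dgamma(y; shape, rate = shape/exp(beta0 + u)) * dnorm(u, 0, sigma) du,
## approximated with the substitution u = sqrt(2) * sigma * z.
marg_lik <- function(y, beta0, sigma, shape, n) {
  gh <- gauss_hermite(n)
  mu <- exp(beta0 + sqrt(2) * sigma * gh$nodes)
  sum(gh$weights * dgamma(y, shape = shape, rate = shape / mu)) / sqrt(pi)
}

## Reference value from numerical integration:
exact <- integrate(function(u)
  dgamma(2, shape = 2, rate = 2 / exp(0.5 + u)) * dnorm(u, 0, 1),
  -Inf, Inf)$value

## Approximation for an increasing number of quadrature points:
approx_n <- sapply(c(1, 2, 5, 10, 25),
                   function(n) marg_lik(y = 2, beta0 = 0.5, sigma = 1,
                                        shape = 2, n = n))
```

Printing `abs(approx_n - exact)` shows the error shrinking as the number of nodes grows, which is exactly the "increase nAGQ until the answers stop changing" check described above.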

   In general, if it is not important to you to use the inverse link (for 
interpretation/because you think this will be the natural scale on which 
to measure effects of continuous predictors and/or interactions), 
Gamma(link="log") is more robust.
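A minimal illustration of why the log link tends to be the safer choice (toy values, not from the thread): with the canonical inverse link the mean is 1/eta, so any linear predictor eta <= 0 encountered during optimization gives a negative or undefined mean, while the log link maps every eta to a positive mean.

```r
## Toy linear-predictor values spanning zero:
eta <- seq(-2, 2, by = 0.5)
mu_log <- exp(eta)   # log link: mean is positive for every eta
mu_inv <- 1 / eta    # inverse (canonical) link: negative or Inf for eta <= 0
```

This is why fits with the canonical Gamma link can blow up mid-optimization even when the final answer would be sensible.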

   The "rescale variables?" suggestion is just that; if the predictors 
are already rescaled, then it's not going to help.  A large eigenvalue 
may not actually be a problem; it just indicates the *possibility* of 
numerical instability. In this case my guess is that the very small 
random-effects variance might be responsible.
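For completeness, a base-R sketch of what "rescale variables" means in practice. The data frame and its values here are made up; `normRain` is just the predictor named later in this thread.

```r
## Toy data: a continuous predictor on an arbitrary scale.
dat <- data.frame(normRain = rnorm(100, mean = 50, sd = 20))

## Center and scale to mean 0, SD 1 -- this is what the "rescale
## variables?" warning is suggesting.
dat$normRain_s <- as.numeric(scale(dat$normRain))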

  It's also a little surprising that the normRain coefficient is very 
small in magnitude and yet has a very large Z-statistic, given this 
sample size.

  The results with increasing nAGQ do indeed look kind of wonky.  If this 
were my problem I would want to look more carefully at the data/think 
about where they came from.  How do the model diagnostics look?
On 6/18/21 3:52 PM, Peter R Law via R-sig-mixed-models wrote:
#
Thanks for your response. I'm sorry the formatting of the R output got messed up and became hard to read when I forwarded the original message. I do understand that changing the value of nAGQ changes the approximation of the likelihood, not the model itself. I wondered whether the change in log-likelihood from -643.8 (n=1) to -5.9 (n=2) to -Inf (n=3) was a reason to suspect the model was badly behaved, in the sense that none of these approximations of the logLik should be trusted. The parameter estimates for the fixed effects don't change that much; basically just the logLik, and consequently the AIC, do.

I wanted to compare the glmer fit with a Gamma distribution to the lmer fit to see whether AIC favoured one over the other. My intent was to start with the glmer at the default n=1, increase n until the model parameters seemed stable, and then use that n value for the glmer model to obtain a logLik and then an AIC to compare to the lmer version. But the dramatic difference in logLik in going from n=1 to n=2, and the negatively infinite value already at n=3, made me think that none of these logLiks should be trusted and that the Gamma model should simply be abandoned. Does that sound plausible in light of the warning messages? In particular, is the negative-infinity answer a failure of the estimation procedure rather than an indication that the Gamma model has zero likelihood?

I will repeat the analyses with the log link. As far as the data are concerned, they're simulated data I created without using any distributional assumptions, but rather based on ecological assumptions about how the response might depend on the fixed predictors. My purpose here is to explore the glmer function in preparation for a study with real data (still being collected). Because the responses are positive values, I thought it would be worth investigating how a Gamma glmer compares with an lmer (and comparing both with a log-normal model, after adjusting the latter so that its AIC is comparable to the others, as was pointed out in a previous posting).
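Assuming the "adjustment" referred to above is the usual change-of-variables (Jacobian) correction for a model fitted to log(y), a base-R sketch with toy data (values invented for illustration):

```r
## Toy log-normal response:
set.seed(101)
x <- runif(100)
y <- exp(0.5 + 1.2 * x + rnorm(100, sd = 0.3))

## Fitting lm(log(y) ~ x) gives a log-likelihood (and AIC) on the log(y)
## scale. Since f_Y(y) = f_Z(log y) / y with Z = log(Y), the original-scale
## log-likelihood is logLik(fit) - sum(log(y)), so the comparable AIC adds
## the Jacobian term 2 * sum(log(y)).
fit_ln <- lm(log(y) ~ x)
AIC_ln_comparable <- AIC(fit_ln) + 2 * sum(log(y))
```

After adding `2 * sum(log(y))`, the log-normal model's AIC is on the same (original-y) scale as Gamma or Gaussian fits of `y` itself and can be compared directly.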

Peter


Sent with ProtonMail Secure Email.

------- Original Message -------
On Friday, June 18th, 2021 at 5:44 PM, Ben Bolker <bbolker at gmail.com> wrote:

#
On 6/18/21 6:39 PM, Peter R Law wrote:
As I suggested in my answer, you shouldn't worry about the big jump 
in log-likelihood from nAGQ=1 to nAGQ=2. I would try n=5, 10, 15, 20 and 
see how it goes (unless n=20 is too slow to be practical). (Yes, I think 
the -Inf is a problem with the fitting procedure.)
Fair enough; "without any distributional assumptions" is a bit odd 
though.  Unless your ecological model is stochastic (in which case the 
conditional distribution will emerge naturally from the model), you have 
to make *some* assumptions about how the observations will vary around 
the expected mean ...

   cheers
    Ben Bolker
5 days later
#
Thanks again for your feedback. Much appreciated. With the canonical link in the Gamma model, once a negative-infinity result is returned for the logLik (n=4 for my simulation data), increasing the value of nAGQ doesn't change that result, and the warning message about the estimated variance-covariance matrix not being positive definite persists as well.

If I use the log link, there is a large increase (-678.1 to -5.6) in the LogLik in going from the default value of n=1 to n=2, but it then doesn't change for larger n values and the model results are stable.

So I assume that the Gamma model with the canonical link cannot be fit to these data, while with the log link one does get a robust result?

There was no detectable delay in the time to run the model with nAGQ = 25 versus the default value. Is that suspicious?

Peter



Sent with ProtonMail Secure Email.

------- Original Message -------
On Friday, June 18th, 2021 at 8:16 PM, Ben Bolker <bbolker at gmail.com> wrote:

#
It's hard to make many more conclusions without digging into the 
particular data set.

   It might be worth comparing results from other packages, although 
GLMMadaptive is the only other package that I'm aware of that 
robustly/flexibly provides adaptive Gauss-Hermite quadrature.

   I also note lots of 'singular fit' messages below; that doesn't mean 
the fits are wrong, but does mean that you might need to consider 
whether you have enough signal to estimate the random effect variance ...
On 6/23/21 10:39 PM, Peter R Law wrote: