Convergence issues running clmm in ordinal package

Hi all,

I have a data set containing around 8148 individuals nested within approximately 2548 areas (DZID). I have an ordinal response (Num20) with 5 categories (0=None, 1=1 to 3, 2=4 to 11, 3=12 to 19, 4=20 and over) and have been trying to fit multilevel models using the clmm() function in the ordinal package to examine the statistical significance of individual (e.g. Sex, Car, Age, Limitill, Nssec3) and area level predictors (e.g. TotalPA) that I have.

The data looks like this:

DZID Sex Car Age Limitill Nssec3 TotalPA Num20
2688   1   1  44        1      1       2     4
2688   2   1  42        3      2       2     4
2692   1   1  77        1      1       1     0
2692   1   1  57        3      1       1     4
2692   2   1  52        3      1       1     4
2692   2   1  16        3     99       1     4
2672   1   2  28        2      1       4     4
2692   1   2  86        1      1       1     0
 864   1   1  22        3      1       0     4
 864   2   1  21        2      3       0     4

etc.

summary(Num20)
   0    1    2    3    4 NA's
2103 1009 1895  869 2244   28

summary(DZID)
    689    1634    2376    1598       4    1681    1760     683     906    2521      34     698
     29      28      26      25      19      19      19      18      18      16      15      15
   1173    2108    2272     430    1263    1269    1456    1538      17      40     284    1630
     14      14      14      13      13      13      13      13      12      12      12      12
   2202    2595      31    1164    1340    1775    2146    2502     359     605    1305    1319
     12      12      11      11      11      11      11      11      10      10      10      10
   1354    1396    1606    1960    1969    2014    2063    2214    2228    2459     115     644
     10      10      10      10      10      10      10      10      10      10       9       9
    717     726     843    1003    1027    1470    1642    1748    1896    2160    2227    2360
      9       9       9       9       9       9       9       9       9       9       9       9
   2423    2438    2439    2555    2601     111     199     254     258     321     331     428
      9       9       9       9       9       8       8       8       8       8       8       8
    459     490     583     604     775     919     968    1049    1123    1144    1309    1525
      8       8       8       8       8       8       8       8       8       8       8       8
   1604    1667    1688    1725    1802    1804    1830    1876    1889    1903    1922    1991
      8       8       8       8       8       8       8       8       8       8       8       8
   2053    2128 (Other)    NA's
      8       8    6869     212

I first of all used clm() to assess the association between individual level variables and the ordinal response and had no problem using this function. I then tried to fit a simple random intercept only ordinal multilevel model with DZID as a random effect to assess whether or not there is significant area level variability in the model. Unfortunately I experienced convergence issues:

mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit)
Warning message:
clmm may not have converged:
 optimizer 'ucminf' terminated with max|gradient|: 0.000501749472956048

Observe that this is a warning and not an error message; it also says
that clmm *may* not have converged: whether the optimizer terminated
close enough to the optimum is essentially up to you. The reason you
get the warning is that 5e-4 is larger than 1e-5, which is the
default maximum absolute gradient criterion (the grtol control option
in the ucminf optimizer). However, 5e-4 should be small enough for
most applications, so I would trust the results in this case.

If you change the optimizer and use, e.g. method = "nlminb" or "optim",
I expect you will get essentially the same parameter estimates. You could
also (using the default ucminf optimizer) tighten the maximum absolute
gradient convergence criterion by appending
control=clmm.control(grtol=1e-6) to your clmm call and see if it gets
closer to the optimum.

The main message is that you probably do not need to worry in this
case, but if you do, there are control options you can change.
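
The two checks described above can be sketched as follows. This is only an illustration under stated assumptions: the data from this thread are not available, so the wine data shipped with ordinal stand in for them, and the calls use clmm2() and clmm2.control(), on the assumption that these are the names under which later ordinal releases provide the interface called clmm() and clmm.control() in this thread. All object names are made up for the example.

```r
library(ordinal)  # assumes a release of ordinal that provides clmm2()

# Stand-in model (NOT the thread's data): random intercept for judge.
fit.default <- clmm2(rating ~ temp + contact, random = judge, data = wine)

# Same model, with a tighter gradient criterion passed on to ucminf.
fit.tight <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                   control = clmm2.control(grtol = 1e-6))

# Same model, fitted with a different optimizer.
fit.nlminb <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                    control = clmm2.control(method = "nlminb"))

# If the default fit stopped close enough to the optimum, the three fits
# should agree to several decimal places.
print(c(default = fit.default$logLik,
        tight   = fit.tight$logLik,
        nlminb  = fit.nlminb$logLik))
print(c(default = fit.default$stDev,
        tight   = fit.tight$stDev,
        nlminb  = fit.nlminb$stDev))
```

If the log-likelihoods and random-effect standard deviations barely move between the three fits, the original warning can reasonably be read as harmless.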

summary(mod.a)
Cumulative Link Mixed Model fitted with the Laplace approximation

Call:
clmm(location = Num20 ~ 1, random = DZID, na.action = na.omit)

Random effects:
           Var   Std.Dev
DZID 0.2638869 0.5136992

No location coefficients

No scale coefficients

Threshold coefficients:
    Estimate Std. Error z value
0|1  -1.1123        NaN     NaN
1|2  -0.5100        NaN     NaN
2|3   0.5009        NaN     NaN
3|4   1.0096        NaN     NaN

log-likelihood: -12154.35
AIC: 24318.70
Condition number of Hessian: NaN
(239 observations deleted due to missingness)
Warning message:
In summary.clmm(mod.a) :
 Variance-covariance matrix of the parameters is not defined

If you want standard errors, p-values etc. you should add 'Hess =
TRUE' to your clmm call. (I am aware that a more informative warning
message would be nice.)
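
A minimal sketch of the Hess = TRUE advice, under the following assumptions: the wine data shipped with ordinal replace the thread's data, and clmm2() is used on the assumption that it is the later name of the interface called clmm() here. The object name is hypothetical.

```r
library(ordinal)  # assumes a release of ordinal that provides clmm2()

# Stand-in model (NOT the thread's data). Hess = TRUE asks for the Hessian
# to be computed at the optimum, which summary() needs for standard errors.
fit.hess <- clmm2(rating ~ temp + contact, random = judge, data = wine,
                  Hess = TRUE)

# The coefficient tables should now report standard errors and z values
# instead of NaN.
print(summary(fit.hess))
```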

I experience similar convergence issues if trying to include a fixed effect in the model:

mod.b <- clmm(Num20~Sex, random=DZID, na.action=na.omit)
Warning message:
clmm may not have converged:
 optimizer 'ucminf' terminated with max|gradient|: 0.000355076487956175

summary(mod.b)
Cumulative Link Mixed Model fitted with the Laplace approximation

Call:
clmm(location = Num20 ~ Sex, random = DZID, na.action = na.omit)

Random effects:
           Var   Std.Dev
DZID 0.2642084 0.5140121

Location coefficients:
     Estimate Std. Error z value Pr(>|z|)
Sex2  -0.1616        NaN     NaN       NA

No scale coefficients

Threshold coefficients:
    Estimate Std. Error z value
0|1  -1.2055        NaN     NaN
1|2  -0.6029        NaN     NaN
2|3   0.4096        NaN     NaN
3|4   0.9193        NaN     NaN

log-likelihood: -12146.78
AIC: 24305.56
Condition number of Hessian: NaN
(239 observations deleted due to missingness)
Warning message:
In summary.clmm(mod.b) :
 Variance-covariance matrix of the parameters is not defined
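
Both summaries report a log-likelihood even though the standard errors are NaN, so the Sex effect can be gauged with a likelihood ratio test computed from the two printed values. A rough sketch, assuming that mod.a and mod.b were fitted to the same observations (both summaries report 239 deleted) and that both fits ended close enough to their optima for the comparison to be meaningful; the variable names are made up for the example.

```r
# Log-likelihoods copied from the summaries of mod.a and mod.b.
loglik.a <- -12154.35  # Num20 ~ 1
loglik.b <- -12146.78  # Num20 ~ Sex

# Likelihood ratio statistic; Sex adds one parameter (Sex2), hence 1 df.
lr.stat <- 2 * (loglik.b - loglik.a)
p.value <- pchisq(lr.stat, df = 1, lower.tail = FALSE)

print(c(LR = lr.stat, p = p.value))
```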

I found that someone on the list had experienced a similar problem (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/005119.html) and followed guidance proposed by Rune Haubo. That is, adding an nAGQ argument to the function and/or changing maxIter:

mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10)
Warning message:
clmm may not have converged:
 optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334

and

mod.a <- clmm(Num20~1, random=DZID, na.action=na.omit, nAGQ = 10, control = clmm.control(maxIter = 200,
+ maxLineIter = 200))
Warning message:
clmm may not have converged:
 optimizer 'ucminf' terminated with max|gradient|: 0.000104892430129334

Observe that this is the exact same maximum absolute gradient,
indicating that the optimizer took the same path to the optimum and
that maxIter and maxLineIter never came into play.

I tried several values for maxIter and maxLineIter and still experience these convergence issues.

Am I using clmm() incorrectly? Is there a problem due to the fact that I have such a large number of areas to consider in the model? Is there a limit to the number of higher level units that clmm() can deal with? It may be that the higher level variation is not statistically significant. However, I wanted to assess this in the model as I have area level variables. Is there another ordinal multilevel regression approach that anyone can suggest would be suitable for this analysis?

From what you showed us, I don't think there is anything to worry
about with your data. There is no limit to the number of observations
or random effect levels that clmm can cope with - you may run out of
memory at some point or other things can come into play, but that is
not directly related to clmm. So the number of areas in your data does
not seem to be a problem.

I hope I got around to all your questions, but please follow up if I
missed something or you experience additional issues.

Cheers,
Rune

Any suggestions would be greatly appreciated!

Cheers,
Karen

--
Dr Karen Lamb
Statistician/Career Development Fellow
Neighbourhoods and Health
MRC Social and Public Health Sciences Unit
4 Lilybank Gardens
Glasgow
G12 8RZ

Tel: 0141 357 3949
www.sphsu.mrc.ac.uk

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Rune Haubo Bojesen Christensen

PhD Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark