Skip to content

Error in phylogenetic ordinal model with MCMCglmm()

8 messages · Dexter Locke, Jarrod Hadfield, Paul Buerkner +2 more

#
Dear list,

I'm using MCMCglmm to run a phylogenetic model where the response is a
3-level ordinal factor (i.e. level 2 is an intermediate phenotype
between 1 and 3), and the predictors include one factorial (foraging
habitat), one ordinal (trophic level), and several continuous
variables.

As far as I know MCMCglmm is the only package that can handle logistic
models for phylogenetically structured multi-level discrete data, but
please correct me if that's not the case.

My problem right now is that I can't get MCMCglmm() to work with the
'family' argument set to "ordinal", although it does work with
"categorical".
Here's the code I'm using:
[1] ?2.25?
[1] "R version 3.4.3 (2017-11-30)"

## model specifications:
## categorical model (unordered response) - runs to completion
+                random = ~ us(trait):Binomial,
+                rcov = ~ us(trait):units,
+                prior = list(R = list(fix=1, V=(1/k) * (I + J), n = k-1),
+                             G = list(G1 = list(V = diag(k-1), n = k-1))),
+                ginverse = list(Binomial=INphylo$Ainv),
+                burnin = 300000,
+                nitt = 3000000,
+                thin = 2000,
+                family = "categorical",
+                data = valid,
+                pl = TRUE)

## ordinal model and error message
+                random = ~ us(trait):Binomial,
+                rcov = ~ us(trait):units,
+                prior = list(R = list(fix=1, V=1, n = k-1),
+                             G = list(G1 = list(V = diag(k-1), n = k-1))),
+                ginverse = list(Binomial=INphylo$Ainv),
+                burnin = 300000,
+                nitt = 3000000,
+                thin = 2000,
+                family = "ordinal",
+                data = valid,
+                pl = TRUE)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

## the shape of the data
'data.frame': 1399 obs. of  16 variables:
 $ Binomial           : chr  "Abrocoma_bennettii" "Abrothrix_andinus"
"Abrothrix_jelskii" "Abrothrix_longipilis" ...
 $ Response           : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
 $ ForagingHab        : Factor w/ 7 levels "1","3","4","5",..: 2 2 2 2
2 2 2 2 2 2 ...
 $ Troph_Lev          : Factor w/ 3 levels "1","2","3": 1 2 2 2 2 3 2 1 1 1 ...
 $ Mass               : num  250.5 24.9 34.5 38.9 24.5 ...
 $ Annual.Mean.Temp   : num  12.42 7.26 9.18 9.9 8.62 ...
 $ Mean.Diur.Range    : num  10.46 13.82 16.16 9.15 7.78 ...
 $ Max.Temp.Warmest.M : num  22 16.6 19.1 19.8 17.2 ...
 $ Min.Temp.Coldest.M : num  3.77 -3.98 -3.24 2.21 1.46 ...
 $ Temp.Annual.Range  : num  18.3 20.6 22.4 17.6 15.7 ...
 $ Mean.Temp.Warm.Q   : num  16 9.2 11.1 13.8 12.2 ...
 $ Mean.Temp.Cold.Q   : num  8.87 4.49 6.39 5.98 4.87 ...
 $ Annual.Precip      : num  166 645 558 903 1665 ...
 $ Precip.Driest.Month: num  1.74 5.99 6.31 31.31 104.7 ...
 $ AET                : num  213 482 704 455 361 ...
 $ PET                : num  1074 1242 1305 677 638 ...


I don't understand what factors the error refers to, because there
sufficient levels in the response even if one is absorbed in the
intercept.

The R-constraint in the prior is specified as suggested in the
MCMCglmm tutorial (fix=1, V=1), but the error message is the same
whether I use this specification or the categorical model
specification (fix=1, V=(1/k)*(I + J)) .

On a side note - what parameters affect the acceptance rates? The
categorical models maintain a rate of around 0.3 so I think the mixing
could be improved.

Any input would be very much appreciated.

Many thanks,
Roi Maor
#
Maybe Response needs to be an ordered factor, not a factor with three levels. Try something like

valid$Response <- as.factor(valid$Response, ordered=T)

See also th clmm package. 

HTH, Dexter
#
Hi, 

There is only one ?trait? in an ordinal model, so the terms with trait in are redundant and causing the problem. You should get rid of all trait and us(trait): terms. With family=?categorical? there are two traits with 3 categories, one for each contrast. Note that is  using family=?threshold? will be more efficient than using ?ordinal? although they are almost the same model.

Cheers,

Jarrod

  
    
#
If it doesn't end up working in MCMCglmm, you may also try the brms
package.

See
https://cran.r-project.org/web/packages/brms/vignettes/brms_phylogenetics.html
for an introduction of
phylogenetic models in brms.

2018-08-02 15:39 GMT+02:00 Dexter Locke <dexter.locke at gmail.com>:

  
  
#
Hi,

Contrary to "categorical", the "ordinal" family is not multivariate on the latent scale, so you need to drop everything related to "trait" in the ordinal model and modify the prior to be univariate as well.

I think MCMCglmm is complaining here because "trait" does not have more than one level due to this.

Hope this helps,
Pierre.

Le jeudi 2 ao?t 2018, 15:24:55 CEST roee maor a ?crit :
#
Thanks everyone for your quick and helpful responses.
Removing the 'trait' terms indeed solved this problem and I have taken
Jarrod's advice to use family="threshold" instead of "ordinal".

Would it be possible to get a quick summary of the main differences
between ordinal and threshold models? Both seem to be rarely used and
I can't find much documentation on their inner workings in the
MCMCglmm vignette, tutorial or course notes.

Cheers,
RoiOn Thu, 2 Aug 2018 at 14:45, Paul Buerkner <paul.buerkner at gmail.com> wrote:
#
Hi,

With three levels you have 4 cut-points, one of which needs estimating (call it cp.est)

cp<-c(-Inf, 0, cp.est, Inf)

If eta is the fixed+random effect linear predictor, Xb+Zu, then the probability of being in category i is

pnorm(cp[i+1], eta , sqrt(Vr)) - pnorm(cp[i], eta, sqrt(Vr))

when family=?threshold?. Vr is the residual (units) variance. When Vr=1 this is standard profit regression. When family=?ordinal? it is:

pnorm(cp[i+1], eta , sqrt(Vr+1)) - pnorm(cp[i], eta, sqrt(Vr+1))

Its not a big deal because the eta cam be rescaled to be equivalent:

eta_ordinal/sqrt(Vr+1) = eta_threshold/sqrt(Vr)

and Vr is fixed.

However, the MCMC algorithm for  family=?threshold? is more efficient.

Cheers,

Jarrod
On 3 Aug 2018, at 12:43, roee maor <roeemaor at gmail.com<mailto:roeemaor at gmail.com>> wrote:
Thanks everyone for your quick and helpful responses.
Removing the 'trait' terms indeed solved this problem and I have taken
Jarrod's advice to use family="threshold" instead of "ordinal".

Would it be possible to get a quick summary of the main differences
between ordinal and threshold models? Both seem to be rarely used and
I can't find much documentation on their inner workings in the
MCMCglmm vignette, tutorial or course notes.

Cheers,
RoiOn Thu, 2 Aug 2018 at 14:45, Paul Buerkner <paul.buerkner at gmail.com<mailto:paul.buerkner at gmail.com>> wrote:
If it doesn't end up working in MCMCglmm, you may also try the brms package.

See https://cran.r-project.org/web/packages/brms/vignettes/brms_phylogenetics.html for an introduction of
phylogenetic models in brms.

2018-08-02 15:39 GMT+02:00 Dexter Locke <dexter.locke at gmail.com<mailto:dexter.locke at gmail.com>>:

Maybe Response needs to be an ordered factor, not a factor with three levels. Try something like

valid$Response <- as.factor(valid$Response, ordered=T)

See also th clmm package.

HTH, Dexter
On Aug 2, 2018, at 9:24 AM, roee maor <roeemaor at gmail.com<mailto:roeemaor at gmail.com>> wrote:
Dear list,

I'm using MCMCglmm to run a phylogenetic model where the response is a
3-level ordinal factor (i.e. level 2 is an intermediate phenotype
between 1 and 3), and the predictors include one factorial (foraging
habitat), one ordinal (trophic level), and several continuous
variables.

As far as I know MCMCglmm is the only package that can handle logistic
models for phylogenetically structured multi-level discrete data, but
please correct me if that's not the case.

My problem right now is that I can't get MCMCglmm() to work with the
'family' argument set to "ordinal", although it does work with
"categorical".
Here's the code I'm using:

packageVersion("MCMCglmm")
[1] ?2.25?
R.version.string
[1] "R version 3.4.3 (2017-11-30)"

## model specifications:
INphylo <- inverseA(mammaltree, nodes="ALL", scale=TRUE)  ## phylogeny with 1399 tips, setting nodes="TIPS" is extremely slow
k <- length(levels(valid$Response))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))

## categorical model (unordered response) - runs to completion
m1 <- MCMCglmm(Response ~ -1 + trait + ForagingHab + Troph_Lev + Mass + Mean.Diur.Range + Max.Temp.Warmest.M + Temp.Annual.Range + Precip.Driest.Month + PET,
+                random = ~ us(trait):Binomial,
+                rcov = ~ us(trait):units,
+                prior = list(R = list(fix=1, V=(1/k) * (I + J), n = k-1),
+                             G = list(G1 = list(V = diag(k-1), n = k-1))),
+                ginverse = list(Binomial=INphylo$Ainv),
+                burnin = 300000,
+                nitt = 3000000,
+                thin = 2000,
+                family = "categorical",
+                data = valid,
+                pl = TRUE)

## ordinal model and error message
m2 <- MCMCglmm(Response ~ -1 + trait + ForagingHab + Troph_Lev + Mass + Mean.Diur.Range + Max.Temp.Warmest.M + Temp.Annual.Range + Precip.Driest.Month + PET,
+                random = ~ us(trait):Binomial,
+                rcov = ~ us(trait):units,
+                prior = list(R = list(fix=1, V=1, n = k-1),
+                             G = list(G1 = list(V = diag(k-1), n = k-1))),
+                ginverse = list(Binomial=INphylo$Ainv),
+                burnin = 300000,
+                nitt = 3000000,
+                thin = 2000,
+                family = "ordinal",
+                data = valid,
+                pl = TRUE)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels

## the shape of the data
str(valid)
'data.frame': 1399 obs. of  16 variables:
$ Binomial           : chr  "Abrocoma_bennettii" "Abrothrix_andinus"
"Abrothrix_jelskii" "Abrothrix_longipilis" ...
$ Response           : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ ForagingHab        : Factor w/ 7 levels "1","3","4","5",..: 2 2 2 2
2 2 2 2 2 2 ...
$ Troph_Lev          : Factor w/ 3 levels "1","2","3": 1 2 2 2 2 3 2 1 1 1 ...
$ Mass               : num  250.5 24.9 34.5 38.9 24.5 ...
$ Annual.Mean.Temp   : num  12.42 7.26 9.18 9.9 8.62 ...
$ Mean.Diur.Range    : num  10.46 13.82 16.16 9.15 7.78 ...
$ Max.Temp.Warmest.M : num  22 16.6 19.1 19.8 17.2 ...
$ Min.Temp.Coldest.M : num  3.77 -3.98 -3.24 2.21 1.46 ...
$ Temp.Annual.Range  : num  18.3 20.6 22.4 17.6 15.7 ...
$ Mean.Temp.Warm.Q   : num  16 9.2 11.1 13.8 12.2 ...
$ Mean.Temp.Cold.Q   : num  8.87 4.49 6.39 5.98 4.87 ...
$ Annual.Precip      : num  166 645 558 903 1665 ...
$ Precip.Driest.Month: num  1.74 5.99 6.31 31.31 104.7 ...
$ AET                : num  213 482 704 455 361 ...
$ PET                : num  1074 1242 1305 677 638 ...


I don't understand what factors the error refers to, because there
sufficient levels in the response even if one is absorbed in the
intercept.

The R-constraint in the prior is specified as suggested in the
MCMCglmm tutorial (fix=1, V=1), but the error message is the same
whether I use this specification or the categorical model
specification (fix=1, V=(1/k)*(I + J)) .

On a side note - what parameters affect the acceptance rates? The
categorical models maintain a rate of around 0.3 so I think the mixing
could be improved.

Any input would be very much appreciated.

Many thanks,
Roi Maor

_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20180803/3973117b/attachment.ksh>
#
That's great Jarrod,  thanks very much!
Roi
On Fri, 3 Aug 2018 at 12:58, HADFIELD Jarrod <j.hadfield at ed.ac.uk> wrote: