
Continuous vs. categorical correlated group effects

6 messages · Drager, Andrea Pilar, Ben Bolker

#
Hi All,

I am having trouble running a Bayesian mixed model in MCMCglmm in which
the response variable is measured at the individual level, the random
effect is at the species level (such as "species"), and the model also
includes another species-level continuous variable, such as abundance.
But if the other species-level variable is categorical--whether
because I make it a random effect or because it is in fact
categorical--the model runs! Could someone please explain the stats
behind this?


prior = list(R = list(V = 1, nu = 0, fix = 1),
             G = list(G1 = list(V = 1, nu = 0.002)))

Won't run-->MCMCglmm(binary_individual_response ~ species_abund_continuous,
                      random = ~ species_id_categorical, family = "categorical")

             Error : Mixed model equations singular: use a (stronger) prior


Runs-->MCMCglmm(binary_individual_response ~ 1,
                 random = ~ species_abund_categorical + species_id_categorical,
                 family = "categorical")

Runs-->MCMCglmm(binary_individual_response ~ species_id_categorical,
                 random = ~ species_abund_categorical, family = "categorical")


Thanks in advance!
Andrea Pilar Drager
PhD. student
Ecology and Evolutionary Biology, Rice University
#
Can you show us the summary() of your data?
  Is it possible you have complete separation in your continuous predictor?
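
A rough check for complete separation in a single continuous predictor could be sketched as below. This is only an illustration: has_separation() is a hypothetical helper, not a function from lme4 or MCMCglmm, and it ignores the species random effect, so it can only flag the simplest case, where one threshold on the predictor splits all the 0s from all the 1s.

```r
## Hypothetical helper: TRUE if the predictor values for the 0s and the 1s
## do not overlap at all (complete separation by a single threshold).
has_separation <- function(x, y) {
  stopifnot(length(x) == length(y), any(y == 0), any(y == 1))
  r0 <- range(x[y == 0])   # min and max of the predictor among the 0s
  r1 <- range(x[y == 1])   # min and max of the predictor among the 1s
  r0[2] < r1[1] || r1[2] < r0[1]
}

## Toy inputs (made up for illustration)
sep_yes <- has_separation(c(1, 2, 3, 4), c(0, 0, 1, 1))  # 0s all below 1s
sep_no  <- has_separation(c(1, 2, 3, 4), c(0, 1, 0, 1))  # ranges overlap
```

With the real data this would presumably be called on the abundance and response columns shown in the summary, e.g. has_separation(flor_data$species_abund, flor_data$binary_individual_response).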
On 18-01-02 02:38 PM, Drager, Andrea Pilar wrote:
#
summary(flor_data)

  species_id         binary_individual_response

  Length:29609       Min.   :0.00000
  Class :character   1st Qu.:0.00000
  Mode  :character   Median :0.00000
                     Mean   :0.06018
                     3rd Qu.:0.00000
                     Max.   :1.00000

   species_abund
   Min.   :  11.23
   1st Qu.:1996.23
   Median :2548.23
   Mean   :3438.20
   3rd Qu.:5310.23
   Max.   :6116.23


The following is also the case:

Won't run-->glmer(binary_individual_response ~ species_abund + (1|species_id),
                  family=binomial(link='logit'))

Runs-->glm(binary_individual_response ~ species_abund + species_id,
           family=binomial(link='logit'))


Andrea Pilar Drager
PhD. student
Ecology and Evolutionary Biology, Rice University
#
The first thing I would try is rescaling your abundance value.  The
second is to tell us *exactly* what error messages
you get when you run glmer.  Also, how many species do you have?

===
fake_data <- data.frame(
   species_id = rep(outer(LETTERS,LETTERS,paste,sep="/"),40),
   stringsAsFactors=FALSE)
nn <- nrow(fake_data)
set.seed(101)
fake_data$resp <- rbinom(nn,prob=0.06,size=1)
fake_data$abund <- rlnorm(nn,meanlog=log(2500),
                          sdlog=0.75)

library(lme4)
g1 <- glmer(resp ~ abund +(1|species_id),data=fake_data,
      family=binomial(link='logit'))

## produces a fit, but lots of warnings.

fake_data$sc_abund <- scale(fake_data$abund)

update(g1, . ~ . - abund + sc_abund)

## The glm works on the first 1000 rows, but is very slow for the
## whole data set (I may have invented too many species)


On Tue, Jan 2, 2018 at 5:21 PM, Drager, Andrea Pilar
<andrea.p.drager at rice.edu> wrote:
#
Dear Ben,

Thank you very much! I have only 23 species, and yes, you are right  
that the glmer model did run before but with lots of warnings:
Warning messages:
1: Some predictor variables are on very different scales: consider rescaling
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
   Model is nearly unidentifiable: very large eigenvalue
  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
  - Rescale variables?
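
One way to see why the scale of the predictor matters is to compare the conditioning of a simple design matrix before and after rescaling. The sketch below is only an analogy: the warnings above come from a check on the Hessian of the fitted model, not from this calculation, and the abundances are simulated (a lognormal with the same parameters as the fake data earlier in the thread), not the real ones.

```r
## Simulated abundances on roughly the same scale as the real data (assumption)
set.seed(101)
abund <- rlnorm(1000, meanlog = log(2500), sdlog = 0.75)

## Design matrices: intercept plus raw or centred/scaled abundance
X_raw <- cbind(1, abund)
X_sc  <- cbind(1, as.numeric(scale(abund)))

## Condition number (ratio of largest to smallest singular value) of X'X
k_raw <- kappa(crossprod(X_raw), exact = TRUE)
k_sc  <- kappa(crossprod(X_sc),  exact = TRUE)

c(raw = k_raw, scaled = k_sc)
```

The raw version should come out many orders of magnitude worse conditioned than the scaled one, which is the same kind of "large eigenvalue ratio" the warning is complaining about.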

I log-transformed my abundance variable and now both the glmer and  
MCMCglmm models run fine, with no warnings.






Andrea Pilar Drager
PhD. student
Ecology and Evolutionary Biology, Rice University
#
That's fine.

  Note that linearly scaling your predictor variable (e.g. subtracting
the mean and scaling by the standard deviation, which is what scale()
does) changes only the parameterization and not the underlying
definition of the model (e.g. the likelihood and any inferences drawn
from the model will be the same).  In contrast, log-transforming the
predictor changes the meaning of the model -- it might be a more
sensible model, but it will be different from the original model.
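
The distinction can be checked numerically. The following is a minimal sketch on simulated data, using a plain glm() without the species random effect to keep it self-contained; the variable names and the simulation parameters are made up for illustration and are not taken from the real data set.

```r
## Simulated binary response that depends linearly on abundance (assumption)
set.seed(101)
n     <- 5000
abund <- rlnorm(n, meanlog = log(2500), sdlog = 0.75)
resp  <- rbinom(n, size = 1, prob = plogis(-3 + 0.0003 * abund))

m_raw <- glm(resp ~ abund,        family = binomial)  # original scale
m_sc  <- glm(resp ~ scale(abund), family = binomial)  # linear rescaling
m_log <- glm(resp ~ log(abund),   family = binomial)  # nonlinear transform

ll_raw <- as.numeric(logLik(m_raw))
ll_sc  <- as.numeric(logLik(m_sc))
ll_log <- as.numeric(logLik(m_log))

## Linear rescaling: same maximized log-likelihood and fitted values;
## the slope just changes units (divide by sd(abund) to get back)
same_fit   <- isTRUE(all.equal(ll_raw, ll_sc, tolerance = 1e-6))
slope_back <- unname(coef(m_sc)[2]) / sd(abund)

## Log transform: a different model, so the log-likelihood differs
c(raw = ll_raw, scaled = ll_sc, logged = ll_log)
```

Here same_fit should be TRUE and slope_back should match coef(m_raw)[2], while ll_log is a different number because the logged model describes a different relationship between abundance and the response.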

  cheers
    Ben Bolker
On 18-01-03 08:07 PM, Drager, Andrea Pilar wrote: