Skip to content

Multi-response MCMCglmm (Gaussian and zapoisson)

3 messages · Susanne Lachmuth, Jarrod Hadfield

#
Dear MCMCglmm users,

I am currently struggling with the specification of a proper prior and model formula for a multi-response MCMCglmm with two of the three response variables being Gaussian and the third being za-poisson. The model includes several fixed effects and three nested random effects.

In general, I would prefer to fit a model with a fixed effect of trait and suppressed intercept for getting trait specific intercepts. Further, I wish to measure covariances between the response variables and consequently to specify completely parameterized covariance matrices. 

My current model looks like this:

prior<-list(R=list(V=diag(4),nu=0.004),G=list(G1=list(V=diag(4)/4,n=4),G2=list(V=diag(4)/4,n=4),G3=list(V=diag(4)/4,n=4)))

m1<-MCMCglmm(fixed=cbind(resp1,resp2,resp3) ~
trait:(expl1 + expl2+ expl3+ expl4+ expl5?.)+trait-1,
random=~us(trait):rand1+us(trait): rand1: rand2+us(trait): rand1: rand2: rand3,
rcov=~us(trait):units,family=c("gaussian","gaussian","zapoisson"),nitt=20000,burnin=5000,thin=10,prior=prior,data=dat)

However, for zero-altered models it is recommended in the MCMCglmm course notes to constrain over-dispersion to be the same for both processes (the zero-alteration and poisson process) by using a trait by unit interaction in the R-structure. Additionally, the intercept should not be suppressed for getting the differences between the zero-altered regression coefficients and the Poisson regression coefficients. This allows identification of zero-inflation or zero-deflation in response to the explanatory variables.

I therefore fit an additional model for the zapoisson response only, looking like this:

prior2<-list(R=list(V=diag(1)/4,nu=0.002),G=list(G1=list(V=diag(1)/4,nu=0.002),G2=list(V=diag(1)/4,nu=0.002),G3=list(V=diag(1)/4,nu=0.002)))

m2<-MCMCglmm(fixed=resp3 ~
trait:(expl1 + expl2+ expl3+ expl4+ expl5?.)+trait,
 random=~trait:rand1+trait: rand1: rand2+ trait: rand1: rand2: rand3,
rcov=~trait:units, family="zapoisson",nitt=20000,burnin=1000,thin=10,data=dat, prior=prior2)

The results show that there is ?significant? zero-inflation and deflation in response to some variables.

My main questions are:

Are the two priors specified correctly?
Does it make sense to include the zapoisson response (resp3) in model m1 and is the model formula (in particular the R-structure) appropriate?
An alternative might be to analyze resp1 and resp2 (both size variables of plants) with model m1 and fit an extra model (like m2) for resp 3 (reproduction) with the size parameters (i.e. responses of m1) as covariates. We are interested in a trade-off of growth and reproduction.


Any help would be greatly appreciated.

Many thanks,

Susanne Lachmuth

-------------------------------
Susanne Lachmuth
Department of Plant Ecolgy
University of Halle-Wittenberg
Germany
#
Hi Susanne,

A multi-response model with two Gaussian and one zero-altered Poisson  
(ZAP) responses cannot be currently fitted in MCMCglmm due to  
restrictions on the R-structure. The residual variance of the ZA part  
of the ZAP cannot be estimated (it is a binary variable) and so needs  
to be set to some value, or as you indicate set to the same residual  
variance as the P part. The covariance between the ZA and the P part  
is also non-identifiable from the data (no datum can be both zero and  
non-zero) and it is customary to set it to zero. A fully parameterised  
residual (co)variance matrix for the model should ideally look like:

   V(1)  C(1,2) C(1,3) C(1,4)
  C(1,2)  V(2)  C(2,3) C(2,4)
  C(1,3) C(2,3)  V(3)    0
  C(1,4) C(2,4)   0    V(3)

Currently this is not possible, and the "us" variance function that  
you use estimates V(4) and C(3,4) for which all the information is  
coming from the prior. You can probably go one step further and  
restrict V(4)=1 using parameter expanded priors, but you are still  
left with estimating C(3,4). There may well be transformations that  
give interpretable marginal distributions that are independent of  
C(2,3) and V(4) but I do not know them.

The (co)variance matrices for the random effects can be fully estimated.

More generally, these models are VERY complex and you would need a lot  
of data, with lots of replication at the appropriate levels.  
Personally, I would try and come up with a simple, biologically  
plausible model first and see how you go from there. The priors are  
likely to be informative and in the case of the residual (co)variance  
matrix are improper. Remember that the marginal prior on an individual  
variance in a covariance matrix has degree of belief parameter  
nu-dim(V)+1 which for your random effect covariance matrices is 1  
(informative) and -2.996 for your residual covariance matrix (improper).

m2 looks more reasonable, but again has many fixed and random effects  
so I would worry whether the data-set is large enough and well  
structured enough to estimate all parameters with reasonable  
precision. You may want to consider parameter expanded priors in this  
case. Note also, that in the first set of CourseNotes with a  
zero-altered Poisson section (v2.05-v2.08, online from Aug-Nov 2010) I  
made a mistake and said that if a model of the form

trait:(expl1 + expl2+ expl3+ expl4+ expl5?.)+trait

is fitted then the significant "za" terms indicate zero-deflation/inflation.

This was wrong and is corrected in subsequent updates (v2.12 is  
current). The "za" terms in teh parameterisation:

trait*(expl1 + expl2+ expl3+ expl4+ expl5?.)

indicate zero-deflation/inflation when significant. The model is the  
same but the interpretation is different. I'm sorry for this.

Cheers,

Jarrod



Quoting Susanne Lachmuth <susanne.lachmuth at botanik.uni-halle.de>:

  
    
2 days later
#
Dear Jarrod,

thank you very much for the immidiate answer and very helpful advice. 

Yes, we are aware that the models are quite compex and probably we have to think about them a little more. 
The data set is not too bad, I hope (909 units for the growth model (m1) and 2457 units for the reproduction model (m2), both well structured). It is individual demographic field data from a large geographic area collected over longer time spans in two years. We therefore need to correct for some things (time, space, plant size, weather...). Actually, some of my models -when it comes to addressing our actual hypotheses- are even more complex at the moment...

I will discuss this with my collaborators...I still might have to come back with one or the other question ...

Thank you very much again!

Best,

Susanne