Skip to content

MCMCglmm 'undefined columns' error

2 messages · Greg Lee, Jarrod Hadfield

#
Dear List,

I am attempting to fit a GLMM using MCMCglmm(), but receive and
'undefined columns' error.

The data are counts of bunches of grapes at bud positions (1:10) on
canes (basal/distal) within vines within (vine type) clones within
(vineyard) sites. Two treatments relating to degree of pruning have
been imposed, and there are 5 replicates at each level combination.
The data are as follows:
site clone treat vine cane bud count
1    M  2051     1    1    b   1     1
2    M  2051     1    2    b   1     2
3    M  2051     1    3    b   1     1
4    M  2051     1    4    b   1     2
5    M  2051     1    5    b   1     1
6    M  2051     2    1    b   1     2
...

with counts taking values 0,1,2 or (very rarely) 3.

Applying a Poisson-based GLM, such as:

glm(count ~ treat + cane + bud, family = poisson, data = budfruit)

suggests serious under-dispersion (Residual deviance: 370.09 on 1184
degrees of freedom), from which I concluded that a multinomial-based
mixed model might be a better option. However, my initial attempt
using MCMCglmm

fit <- MCMCglmm(count ~ site + clone + treat + cane + bud,
                random =~ vine, family = 'multinomial4',
                data = budfruit, verbose = TRUE)

produces the error:

Error in `[.data.frame`(data, , match(response.names[0:nJ + nt],
names(data))) :
  undefined columns selected

Is there a problem with my understanding of the call to MCMCglmm(), or
is this perhaps a bug?

Any insight appreciated. Are there perhaps other packages within which
I could fit this model?

Regards,
Greg
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] MCMCglmm_1.09      gtools_2.5.0-1     combinat_0.0-6     orthopolynom_1.0-1
 [5] polynom_1.3-5      pscl_1.03          mvtnorm_0.9-5      ape_2.3
 [9] coda_0.13-4        Matrix_0.999375-26 lattice_0.17-22    tensorA_0.31
[13] corpcor_1.5.2      MASS_7.2-46

loaded via a namespace (and not attached):
[1] gee_4.13-13 grid_2.9.0  nlme_3.1-91


--------------
Greg Lee
Biometrician
Tasmanian Institute of Agricultural Research
New Town Research Laboratories
University of Tasmania
13 St Johns Avenue,
New Town, 7008
Australia
Ph: ?+613 6233 6858
Fax: +613 6233 6145
#
Hi Greg,

There are two ways of fitting multinomial models when the number of  
observations per unit is 1 (as in your case). You specified  
family="multinomial4" so MCMCglmm is expecting a 4 column response  
variable with the counts in each column. In your case each row should  
contain 3 zeros and 1 one.  The response should be specified something  
like:

cbind(category1,category2,category3,category4)~

Alternatively, you can pass a single column of factors as a response  
and use family="categorical" and keep your original  syntax:

count ~

Both models are parameterised as a log-odds ratio against a baseline  
category. In the first model  the final category is the baseline  
category. I've done this because for binomial traits ("multinomial2")  
the standard glm syntax  is to pass cbind(successes, failures) and the  
results are the log odds of succeeding vs failing (ie failing is the  
base-line category).   In the second model  the first factor level is  
the baseline category.  I've done this because for binary traits it is  
natural to have zero's as the base-line and interpret the parameters  
as predicting 1's (successes).

For your data there will be three response variables in both cases  
({cat1-cat4, cat2-cat4, cat3-cat4} or {cat2-cat1, cat3-cat1, cat4- 
cat1}).   You will want to use the reserved variable "trait" so that  
different intercepts and effects are fitted for the three responses.  
For example count~trait-1 or possibly count~trait+trait:cane-1 etc.

A multivariate error structure must also be defined. You could use:

~ idh(trait):units or ~ us(trait):units

However, for multinomial models with a single count you can't estimate  
these from the data so you may as well fix them.  I recommend fixing  
the residual variances to 1, and the covariances to either 0 or 0.5.   
This is specified in the prior:

prior=list(R=list(V=diag(3), n=1, fix=1), G=Gprior)

where V=diag(3) is  3X3 (number of categories -1) identity matrix and  
fix=1 fixes the matrix at these values. The random effects should also  
have priors which are also passed as a list (Gprior)

Its up to you what type of structure you use for vine. The vine  
variance structure is also 3X3 and you could use any of these options:

~ vine                             # 1 parameter    
v1=v2=v3=c12=c13=c23      Gprior=list(G1=list(V=diag(1), n=1))
~ vine:trait                     # 1 parameter   v1=v2=v3   
c12=c13=c23=0  Gprior=list(G1=list(V=diag(1), n=1))
~ vine+vine:trait           # 2 parameters v1=v2=v3  c12=c13=c23        
Gprior=list(G1=list(V=diag(1), n=1)), G2=list(V=diag(1), n=1))
~idh(vine)                     # 3 parameters  v1 v2 v3     
c12=c13=c23=0  Gprior=list(G1=list(V=diag(3), n=3))
~idh(vine)+vine:trait   # 4 parameters  v1 v2 v3    c12=c13=c23        
Gprior=list(G1=list(V=diag(3), n=3), G2=list(V=diag(1), n=1))
~ us(vine)                     # 6 parameters  v1 v2 v3    c12 c13  
c23          Gprior=list(G1=list(V=diag(3), n=3))

where v1 v2 and v3 are the variances for each logg-odds ratio due to  
vine, and  c12 c13 c23  are the covarainces between them.  I've made  
up the prior (co)variances but I've put in the minimum degree of  
belief parameter (n) that is required to make the priors proper.

One obvious problem is that the multinomial model in this form looses  
the information about the ordering 0<1<2<3.

Cheers,

Jarrod
On 11 May 2009, at 06:00, Greg Lee wrote: