[forwarded to r-sig-mixed-models]
-------- Original Message --------
Subject: RE: [R] errors with hurdle negative binomial mixed effect models
Date: Fri, 9 Aug 2013 19:39:33 +0200
From: Marta Lomas <lomasvega at hotmail.com>
To: Ben Bolker <bbolker at gmail.com>
Thanks Ben! Just two questions before I go through the site you pointed me to:
So if some combinations of the variable categories are missing, meaning that not all the combinations are present in the data set, will it not be possible to run these models?
Please see the FAQ I sent you to for information about dealing with interactions of categorical variables with incomplete coverage.
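One common way to handle incomplete coverage is to collapse the two factors into a single factor that contains only the observed combinations. The sketch below illustrates that idea under stated assumptions; it is not necessarily what the FAQ recommends, and the data frame toy_cov and its columns f1 and f2 are made up for illustration:

```r
# Hypothetical toy data: the combination f1 = "c", f2 = "y" never occurs
toy_cov <- data.frame(
  f1 = factor(c("a", "a", "b", "b", "c", "c")),
  f2 = factor(c("x", "y", "x", "y", "x", "x"))
)

# Full interaction: includes a coefficient for the unobserved c:y cell
X_full <- model.matrix(~ f1 * f2, data = toy_cov)

# Single combined factor; drop = TRUE discards unobserved combinations
toy_cov$f12 <- interaction(toy_cov$f1, toy_cov$f2, drop = TRUE)
X_comb <- model.matrix(~ f12, data = toy_cov)

full_deficient <- qr(X_full)$rank < ncol(X_full)    # rank-deficient design
comb_full_rank <- qr(X_comb)$rank == ncol(X_comb)   # full-rank design
```

The trade-off is that the coefficients of the combined factor compare whole cells to a reference cell, so main effects and interactions are no longer estimated separately.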
If I understand correctly, perfectly multicollinear variables are correlated with each other. But the Pearson coefficients show that they are not. Could this be because many observations have the same category level for those variables? For example, GVG = 1 and sward = 1.
It is possible for a set of more than two variables to be multicollinear (e.g. A+B+C=constant) even when no pair is perfectly correlated, although I don't know if that's the case here ...
Ah! The summary is:
summary(SW_GVG)
   Year      Week        Count          Sward      GVG09         Cluster
 2013:510   1:102   Min.   : 0.000    0:  5    20,525:325   Min.   :1.000
            2:102   1st Qu.: 0.000    1:169    28,125:100   1st Qu.:1.000
            3:102   Median : 0.000    2:160    34,775: 20   Median :4.000
            4:102   Mean   : 1.316    3:158    51,375:  5   Mean   :3.951
            5:102   3rd Qu.: 0.000    4: 18    74,2  : 55   3rd Qu.:6.000
                    Max.   :95.000             78    :  5   Max.   :8.000
      GVG          SCluster
 Min.   :1.000    1:135
 1st Qu.:1.000    2: 85
 Median :1.000    3: 25
 Mean   :1.784    4: 35
 3rd Qu.:2.000    6:125
 Max.   :6.000    7: 95
                  8: 10
I don't know exactly what's going on here, but you should look at
with(SW_GVG,table(factor(GVG),Sward,Count>0))
I suspect you will find there are empty cells.
Ben Bolker
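A minimal sketch of the empty-cell check suggested above, run on a small invented data frame. The name toy_df and all of its values are hypothetical stand-ins for SW_GVG, not the poster's data:

```r
# Hypothetical stand-in for SW_GVG; values are invented for illustration
toy_df <- data.frame(
  GVG   = c(1, 1, 1, 2, 2, 6),
  Sward = factor(c(1, 2, 3, 1, 2, 1)),
  Count = c(0, 3, 0, 5, 0, 2)
)

# Keep only the non-zero counts, i.e. the rows a truncated model would see
nz <- subset(toy_df, Count > 0)

# Cross-tabulate the two predictors; zeros mark combinations with no data
tab <- with(nz, table(GVG = factor(GVG), Sward))
print(tab)

# Count and list the combinations that never occur
n_empty <- sum(tab == 0)
empty_cells <- subset(as.data.frame(tab), Freq == 0)
print(empty_cells)
```

Each empty cell corresponds to an interaction coefficient that the data cannot identify, which is one way the design matrix can end up rank-deficient.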
To: r-help at stat.math.ethz.ch
From: bbolker at gmail.com
Date: Fri, 9 Aug 2013 17:07:31 +0000
Subject: Re: [R] errors with hurdle negative binomial mixed effect models
Marta Lomas <lomasvega <at> hotmail.com> writes:
Hello!
I am new to the R-help mailing list and I hope I can formulate a good question that is easy to understand.
We hope so too :-) [snip] I will take a first crack at this here, but follow-ups should probably be redirected to the r-sig-mixed-models at r-project.org mailing list, which is more appropriate for questions dealing with (G)LMMs.
I am modeling my data set with hurdle negative binomial mixed-effects models, to find the correlation of some bird counts with environmental (categorical and continuous) variables.
Whenever I run one of these models I get an error. For instance, for the truncated model of the non-zero counts:
HURgvgsw <- glmmadmb(count ~ GVG*sward + (1|week),
                     data=subset(SW_GVG,count>0),
                     family="truncnbinom")
Error in glmmadmb(count ~ GVG * sward + (1 | week), data = subset(SW_GVG, :
  rank of X = 6 < ncol(X) = 10
Or for the binomial part, where the zeros are modeled against the non-zeros:
HURgvgsw <- glmmadmb(count ~ sward*GVG + (1|week) + (1|cluster),
                     data=SW_GVG, family="binomial")
Error in glmmadmb(count ~ sward * GVG + (1 | week) + (1 | cluster), data = SW_GVG, :
  rank of X = 13 < ncol(X) = 15
Would you have the solution to this?
This error message is telling you that some of your fixed-effect variables (which are, internally, combined into the fixed-effect design matrix X) are perfectly multicollinear. This is most likely happening because sward and GVG are categorical variables (or at least are being treated as categorical variables) and some combinations are missing from the data set (for future reference: the output of summary(SW_GVG) is useful for diagnosis).
For more information, search http://glmm.wikidot.com/faq for the word 'rank'.
Good luck
Ben Bolker
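To make the "rank of X < ncol(X)" message concrete, here is a small sketch in base R that builds a fixed-effect design matrix from two factors with one missing combination and compares its rank with its number of columns. The data frame toy_rank and its columns g and s are hypothetical, and the sketch does not call glmmadmb itself:

```r
# Hypothetical toy data: the combination g = 3, s = 2 never occurs
toy_rank <- data.frame(
  g = factor(c(1, 1, 2, 2, 3, 3)),
  s = factor(c(1, 2, 1, 2, 1, 1))
)

# Design matrix for main effects plus interaction
X <- model.matrix(~ g * s, data = toy_rank)

# Numerical rank from a QR decomposition versus the number of columns
rank_X <- qr(X)$rank
print(c(rank = rank_X, ncol = ncol(X)))

# Columns that are identically zero belong to unobserved combinations
zero_cols <- colnames(X)[colSums(abs(X)) == 0]
print(zero_cols)
```

Here the interaction column for the unobserved cell is all zeros, so it adds a column to X without adding to its rank.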