MCMCglmm phylogenetically controlled categorical R structure and priors help
Hi Ben, If each species can only belong to one category (?) the choice of R-structure is arbitrary since observation-level variation is not identifiable in the likelihood. You probably want to add trait as a main effect, and interact it with your other predictors too. Currently the model assumes that the probability of being Bicoloured/Mottled/Plain compared to the base-line category (not sure what that is) is the same and that they are all affected by diet in the same way. This may make sense depending on what the base-line category is, but I doubt it. The phylogenetic variance looks large, but it is hard to say because you have specified scale=FALSE. If it really is large it is possible that you will run into numerical errors. If you have pl=TRUE in your call to MCMCglmm, and then do a histogram of my_model$Liab, make sure the distribution is contained within +/-25. If this is not so then you will have to use software that assumes the phylogenetic heritability is 1. I have developed algorithms under this scenario but only for ordered categorical traits. I would just use the ginverse argument (the pedigree argument is redundant) and use the default scale=TRUE and nodes="ALL" not nodes="TIPS". The model is the same, but the former allows sparse linear solvers to exploit the special sparse structure of phylogenies. It will probably be orders of magnitude faster for large phylogenies. Note that this type of analysis is VERY data demanding and you will need many hundreds of species to get precise estimates, particularly if some wing pattern types are rare. If the data set is modest in size expect the results to be sensitive to your choice of prior for the phylogenetic species effects. Cheers, Jarrod
On 14/11/2016 23:48, ben hogan wrote:
Hello,
I have collected categorical data about the colouration of a number of bird species (4 levels), and am attempting to see if colouration is correlated with the proportions (as percentage) of different prey types, and the birds' length. There is only one observation of each factor for each bird species, i.e. one row in the data frame.
I am having difficulty in understanding the outcome of the R-structure, and how to properly define the priors. I don't have any specific predictions about priors, and I believe the code I have used is supposed to generate flat/uninformed ones, but am not sure if that it in fact the case (very new to the subject). I am not sure if both pedigree and ginverse are necessary, I get the similar output without pedigree.
As it stands my model and priors for a model on F_Wing2 are;
k <- length(levels(Data$F_Wing2))
IJ <- (1/k) * (diag(k-1) + matrix(1, k-1, k-1))
prior.phyl = list(R = list(V = IJ, nu = 0),G = list( G1 = list(V = IJ, n = k-1 , nu = 0) ) )
Ainv<-inverseA(Tree, scale=FALSE, nodes="TIPS")$Ainv
myMCMC.phyl<- MCMCglmm(F_Wing2 ~ Birds+Reptiles+Amphibians+Fish+Mammals+as.numeric(Length),
random=~us(trait):species,
rcov = ~us(trait):units,
pedigree=Tree,
scale=FALSE,
ginverse = list(species=Ainv),
data = Data, family="categorical",
prior=prior.phyl,
nitt=10000,
thin=25,
burnin=2000)
This runs, and nitt etc are artificially low for testing purposes. The outcome is something like this;
DIC: 160.7702
G-structure: ~us(trait):species
post.mean l-95% CI u-95% CI eff.samp
traitF_Wing2.Bicolour:traitF_Wing2.Bicolour.species 112.05 2.267 237.3 2.048
traitF_Wing2.Mottled:traitF_Wing2.Bicolour.species 75.29 1.450 160.9 4.397
traitF_Wing2.Plain:traitF_Wing2.Bicolour.species 125.34 3.015 268.0 2.707
traitF_Wing2.Bicolour:traitF_Wing2.Mottled.species 75.29 1.450 160.9 4.397
traitF_Wing2.Mottled:traitF_Wing2.Mottled.species 52.12 1.345 123.3 7.140
traitF_Wing2.Plain:traitF_Wing2.Mottled.species 85.01 2.861 188.4 4.175
traitF_Wing2.Bicolour:traitF_Wing2.Plain.species 125.34 3.015 268.0 2.707
traitF_Wing2.Mottled:traitF_Wing2.Plain.species 85.01 2.861 188.4 4.175
traitF_Wing2.Plain:traitF_Wing2.Plain.species 143.20 5.880 310.2 3.252
R-structure: ~us(trait):units
post.mean l-95% CI u-95% CI eff.samp
traitF_Wing2.Bicolour:traitF_Wing2.Bicolour.units 0.50 0.50 0.50 0
traitF_Wing2.Mottled:traitF_Wing2.Bicolour.units 0.25 0.25 0.25 0
traitF_Wing2.Plain:traitF_Wing2.Bicolour.units 0.25 0.25 0.25 0
traitF_Wing2.Bicolour:traitF_Wing2.Mottled.units 0.25 0.25 0.25 0
traitF_Wing2.Mottled:traitF_Wing2.Mottled.units 0.50 0.50 0.50 0
traitF_Wing2.Plain:traitF_Wing2.Mottled.units 0.25 0.25 0.25 0
traitF_Wing2.Bicolour:traitF_Wing2.Plain.units 0.25 0.25 0.25 0
traitF_Wing2.Mottled:traitF_Wing2.Plain.units 0.25 0.25 0.25 0
traitF_Wing2.Plain:traitF_Wing2.Plain.units 0.50 0.50 0.50 0
Location effects: F_Wing2 ~ Birds + Reptiles + Amphibians + Fish + Mammals + as.numeric(Length)
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -10.128704 -19.972170 -0.276258 14.864 0.0187 *
Birds -0.058857 -0.116441 -0.005308 24.074 0.0312 *
Reptiles -0.059507 -0.148332 0.011748 6.368 0.1313
Amphibians -0.098699 -0.336341 0.078327 12.190 0.3438
Fish -0.104092 -0.226801 -0.010597 11.712 0.0563 .
Mammals -0.006519 -0.062639 0.044000 17.301 0.7250
as.numeric(Length) 0.097417 -0.013097 0.232505 9.862 0.1187
---
As you can see, the R-Structure seems not to have been affected by the running of the model, and effective sample sizes are 0. 1) I am not sure why this is, and 2) the model doesn't run unless rcov = ~us(trait):units, but I do not understand what "units" refers to.
Any help greatly appreciated!
Best,
Ben Hogan
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.