GLM-normal distribution
This isn't really a mixed model question: it would be more appropriate for a generic stats or stats-ecology forum (e.g. r-sig-ecology at r-project.org, or CrossValidated [http://stats.stackexchange.com]).

A couple of quick points:

- you don't need lme4 at all, since you don't have a random effect in your model
- a rule of thumb is that you shouldn't try to fit more than 1 model parameter per 10-15 data points, so this model (4 parameters for 19 data points) is pushing it a bit
- you should not assess normality based on the *marginal* distribution; instead you should look at the residuals from the model (e.g. see plot(M2) below)
- if you weight the linear model by number of species (as is probably appropriate) you get a p-value of 0.052
- your data are slightly underdispersed (less variance than expected from binomial); if you account for this by using family=quasibinomial you get almost identical results to the linear model.

Overall I would say you have *weak* evidence at best for an effect of anchom.

M1 <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500,
          data = mis.datos1, family = binomial) # the syntax of my model
summary(M1)

M2 <- lm(exot/(nativ+exot) ~ anchom + tipdecamp + exph500,
         data = mis.datos1, weights = nativ+exot)
summary(M2)
plot(M2)

library(ggplot2); theme_set(theme_bw())
library(dplyr)
library(tidyr)
d2 <- mis.datos %>%
    mutate(tot=exot+nativ, prop_exot=exot/tot) %>%
    select(prop_exot,tot,anchom,tipdecamp,exph500) %>%
    gather(var,value,-c(prop_exot,tot,tipdecamp))

ggplot(d2, aes(value,prop_exot,colour=tipdecamp)) +
    geom_point(aes(size=tot)) +
    facet_wrap(~var, scales="free_x") +
    geom_smooth(method="glm", aes(weight=tot),
                method.args=list(family=binomial))

deviance(M1)/df.residual(M1)
M3 <- update(M1, family = quasibinomial)

## scale parameters
d3 <- mis.datos %>%
    mutate(anchom=scale(anchom), exph500=scale(exph500))
M4 <- update(M3, data=d3)

library(dotwhisker)
dwplot(list(M4)) + geom_vline(xintercept=0, lty=2)
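For readers without access to the original data file, here is a minimal base-R sketch of the dispersion check and the quasibinomial refit described above. Everything in it is an assumption made for illustration: the data are simulated (19 made-up sites, one hypothetical predictor `x`), and the names `sim`, `m_bin`, `m_quasi` and `m_lm` do not come from the thread.

```r
# Simulated data only: 19 made-up sites, NOT the poster's data set.
set.seed(101)
n <- 19
sim <- data.frame(x = rnorm(n), tot = rpois(n, 30) + 5)
sim$exot  <- rbinom(n, size = sim$tot, prob = plogis(-1 + 0.3 * sim$x))
sim$nativ <- sim$tot - sim$exot

# Binomial GLM on the counts (successes, failures)
m_bin <- glm(cbind(exot, nativ) ~ x, data = sim, family = binomial)

# Rough dispersion check: a ratio well below 1 suggests underdispersion,
# well above 1 suggests overdispersion
deviance(m_bin) / df.residual(m_bin)

# Quasibinomial refit: coefficients are unchanged, but standard errors
# are rescaled by the square root of the Pearson-based dispersion estimate
m_quasi <- update(m_bin, family = quasibinomial)
summary(m_quasi)$dispersion

# Weighted linear model on the proportions, for comparison
m_lm <- lm(exot / tot ~ x, data = sim, weights = tot)

# Compare standard errors and the data-points-per-parameter ratio
se_bin   <- summary(m_bin)$coefficients[, "Std. Error"]
se_quasi <- summary(m_quasi)$coefficients[, "Std. Error"]
se_quasi / se_bin            # equals sqrt(dispersion) for every coefficient
nrow(sim) / length(coef(m_bin))
```

Note that the deviance-based ratio is only a quick screen; the dispersion that quasibinomial actually uses is the sum of squared Pearson residuals divided by the residual degrees of freedom, so the two numbers will usually be close but not identical.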
On 17-04-04 05:15 PM, Marcos Monasterolo wrote:
Dear all. I am doing an analysis on proportion data resulting from counts.
As I do have the count data available I am running a glm with binomial
distribution. However, after realizing the response variable is normal
(Anderson-Darling test did not reject normality of the calculated
proportions) I am now having second thoughts as to whether it might also be
possible to run a normal lm with proportion as the response variable. The
thing is one of the explanatory variables ("ancho", which I am really
interested in) is not significant in the binomial glm but significant in
the lm. My understanding is that I should stick with the binomial GLM, but
I wanted to have an expert opinion on this.
I provide a working code below. Thanks in advance for your help.
Marcos
id <- "0B6X3EoqLHXG-dnZqTXpWSkRPYkE" # google file ID
mis.datos <- read.table(sprintf("https://docs.google.com/uc?id=%s&export=download", id),
                        header = TRUE, sep = ";", dec = ",")
mis.datos1<-mis.datos[-c(3,6,7,8),] #these data points I don't need
library(nortest)
ad.test(mis.datos1$propexot)#evaluate normality
hist(mis.datos1$propexot)
library(lme4)
M1 <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500, data =
mis.datos1, family =binomial)# the syntax of my model
summary(M1)
----
Biól. Marcos Monasterolo
Becario doctoral - Cátedra de Botánica General, Facultad de Agronomía, UBA
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models