Hello,

I believe I have exhausted my online resources (and eyes) in trying to determine the appropriate method of analysis for the following investigation. I wish to determine if the efficiencies (% recovery) of two sampling units are significantly different. I sampled in three different fields. I attempted to collect 12 samples per unit per field (2 x 12 x 3 = 72); however, some sample sites had no seeds and resulting data were excluded (so as to not confuse with 'true' zero data; i.e., 0 seeds of x recovered). Working sample size = 24 and 27 (51), per unit.

My dataset sets up like this:

1) 51 observations
2) Response variable = percent seeds recovered; x = 0-1
3) Predictor variable 1 = unit (K or L); fixed categorical
4) Predictor variable 2 = field (1, 2, or 3); random categorical

More than 50% of my data are zeros, therefore, the distribution is far from normal. Can someone provide guidance RE how best to proceed?

Thank you kindly in advance.

-Everett

--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Proper-treatment-of-Proportion-Response-Data-with-Two-Categorical-Predictors-tp7577742.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
Proper treatment of Proportion Response Data with Two Categorical Predictors
4 messages · Aitor Gastón, Everett
Everett,

If you have the original binary data that were used to calculate proportions you can use generalized linear models with logit link (i.e. logistic regression). You can find a simple explanation of this approach and some examples with R code in http://www.bio.ic.ac.uk/research/crawley/statistics/exercises/R10Proportiondata.pdf

Aitor

--------------------------------------------------
From: "Everett" <ehanna23 at uwo.ca>
Sent: Tuesday, December 11, 2012 12:19 AM
To: <r-sig-ecology at r-project.org>
Subject: [R-sig-eco] Proper treatment of Proportion Response Data with Two Categorical Predictors
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Aitor,

Perhaps I am missing something, but I do not think that my original data can take binary form. Each sampling point had a unique number of seeds (0 - +infinity). I sampled at each site and collected a proportion of the seeds that were available, thus, I would have, for example, 10 seeds available of which 2 seeds were collected = 0.200 recovery (or 20% recovery). I do not think that logistic (binary) regression applies here but I am relatively novice with certain aspects of these topics.

-Everett

--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Proper-treatment-of-Proportion-Response-Data-with-Two-Categorical-Predictors-tp7577742p7577747.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
Following your example, you have 2 positive cases and 8 negative cases, i.e.
a binary response, because each seed can be coded as 0 (not recovered) or 1
(recovered).
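To make that coding concrete, here is a minimal sketch in base R. Only the first site (10 seeds available, 2 recovered) comes from the example above; the other two sites are made-up values added for illustration. It fits the same intercept-only logistic model twice, once on counts of recovered and not-recovered seeds per site and once on one 0/1 value per seed, and the two fits should give the same estimated recovery probability.

```r
# Site 1 is the example from the thread; sites 2 and 3 are hypothetical.
available <- c(10, 5, 8)   # seeds available at each site
recovered <- c(2, 0, 3)    # seeds recovered at each site

# Aggregated form: two-column response of (successes, failures) per site
fit_counts <- glm(cbind(recovered, available - recovered) ~ 1,
                  family = binomial)

# Expanded form: one 0/1 value per seed (1 = recovered, 0 = not recovered)
seed <- c(rep(1, sum(recovered)), rep(0, sum(available - recovered)))
fit_binary <- glm(seed ~ 1, family = binomial)

# Back-transform the intercepts from the logit scale to a probability
p_counts <- unname(plogis(coef(fit_counts)))
p_binary <- unname(plogis(coef(fit_binary)))
print(c(counts = p_counts, binary = p_binary))
```

Both estimates equal the pooled proportion of recovered seeds (5 of 23 here), so there is no need to expand the data by hand. Note also that a site with seeds available but none recovered enters the model as an ordinary observation.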
An example of the GLM approach using simulated data:
set.seed(100)  # fix the random seed to get reproducible results
N <- round(runif(51, 1, 10))  # simulate number of available seeds
rp <- runif(51, 0, 1)  # simulate proportion of recovered seeds
r <- round(N * rp)  # simulate number of recovered seeds
u <- factor(sample(c("K", "L"), 51, replace = TRUE))  # simulate units
f <- factor(sample(c("f1", "f2", "f3"), 51, replace = TRUE))  # simulate fields
mod <- glm(cbind(r, N - r) ~ u + f, family = binomial)  # fit a binomial GLM
anova(mod, test = "Chisq")  # analysis of deviance with chi-squared tests
summary(mod)  # summary of the model with "treatment contrasts"
This is a fixed-effects model, but it can be adapted to a mixed model using
the glmer function of the lme4 package. An example is available in ?glmer:
library(lme4)  # provides glmer and the cbpp example data
## generalized linear mixed model
(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              family = binomial, data = cbpp))
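As a rough sketch of how the ?glmer pattern might carry over to the seed data, the block below refits the simulated example with unit as a fixed effect and field as a random intercept. It assumes lme4 is installed; the data frame name dat and the model name gm are illustrative choices, and the data are simulated with no real unit or field effect, so the output only shows the mechanics.

```r
library(lme4)  # assumed to be installed; provides glmer()

set.seed(100)
N <- round(runif(51, 1, 10))     # simulated number of available seeds
r <- round(N * runif(51, 0, 1))  # simulated number of recovered seeds

# 'dat' is an illustrative name, not something from the original example
dat <- data.frame(
  r = r,
  N = N,
  u = factor(sample(c("K", "L"), 51, replace = TRUE)),         # unit
  f = factor(sample(c("f1", "f2", "f3"), 51, replace = TRUE))  # field
)

# Unit as fixed effect, field as random intercept
gm <- glmer(cbind(r, N - r) ~ u + (1 | f), family = binomial, data = dat)
summary(gm)
```

With only three fields, the random-effect variance is estimated from very few levels and lme4 may report a singular fit, so comparing the result with the fixed-effects GLM (u + f) is a sensible check.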
Hope this helps
Aitor
--------------------------------------------------
From: "Everett" <ehanna23 at uwo.ca>
Sent: Tuesday, December 11, 2012 8:46 PM
To: <r-sig-ecology at r-project.org>
Subject: Re: [R-sig-eco] Proper treatment of Proportion Response Data with
Two Categorical Predictors