Dear R-users,

apologies for the total beginner's question. I have some behavioural data expressed as the number of interactions measured over a certain amount of time (in minutes). I have calculated the number of interactions per minute and I would like to model this variable as a function of a covariate. I am, however, struggling with the choice of the family argument.

I thought of log-transforming the data and modelling them with a GLM with a Gaussian family. Unfortunately for me, my data do not seem to follow a normal distribution. I then recalled that my response is some kind of proportion and should perhaps be modelled with a binomial GLM. But the data are not bounded between 0 and 1...

Any suggestions would be greatly appreciated! Please see data and R code at the bottom of the e-mail.

Cheers,
Mario

# number of interactions recorded
interactions<-c(2,1,2,3,0,0,0,0,1,2,
  0,2,0,0,0,0,0,0,1,0,
  0,0,0,0,0,0,2,3,0,8,
  7,6,3,5,4,5,7,6,0,0,
  0,0,2,2,3,4,1,2,5,6,
  2,4,0,6,0,0,39,46,45,48,
  11,8,7,8,6,7,13,10,7,9,
  32,38,39,25,32,30,11,13)

# observation time (in minutes)
obs.time<-c(9.4666666667,9.7,9.1,9.45,8.8166666667,8.9333333333,8.7166666667,8.85,6,3.85,
  6.1666666667,6.3166666667,2,2.3833333333,7.95,9.6333333333,9.8166666667,1.5,9.4166666667,9.35,
  1.2833333333,1.85,1,2.3333333333,6.6,6.3166666667,4.4333333333,2.5,9.35,9.3166666667,
  6.6,6.3166666667,6.3166666667,6.1666666667,6,9.5166666667,9.1,9.45,8.7333333333,2.5,
  2.3833333333,2,1.85,1.2833333333,2.3833333333,8.65,8.5333333333,8.5,8,7.6833333333,
  0.55,2.4,8.5166666667,8.1833333333,7.9166666667,8,9.5833333333,9.6833333333,9.6833333333,9.4833333333,
  9.5833333333,9.55,9.4833333333,9.45,9.7333333333,9.6833333333,9.6333333333,9.45,9.4833333333,9.6833333333,
  9.65,9.6166666667,9.5,9.4333333333,9.3166666667,9.3333333333,9.4166666667,9.4666666667)

# covariate
covariate<-c(15,15,18,18,40,40,45,45,42,42,
  50,50,200,200,400,400,500,500,200,200,
  150,150,90,90,45,45,37,37,42,42,
  37,37,80,80,110,110,150,150,300,300,
  250,250,85,85,18,18,42,42,15,15,
  50,50,45,45,200,200,45,45,15,15,
  115,115,125,125,500,500,550,550,550,550,
  45,45,37,37,80,80,100,100)

# calculate number of interactions per minute
interactions.m<-interactions/obs.time

# fit glm (no family given, so the Gaussian default is used)
glm(log(interactions.m+1)~covariate)
error distribution in GLM model
2 messages · Mario Serrajotto, Peter Solymos
Mario,

If you can assume that the rate of events is constant through time, you can model your counts per unit time with a Poisson GLM (a constant rate leads to an exponential survival function for the waiting times between events). log(observation time) can be used as an offset:

glm(interactions~covariate, offset=log(obs.time), family=poisson)

Note that diagnostic plots indicate that the homogeneous Poisson process assumption might not hold.

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
solymos at ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca

On Sun, Aug 19, 2012 at 10:14 AM, Mario Serrajotto <mario.serrajotto at gmail.com> wrote:
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
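A minimal sketch of the Poisson-with-offset approach suggested in the reply, followed by a rough overdispersion check of the kind the diagnostic plots hint at. The data here are simulated so the example runs on its own; the simulation parameters, the names fit.pois, fit.quasi and dispersion, and the quasi-Poisson follow-up are illustrative assumptions, not something proposed in the thread.

```r
# Sketch only: simulated data stand in for the vectors from the original post.
set.seed(1)
n         <- 78
covariate <- runif(n, 15, 550)   # made-up covariate values
obs.time  <- runif(n, 0.5, 10)   # made-up observation times (minutes)
mu        <- obs.time * exp(0.5 - 0.004 * covariate)  # expected counts
interactions <- rnbinom(n, mu = mu, size = 1)         # overdispersed counts

# Poisson GLM on the raw counts, with log(observation time) as offset,
# so that exp(linear predictor) is a rate per minute.
fit.pois <- glm(interactions ~ covariate, offset = log(obs.time),
                family = poisson)

# Rough overdispersion check: Pearson chi-square / residual df.
# Values well above 1 suggest more variance than a Poisson model allows.
dispersion <- sum(residuals(fit.pois, type = "pearson")^2) /
  df.residual(fit.pois)
print(dispersion)

# One possible follow-up (an assumption, not part of the reply):
# quasi-Poisson gives the same coefficients but rescaled standard errors.
fit.quasi <- glm(interactions ~ covariate, offset = log(obs.time),
                 family = quasipoisson)
print(summary(fit.quasi)$coefficients)

# Standard diagnostic plots for the Poisson fit:
# par(mfrow = c(2, 2)); plot(fit.pois)
```

With the interactions, obs.time and covariate vectors from the original post already in the workspace, the simulation lines at the top can be dropped and the two glm() calls used unchanged. Whether quasi-Poisson, a negative binomial model, or something else is the better choice depends on what the diagnostics show for the real data.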