
error distribution in GLM model

2 messages · Mario Serrajotto, Peter Solymos

#
Dear R-users,

apologies for the total beginner's question. I have some behavioural
data expressed as the number of interactions recorded over a certain
amount of time (in minutes). I have calculated the number of
interactions per minute and I would like to model this variable as a
function of a covariate. I am, however, struggling with the choice of
the family argument. I thought of log-transforming the data and
modelling them with a GLM with a Gaussian family. Unfortunately
for me, my data do not seem to follow a normal distribution. I then
recalled that my response is some kind of proportion and should
perhaps be modelled with a binomial GLM. But the data are not bounded
between 0 and 1... Any suggestions would be greatly appreciated! Please
see the data and R code at the bottom of the e-mail.

Cheers,

Mario

# number of interactions recorded
interactions<-c(2,1,2,3,0,0,0,0,1,2,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,0,2,3,0,8,7,6,3,5,4,5,7,6,0,0,0,0,2,2,3,4,1,2,5,6,2,4,0,6,0,0,39,46,45,48,11,8,7,8,6,7,13,10,7,9,32,38,39,25,32,30,11,13)

# observation time (in minutes)
obs.time<-c(9.4666666667,9.7,9.1,9.45,8.8166666667,8.9333333333,8.7166666667,8.85,6,3.85,6.1666666667,6.3166666667,2,2.3833333333,7.95,9.6333333333,9.8166666667,1.5,9.4166666667,9.35,1.2833333333,1.85,1,2.3333333333,6.6,6.3166666667,4.4333333333,2.5,9.35,9.3166666667,6.6,6.3166666667,6.3166666667,6.1666666667,6,9.5166666667,9.1,9.45,8.7333333333,2.5,2.3833333333,2,1.85,1.2833333333,2.3833333333,8.65,8.5333333333,8.5,8,7.6833333333,0.55,2.4,8.5166666667,8.1833333333,7.9166666667,8,9.5833333333,9.6833333333,9.6833333333,9.4833333333,9.5833333333,9.55,9.4833333333,9.45,9.7333333333,9.6833333333,9.6333333333,9.45,9.4833333333,9.6833333333,9.65,9.6166666667,9.5,9.4333333333,9.3166666667,9.3333333333,9.4166666667,9.4666666667)

# covariate
covariate<-c(15,15,18,18,40,40,45,45,42,42,50,50,200,200,400,400,500,500,200,200,150,150,90,90,45,45,37,37,42,42,37,37,80,80,110,110,150,150,300,300,250,250,85,85,18,18,42,42,15,15,50,50,45,45,200,200,45,45,15,15,115,115,125,125,500,500,550,550,550,550,45,45,37,37,80,80,100,100)

# calculate number of interactions per minute
interactions.m<-interactions/obs.time

# fit glm
glm(log(interactions.m+1)~covariate)
#
Mario,

If you can assume that the rate of events is constant through time,
you can model your counts per unit time with a Poisson GLM (a constant
rate means that the waiting times between events follow an exponential
distribution). log(observation time) can be used as an offset:

glm(interactions~covariate, offset=log(obs.time), family=poisson)
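
To spell out why the offset turns a model for counts into a model for rates, here is the call above written as an equation. The symbols are my own notation, not from the original post: Y_i is the count in observation i, t_i its observation time in minutes, x_i the covariate, and beta_0, beta_1 the fitted coefficients.

```latex
\log \mathbb{E}[Y_i] = \log t_i + \beta_0 + \beta_1 x_i
\quad\Longleftrightarrow\quad
\frac{\mathbb{E}[Y_i]}{t_i} = \exp(\beta_0 + \beta_1 x_i)
```

The coefficient on log(t_i) is fixed at 1, so exp(beta_1) reads as the multiplicative change in the expected number of interactions per minute for a one-unit increase in the covariate.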

Note that diagnostic plots indicate that the homogeneous Poisson
process assumption might not hold.
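
One way to put a number on that concern, alongside the diagnostic plots, is the Pearson dispersion statistic. The sketch below is only an illustration: dispersion_ratio() is a hypothetical helper, not part of base R, and the data are simulated stand-ins rather than the vectors from the original post. A value well above 1 would suggest more variation than a Poisson model allows, in which case a quasi-Poisson fit (same coefficients, rescaled standard errors) is one possible alternative.

```r
# Hypothetical helper (not from the thread): Pearson chi-square divided by
# the residual degrees of freedom. Roughly 1 under a well-fitting Poisson model.
dispersion_ratio <- function(fit) {
  sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
}

# Simulated stand-in data, for illustration only: counts whose per-minute
# rate depends on x, observed over windows of unequal length.
set.seed(1)
n <- 200
x <- runif(n, 0, 5)
minutes <- runif(n, 1, 10)
y <- rpois(n, lambda = minutes * exp(0.2 + 0.3 * x))

# Poisson GLM with log(observation time) as offset, as in the call above
fit_pois <- glm(y ~ x, offset = log(minutes), family = poisson)
dispersion_ratio(fit_pois)

# Quasi-Poisson: identical coefficient estimates, but the standard errors
# are scaled by the estimated dispersion
fit_quasi <- glm(y ~ x, offset = log(minutes), family = quasipoisson)
summary(fit_quasi)$dispersion
```

With the data from the original post, the same helper can be applied to the fitted object returned by the glm() call above.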

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
solymos at ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Sun, Aug 19, 2012 at 10:14 AM, Mario Serrajotto
<mario.serrajotto at gmail.com> wrote: