Getting vastly different results when running GLMs - R-help

Luke Duncan · 2011-08-17T15:43:26Z


Thu, Aug 18, 2011 8:58 AM

At 16:43 17/08/2011, Luke Duncan wrote:

Response in line below

I am analysing data from a study of behaviour and shade utilization of
chimpanzees. I am using GLMs in R (version 2.13.0) to test whether shade/sun
utilization is predicted by behaviour observed. I am thus interested in
whether an interaction of behaviour (as a predictor) and presence in the
sun/shade (also predictor) predicts the counts I have for the respective
categories.

I have my data organised as such:

behaviour       location  specific  total
Travel          Sun            131    303
Travel          Shade          172    303
Foraging        Sun            248    651
Foraging        Shade          403    651
Vigilance       Sun             97    224
Vigilance       Shade          127    224
Rest            Sun            502   1143
Rest            Shade          641   1143
Abnormal        Sun             33     58
Abnormal        Shade           25     58
Play            Sun             58    173
Play            Shade          115    173
SelfGrooming    Sun            183    595
SelfGrooming    Shade          412    595
SocialGrooming  Sun             59    358
SocialGrooming  Shade          299    358
Other           Sun              4     39
Other           Shade           35     39
Hidden          Sun            120    656
Hidden          Shade          536    656

I have coded the response variable as a specific count of the times
individuals were in the sun or shade, for each behaviour, out of a total
number of times a specific behaviour was observed (regardless of location
[sun/shade]). These are represented by the columns 'specific' and 'total'
respectively. I had originally coded these values as a proportion variable,
but had similar mismatch problems between R and Statistica (as described
below). The GLM I am running is a binomial one (as my count response
variables are divided dichotomously by the sun/shade predictor variable)
with a logit link function. My problem is this: I originally ran the data
through another stats program (Statistica) and got significant effects for
all first- and second-order effects. When I examined the raw data, the
patterns seen in the raw data suggested that these outcomes (of the GLM)
conformed to the raw data (i.e. confirmed the GLM results). I then ran the
*same* data through R using the following code:

behdata<-read.csv("behaviourshade.csv",header=TRUE)
behdata #Just to check that everything is there and working...
behav<-behdata$behaviour
loc<-behdata$location
prop<-behdata$proportion
spec<-behdata$specific
total<-behdata$total
model<-glm((cbind(spec,total))~behav*loc,family=binomial,data=behdata)

If you have extracted your variables from the data.frame, you do not
need the data= argument.
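
A minimal sketch of that point, assuming the counts in the table above are
typed in by hand rather than read from behaviourshade.csv. It fits the same
formula twice, once through data= and once from extracted vectors, and
checks that the coefficients agree. One deliberate change from the posted
call: R's glm() documentation describes a two-column binomial response as
successes and failures, so the second column here is total - specific
rather than total. The names fit_df and fit_vec are made up for this
illustration, and the model formula is simply the poster's, not a
recommendation.

```r
# Counts copied from the table in the original post, one row per
# behaviour/location pair as in the poster's file.
behaviours <- c("Travel", "Foraging", "Vigilance", "Rest", "Abnormal", "Play",
                "SelfGrooming", "SocialGrooming", "Other", "Hidden")
sun   <- c(131, 248, 97, 502, 33, 58, 183, 59, 4, 120)
shade <- c(172, 403, 127, 641, 25, 115, 412, 299, 35, 536)

behdata <- data.frame(
  behaviour = factor(rep(behaviours, each = 2)),
  location  = factor(rep(c("Sun", "Shade"), times = 10)),
  specific  = as.vector(rbind(sun, shade)),  # interleave Sun, Shade per behaviour
  total     = rep(sun + shade, each = 2)
)

# Option 1: name the columns in the formula and let data= supply them.
fit_df <- glm(cbind(specific, total - specific) ~ behaviour * location,
              family = binomial, data = behdata)

# Option 2: extract the vectors first; data= is then unnecessary.
behav <- behdata$behaviour
loc   <- behdata$location
spec  <- behdata$specific
total <- behdata$total
fit_vec <- glm(cbind(spec, total - spec) ~ behav * loc, family = binomial)

# Same model either way; only the coefficient labels differ.
all.equal(unname(coef(fit_df)), unname(coef(fit_vec)))
```

Mixing the two styles (extracted vectors plus data=) is harmless here, but
it makes it harder to see which copy of a variable the formula is using.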

Why did it delete 19 observations?

In general, if you have an interaction you need to be cautious about
making statements about the underlying main effects. You have found
that the effect of sun differs for different behaviours, so making an
overall statement about sun may be problematic.
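
As a rough illustration of why, the sketch below computes the share of
observations in the sun for each behaviour from the counts in the original
post and sets it beside the pooled share. The names p_sun and overall are
made up for this example, and this is only descriptive arithmetic, not a
substitute for the model.

```r
# Sun counts and totals per behaviour, copied from the table in the post.
sun <- c(Travel = 131, Foraging = 248, Vigilance = 97, Rest = 502,
         Abnormal = 33, Play = 58, SelfGrooming = 183,
         SocialGrooming = 59, Other = 4, Hidden = 120)
total <- c(Travel = 303, Foraging = 651, Vigilance = 224, Rest = 1143,
           Abnormal = 58, Play = 173, SelfGrooming = 595,
           SocialGrooming = 358, Other = 39, Hidden = 656)

# Share of each behaviour's observations that occurred in the sun.
p_sun <- sun / total
round(sort(p_sun), 3)

# Pooled share across all behaviours: the single "overall" figure.
overall <- sum(sun) / sum(total)
round(overall, 3)

# Spread of the per-behaviour shares around that pooled figure.
range(p_sun)
```

If the per-behaviour shares sit far apart, as they appear to here, a single
overall sun effect averages over quite different patterns.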

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html