glm.nb() giving strongly different results

4 messages · David Croll, Bill Venables

#
Dear colleagues,



I performed several dozen glm.nb(response ~ variable) analyses a few 
weeks ago, and when I looked through the results today I saw that many 
of them have quite different intercept values even though the response 
part remained the same.

I'm quite sure I ran the same kind of analysis both when the intercept 
values were consistently around 2.2 and when they were above 3. When I 
repeated the analyses today, the intercept values were normal again, 
between 2.1 and 2.3 instead of above 3. I'm standing in front of a 
puzzle... they surely aren't glm() results, for those would give 
intercept values well above 9.

Is there anything like a set.seed() setting that could have changed some 
internal state in R? On a second look, I discovered that the init.theta 
value is much lower in those analyses I have to perform again.


Does anybody have a clue to this problem? It isn't that important that I 
have an answer (because I simply have to repeat the analyses), but still...


David
#
I can console you on one point, though.  glm.nb does not use a stochastic algorithm, and so no random numbers are involved.  So unless you are generating fake data, the random number generator should play no part.
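
A minimal sketch of this point (illustrative, with simulated data; not part of the original message): refitting the same glm.nb() model to the same data reproduces the coefficients exactly, because no random numbers enter the fit.

	library(MASS)

	set.seed(1)                              # RNG used only to simulate example data
	d <- data.frame(x = rnorm(100))
	d$y <- rnbinom(100, mu = exp(2 + 0.5 * d$x), size = 1.5)

	fit1 <- glm.nb(y ~ x, data = d)
	fit2 <- glm.nb(y ~ x, data = d)
	identical(coef(fit1), coef(fit2))        # TRUE: the fit is deterministic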


Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Croll
Sent: Wednesday, 25 March 2009 12:36 PM
To: r-help at r-project.org
Subject: [R] glm.nb() giving strongly different results


#
Thank you, Bill, for your answer!


I am also at a total loss when looking for an explanation. I just can't 
remember what I did differently...

At least the errors are confined to a rather small dataset, so repeating 
all the glm.nb() analyses won't take much time. The only thing I have 
found out so far is that the problem appeared with binary explanatory 
variables to which the full set of study participants contributed their 
answers, but it did not occur when analysing binary variables where only 
the employed people contributed to the dataset...

Either employed people are some kind of magicians or I drank too little 
coffee to understand that I *really* did something different during my R 
work...

Thanks again,


David
#
I don't think I made myself clear, though.

With a series of models fitted using the same form

	fm <- glm.nb(response ~ variable, data)

why would you expect them to have similar intercepts? I just don't get it.

You could try fitting the same model in a different way using, e.g.

	fm <- glm.nb(response ~ I(variable - mean(variable)), data)

Now the intercepts really should all be exactly the same, and the slope parameters should not have changed from the previous form.  Can you check that?

The problem with fitting models in this form, though, is that you cannot easily predict from the resulting fitted model object.
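
A minimal sketch of that check (simulated data and generic names, not from the thread): fit both parameterisations and compare the coefficients.

	library(MASS)

	set.seed(2)                              # simulated data for illustration only
	d <- data.frame(variable = rnorm(50, mean = 3))
	d$response <- rnbinom(50, mu = exp(1 + 0.4 * d$variable), size = 2)

	fm1 <- glm.nb(response ~ variable, data = d)
	fm2 <- glm.nb(response ~ I(variable - mean(variable)), data = d)

	coef(fm1)["variable"]                    # slope under the raw form
	coef(fm2)[2]                             # same slope under the centred form
	coef(fm2)["(Intercept)"]                 # intercept now refers to variable = mean(variable)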


Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-----Original Message-----
From: David Croll [mailto:david.croll at gmx.ch] 
Sent: Wednesday, 25 March 2009 7:28 PM
To: Venables, Bill (CMIS, Cleveland); r-help at r-project.org
Subject: Re: [R] glm.nb() giving strongly different results
