Theta from negative binomial regression and power_NegativeBinomial from PASSED

5 messages · Sorkin, John, Andrew Robinson, Eric Berger +2 more

#
Colleagues,

I want to use the power_NegativeBinomial function from the PASSED library. The function requires a value for a parameter theta. The meaning of theta is not given in the function's documentation (at least I can't find it). Further, the descriptions of the negative binomial distribution that I am familiar with do not mention theta as a parameter of the distribution. I noticed that when one runs the glm.nb function to perform a negative binomial regression, one obtains a value for theta. This leads to two questions:

  1.  Is the theta required by the power_NegativeBinomial function the theta that is produced by the glm.nb function?
  2.  What is theta, and how does it relate to the parameters of the negative binomial distribution?

Thank you,
John
#
Hi John,

the negative binomial is a tricky one - there are several different parameterisations and therefore different interpretations of the parameters.  Joseph Hilbe wrote a whole book on it that might be worth checking.

Cheers,

Andrew


--
Andrew Robinson
Chief Executive Officer, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: apro at unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro at unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.
#
I asked ChatGPT4 your question:

In statistics, what is the theta parameter of the negative binomial
distribution?

The response contained some math symbols but I am sending in plain
text mode so there may be some loss, hopefully not serious.

The negative binomial distribution is used to model the number of
successes before observing a fixed number of failures in a Bernoulli
process. It has two parameters, typically denoted r and p:

r - This is the number of failures until the experiment is stopped.
p - This is the probability of success on any given trial.

In some contexts, the negative binomial distribution is described in
terms of the number of failures before a fixed number of successes. In
this case:

r would be the number of successes.
p would still be the probability of success on any given trial.

In some literature and software implementations, the negative binomial
distribution's parameterization is provided in terms of r (or k) and θ
instead of r and p. Here:

r (or k) is the number of successes.
θ represents a "success-to-failure" ratio.

The relationship between p and θ is:  p = θ / (θ+1)

[i.e. p = theta / (theta + 1) ]

Conversely, θ = p / (1-p)

[i.e. theta = p / (1-p) ]

In summary, θ is a parameter that captures the success-to-failure
ratio in some parameterizations of the negative binomial distribution,
and it can be related back to the success probability p.


On Fri, Sep 15, 2023 at 5:06 AM Andrew Robinson via R-help
<r-help at r-project.org> wrote:
#
On Fri, 15 Sep 2023 01:51:27 +0000
"Sorkin, John" <jsorkin at som.umaryland.edu> wrote:

            
Plugging the p (the success probability) and the r (the number of
successes until the experiment is stopped) from the Wikipedia article
(where they are defined in terms of mean mu and variance sigma^2)
together with the variance from ?MASS::rnegbin (where it's defined as
mu + mu^2/theta) into Maxima and then solving for theta, I get:

solve(
 [
  p = mu / sigma^2,
  r = mu^2/(sigma^2-mu),
  sigma^2 = mu + mu^2/theta
 ],
 [mu, sigma, theta]
);
[
 mu = ((1-p)*r)/p,
 sigma = sqrt(r-p*r)/p,
 theta = r
]

That is, the theta from MASS seems to be equivalent to the number of
successes from the formulation in the Wikipedia article.
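
The same identity can be checked numerically in base R. This is only a
sketch: the values of mu and theta are arbitrary illustrative choices, and
it assumes that the size argument of dnbinom() corresponds to theta, which
is what the check is meant to confirm.

```r
# Numerical check in base R: does 'size' in dnbinom() play the role of theta?
mu    <- 4      # illustrative mean (assumed value)
theta <- 2.5    # illustrative theta (assumed value)

# Wikipedia-style parameters implied by mu and theta
sigma2 <- mu + mu^2 / theta      # variance as given in ?MASS::rnegbin
p      <- mu / sigma2            # success probability
r      <- mu^2 / (sigma2 - mu)   # number of successes

x <- 0:2000  # wide enough that the truncated tail is negligible
d_mu   <- dnbinom(x, size = theta, mu = mu)   # (size, mu) parameterisation
d_prob <- dnbinom(x, size = r, prob = p)      # (size, prob) parameterisation

mean_x <- sum(x * d_mu)
var_x  <- sum((x - mean_x)^2 * d_mu)

# Stops with an error if any of the identities fails
stopifnot(
  isTRUE(all.equal(r, theta)),
  isTRUE(all.equal(d_mu, d_prob)),
  isTRUE(all.equal(mean_x, mu)),
  isTRUE(all.equal(var_x, sigma2))
)
```

If the checks pass, the two parameterisations describe the same
distribution, with mean mu and variance mu + mu^2/theta.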
#
Yes, theta is the 'size' or overdispersion parameter.  Sometimes 
also denoted as k. Wikipedia discusses this parameterization in the 
paragraph starting "In negative binomial regression ..." (but they call 
this parameter r rather than theta or k).

   You can also see this in MASS (the book, Modern Applied Statistics with S) on Google Books:

https://www.google.ca/books/edition/Modern_Applied_Statistics_with_S/CzwmBQAAQBAJ?hl=en&gbpv=1&dq=venables+ripley+negative+binomial&pg=PA206&printsec=frontcover

   This parameterization was added to R in version 1.3.0 ...
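
   A small simulation sketch of the link to glm.nb, under assumed values
(seed, sample size, true theta and mu are arbitrary choices for
illustration): data are drawn with rnbinom() using size as theta, and an
intercept-only glm.nb fit should then report a theta close to the value
used. It does not settle whether power_NegativeBinomial in PASSED uses the
same convention; that would need checking against the package's own
documentation or source.

```r
library(MASS)  # provides glm.nb(); MASS is a recommended package shipped with R

set.seed(1)                  # arbitrary seed, for reproducibility only
true_theta <- 2              # assumed 'size' / theta used to simulate
true_mu    <- 5              # assumed mean used to simulate

y <- rnbinom(5000, size = true_theta, mu = true_mu)

fit <- glm.nb(y ~ 1)         # intercept-only negative binomial regression

theta_hat <- fit$theta               # estimated theta, expected near true_theta
mu_hat    <- exp(coef(fit))[[1]]     # log link by default, so exponentiate

print(c(theta_hat = theta_hat, mu_hat = mu_hat))
```

   With a sample this large, both estimates should land near the simulated
values, which is consistent with theta from glm.nb being the size parameter.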
On 2023-09-15 2:27 a.m., Ivan Krylov wrote: