Colleagues, I want to use the power_NegativeBinomial function from the PASSED library. The function requires a value for a parameter theta. The meaning of theta is not given in the function's documentation (at least I can't find it). Further, the descriptions of the negative binomial distribution that I am familiar with do not mention theta as a parameter of the distribution. I noticed that when one runs the glm.nb function to perform a negative binomial regression, one obtains a value for theta. This leads to two questions:
1. Is the theta required by the power_NegativeBinomial function the theta that is produced by the glm.nb function?
2. What is theta, and how does it relate to the parameters of the negative binomial distribution?
Thank you, John
Theta from negative binomial regression and power_NegativeBinomial from PASSED
5 messages · Sorkin, John, Andrew Robinson, Eric Berger +2 more
Hi John, the negative binomial is a tricky one - there are several different parameterisations and therefore different interpretations of the parameters. Joseph Hilbe wrote a whole book on it that might be worth checking. Cheers, Andrew -- Andrew Robinson Chief Executive Officer, CEBRA and Professor of Biosecurity, School/s of BioSciences and Mathematics & Statistics University of Melbourne, VIC 3010 Australia Tel: (+61) 0403 138 955 Email: apro at unimelb.edu.au Website: https://researchers.ms.unimelb.edu.au/~apro at unimelb/ I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.
I asked your question to ChatGPT4: "In statistics, what is the theta parameter of the negative binomial distribution?" The response contained some math symbols but I am sending in plain text mode so there may be some loss, hopefully not serious.

The negative binomial distribution is used to model the number of successes before observing a fixed number of failures in a Bernoulli process. It has two parameters, typically denoted r and p:
r - This is the number of failures until the experiment is stopped.
p - This is the probability of success on any given trial.

In some contexts, the negative binomial distribution is described in terms of the number of failures before a fixed number of successes. In this case:
r would be the number of successes.
p would still be the probability of success on any given trial.

In some literature and software implementations, the negative binomial distribution's parameterization is provided in terms of r (or k) and θ instead of r and p. Here:
r (or k) is the number of successes.
θ represents a "success-to-failure" ratio.

The relationship between p and θ is: p = θ / (θ + 1) [i.e. p = theta / (theta + 1)]
Conversely, θ = p / (1 − p) [i.e. theta = p / (1 - p)]

In summary, θ is a parameter that captures the success-to-failure ratio in some parameterizations of the negative binomial distribution, and it can be related back to the success probability p.
On Fri, 15 Sep 2023 01:51:27 +0000
"Sorkin, John" <jsorkin at som.umaryland.edu> wrote:
What is theta, and how does it relate to the parameters of the negative binomial distribution?
Plugging the p (the success probability) and the r (the number of successes until the experiment is stopped) from the Wikipedia article (where they are defined in terms of mean mu and variance sigma^2) together with the variance from ?MASS::rnegbin (where it's defined as mu + mu^2/theta) into Maxima and then solving for theta, I get:

    solve(
      [ p = mu / sigma^2, r = mu^2/(sigma^2-mu), sigma^2 = mu + mu^2/theta ],
      [mu, sigma, theta]
    );
    [ mu = ((1-p)*r)/p, sigma = sqrt(r-p*r)/p, theta = r ]

That is, the theta from MASS seems to be equivalent to the number of successes from the formulation in the Wikipedia article.
Best regards, Ivan
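The algebra above can also be checked numerically in base R. The following is only a minimal sketch: the values of mu and theta are arbitrary illustrative choices, not numbers from this thread, and it relies on the documented fact that dnbinom() accepts either (size, prob) or (size, mu), with prob = size/(size + mu).

```r
# Illustrative values only (assumptions, not taken from the thread)
mu    <- 4     # mean of the negative binomial
theta <- 2.5   # MASS-style theta, used here as `size`

# Success probability implied by the (size, prob) parameterisation
p <- theta / (theta + mu)

# The two parameterisations should give the same probabilities
x <- 0:50
dens_mu   <- dnbinom(x, size = theta, mu = mu)
dens_prob <- dnbinom(x, size = theta, prob = p)
same_density <- isTRUE(all.equal(dens_mu, dens_prob))

# Moments written in terms of r = theta and p (Wikipedia-style formulas)
mean_from_p <- theta * (1 - p) / p
var_from_p  <- theta * (1 - p) / p^2

# Variance as given in ?MASS::rnegbin
var_from_theta <- mu + mu^2 / theta

print(same_density)
print(c(mean_from_p = mean_from_p, var_from_p = var_from_p,
        var_from_theta = var_from_theta))
```

If the two densities agree and the two variance expressions match, that is consistent with theta playing the role of r (the `size` argument) rather than being a function of p alone.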
Yes, theta is the 'size' or overdispersion parameter. Sometimes also denoted as k. Wikipedia discusses this parameterization in the paragraph starting "In negative binomial regression ..." (but they call this parameter r rather than theta or k). You can also see this in MASS on google books: https://www.google.ca/books/edition/Modern_Applied_Statistics_with_S/CzwmBQAAQBAJ?hl=en&gbpv=1&dq=venables+ripley+negative+binomial&pg=PA206&printsec=frontcover This parameterization was added to R in version 1.3.0 ...
Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster University Director, School of Computational Science and Engineering (Acting) Graduate chair, Mathematics & Statistics > E-mail is sent at my convenience; I don't expect replies outside of working hours.
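To see the link between theta and `size` in practice, one can simulate counts with a known `size` and check that glm.nb() recovers it as theta. This is a rough sketch under assumed values (sample size, mean, and theta are made up for illustration) and it assumes the MASS package is installed. It does not call power_NegativeBinomial, so whether PASSED uses the same convention for theta still has to be confirmed against that package's own documentation.

```r
library(MASS)  # provides glm.nb()

set.seed(1)

# Assumed "true" values, chosen only for illustration
true_mu    <- 5
true_theta <- 2

# Simulate counts using base R's (size, mu) parameterisation
y <- rnbinom(5000, size = true_theta, mu = true_mu)

# Intercept-only negative binomial regression (log link by default)
fit <- glm.nb(y ~ 1)

theta_hat <- fit$theta                # estimated theta
mu_hat    <- unname(exp(coef(fit)[1])) # estimated mean on the response scale

print(c(theta_hat = theta_hat, mu_hat = mu_hat))
```

With a sample this large, theta_hat should land close to the `size` used in the simulation, which is what one would expect if glm.nb's theta is the `size` (overdispersion) parameter described above.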