How to obtain individual log-likelihood value from glm?
In the book Modern Applied Statistics with S, 4th edition, 2002, by Venables and Ripley, there is a function logitreg on page 445, which does provide the weighted logistic regression I asked, judging by the loss function. And interesting enough, logitreg provides the same coefficients as glm in the example I provided earlier, even with weights < 1. Also for residual deviance, logitreg yields the same number as glm. Unless I misunderstood something, I am convinced that glm is a valid tool for weighted logistic regression despite the description on weights and somehow questionable logLik value in the case of non-integer weights < 1. Perhaps this is a bold claim: the description of weights can be modified and logLik can be updated as well. The stackexchange inquiry I provided is what I feel interesting, not the link in that post. Sorry for the confusion.
On Sat, Aug 29, 2020 at 10:18 AM John Smith <jswhct at gmail.com> wrote:
Thanks for very insightful thoughts. What I am trying to achieve with the weights is actually not new, something like https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances. I thought my inquiry was not too strange, and I could utilize some existing codes. It is just an optimization problem at the end of day, or not? Thanks On Sat, Aug 29, 2020 at 9:02 AM John Fox <jfox at mcmaster.ca> wrote:
Dear John, On 2020-08-29 1:30 a.m., John Smith wrote:
Thanks Prof. Fox. I am curious: what is the model estimated below?
Nonsense, as Peter explained in a subsequent response to your prior posting.
I guess my inquiry seems more complicated than I thought: with y being
0/1, how to fit weighted logistic regression with weights <1, in the sense of weighted least squares? Thanks What sense would that make? WLS is meant to account for non-constant error variance in a linear model, but in a binomial GLM, the variance is purely a function for the mean. If you had binomial (rather than binary 0/1) observations (i.e., binomial trials exceeding 1), then you could account for overdispersion, e.g., by introducing a dispersion parameter via the quasibinomial family, but that isn't equivalent to variance weights in a LM, rather to the error-variance parameter in a LM. I guess the question is what are you trying to achieve with the weights? Best, John
On Aug 28, 2020, at 10:51 PM, John Fox <jfox at mcmaster.ca> wrote: Dear John I think that you misunderstand the use of the weights argument to
glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." That is, in this case y should be the observed proportion of successes (i.e., between 0 and 1) and the weights are integers giving the number of trials for each binomial observation.
I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/
On 2020-08-28 9:28 p.m., John Smith wrote: If the weights < 1, then we have different values! See an example
below.
How should I interpret logLik value then?
set.seed(135)
y <- c(rep(0, 50), rep(1, 50))
x <- rnorm(100)
data <- data.frame(cbind(x, y))
weights <- c(rep(1, 50), rep(2, 50))
fit <- glm(y~x, data, family=binomial(), weights/10)
res.dev <- residuals(fit, type="deviance")
res2 <- -0.5*res.dev^2
cat("loglikelihood value", logLik(fit), sum(res2), "\n")
On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard <pdalgd at gmail.com>
wrote:
If you don't worry too much about an additive constant, then half the negative squared deviance residuals should do. (Not quite sure how
weights
factor in. Looks like they are accounted for.) -pd
On 25 Aug 2020, at 17:33 , John Smith <jswhct at gmail.com> wrote: Dear R-help, The function logLik can be used to obtain the maximum log-likelihood
value
from a glm object. This is an aggregated value, a summation of
individual
log-likelihood values. How do I obtain individual values? In the
following
example, I would expect 9 numbers since the response has length 9. I
could
write a function to compute the values, but there are lots of family members in glm, and I am trying not to reinvent wheels.
Thanks!
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
data.frame(treatment, outcome, counts) # showing data
glm.D93 <- glm(counts ~ outcome + treatment, family =
poisson())
(ll <- logLik(glm.D93))
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.