
How to obtain individual log-likelihood value from glm?

6 messages · John Fox, John Smith

#
Dear John,

I think that you misunderstand the use of the weights argument to glm() 
for a binomial GLM. From ?glm: "For a binomial GLM prior weights are 
used to give the number of trials when the response is the proportion of 
successes." That is, in this case y should be the observed proportion of 
successes (i.e., between 0 and 1) and the weights are integers giving 
the number of trials for each binomial observation.
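For instance, a minimal sketch with made-up data, showing the proportion-plus-weights form described in ?glm alongside the equivalent two-column (successes, failures) form:

```r
# Hypothetical data: three binomial observations with n trials each.
n <- c(10, 20, 15)                 # number of trials per observation
successes <- c(4, 12, 9)
y <- successes / n                 # observed proportions, between 0 and 1
x <- c(1, 2, 3)

# Proportion response with trial counts as prior weights:
fit <- glm(y ~ x, family = binomial, weights = n)

# Equivalent two-column (successes, failures) response:
fit2 <- glm(cbind(successes, n - successes) ~ x, family = binomial)

all.equal(coef(fit), coef(fit2))   # the two fits give the same coefficients
```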

I hope this helps,
  John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
On 2020-08-28 9:28 p.m., John Smith wrote:
#
Thanks Prof. Fox. 

I am curious: what is the model estimated below?

I guess my inquiry is more complicated than I thought: with y being 0/1, how does one fit a weighted logistic regression with weights < 1, in the sense of weighted least squares? Thanks
#
Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:

Nonsense, as Peter explained in a subsequent response to your prior posting.

What sense would that make? WLS is meant to account for non-constant 
error variance in a linear model, but in a binomial GLM, the variance is 
purely a function of the mean.

If you had binomial (rather than binary 0/1) observations (i.e., 
binomial trials exceeding 1), then you could account for overdispersion, 
e.g., by introducing a dispersion parameter via the quasibinomial 
family, but that isn't equivalent to variance weights in a linear 
model; rather, it's analogous to the error-variance parameter in a 
linear model.
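As a sketch (with made-up binomial data, trials > 1), the quasibinomial family estimates a dispersion parameter rather than fixing it at 1:

```r
# Hypothetical overdispersed binomial data: n trials per observation.
n <- c(25, 30, 20, 40, 35)
successes <- c(5, 18, 4, 30, 12)
x <- 1:5

fit_q <- glm(cbind(successes, n - successes) ~ x, family = quasibinomial)

# Estimated dispersion parameter; it would be fixed at 1 under
# family = binomial.
summary(fit_q)$dispersion
```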

I guess the question is what are you trying to achieve with the weights?

Best,
  John
#
Thanks for the very insightful thoughts. What I am trying to achieve with the
weights is actually not new; it's something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and that I could make use of some
existing code. It is just an optimization problem at the end of the day, isn't it? Thanks
On Sat, Aug 29, 2020 at 9:02 AM John Fox <jfox at mcmaster.ca> wrote:
#
In the book Modern Applied Statistics with S, 4th edition, 2002, by
Venables and Ripley, there is a function logitreg on page 445, which does
provide the weighted logistic regression I asked about, judging by the loss
function. And interestingly enough, logitreg produces the same coefficients
as glm in the example I provided earlier, even with weights < 1. For the
residual deviance, too, logitreg yields the same number as glm. Unless I
misunderstood something, I am convinced that glm is a valid tool for
weighted logistic regression despite the description of weights and the
somewhat questionable logLik value in the case of non-integer weights < 1.
Perhaps this is a bold claim: the description of weights could be modified
and logLik could be updated as well.

The stackexchange inquiry I provided is what I find interesting, not the
link in that post. Sorry for the confusion.
On Sat, Aug 29, 2020 at 10:18 AM John Smith <jswhct at gmail.com> wrote:
#
Dear John,

If you look at the code for logitreg() in the MASS text, you'll see that 
the casewise components of the log-likelihood are multiplied by the 
corresponding weights. As far as I can see, this only makes sense if the 
weights are binomial trials. Otherwise, while the coefficients 
themselves will be the same as obtained for proportionally similar 
integer weights (e.g., using your weights rather than weights/10), 
quantities such as the maximized log-likelihood, deviance, and 
coefficient standard errors will be uninterpretable.

logitreg() is simply another way to compute the MLE, using a 
general-purpose optimizer rather than iteratively weighted 
least-squares, which is what glm() uses. That the two functions provide 
the same answer within rounding error is unsurprising -- they're solving 
the same problem. A difference between the two functions is that glm() 
issues a warning about non-integer weights, while logitreg() doesn't. As 
I understand it, the motivation for writing logitreg() is to provide a 
function that could easily be modified, e.g., to impose parameter 
constraints on the solution.
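A minimal sketch in the spirit of logitreg() (not V&R's exact code, and with simulated data): maximize the weights-multiplied binomial log-likelihood with a general-purpose optimizer and compare against glm() with the same non-integer weights.

```r
# Weighted logistic regression via a general-purpose optimizer:
# each case's log-likelihood contribution is multiplied by its weight.
wlogit <- function(X, y, w) {
  negll <- function(beta) {
    p <- plogis(X %*% beta)
    -sum(w * (y * log(p) + (1 - y) * log(1 - p)))
  }
  optim(rep(0, ncol(X)), negll, method = "BFGS")$par
}

set.seed(1)
x <- rnorm(50)
y <- rbinom(50, 1, plogis(0.5 * x))
w <- runif(50)                     # non-integer weights < 1
X <- cbind(1, x)                   # design matrix with intercept

b_opt <- wlogit(X, y, w)
# glm() warns about non-integer successes here but solves the same problem:
b_glm <- coef(suppressWarnings(glm(y ~ x, family = binomial, weights = w)))
```

The two coefficient vectors agree to optimizer tolerance, which illustrates Fox's point: the estimates coincide, but quantities like logLik depend on how the weights are interpreted.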

I think that this discussion has gotten unproductive. If you feel that 
proceeding with noninteger weights makes sense, for a reason that I 
don't understand, then you should go ahead.

Best,
  John
On 2020-08-29 1:23 p.m., John Smith wrote: