logistic regression

1 message · Edoardo M Airoldi

Hi all,
 I am fitting a logistic regression model on binary data.  I care about 
the fitted probabilities, so I am not worried about infinite 
(or non-existent) MLEs.  I use:
I understand the three ways to fit the model, and in my case Y is a factor
(one column).
My question is about the weights.  I can use integer weights, which
makes more mathematical sense, and
or I can use
which makes more sense for my problem, but the mathematics is weak, as I am
using non-integer successes in a Bernoulli.  Non-integer weights 
make more sense, AND the predictions of my model actually get better on 
the rare class.  I estimate the accuracy 'out of bag' over 10000 
experiments to get:

          | integer wgt          | non-int wgt
 -------- + -------------------- + --------------------
 accuracy | A = 94.9%  B = 82.3% | A = 94.7%  B = 83.3%
 std.dev. |      2.3%      15.4% |      2.6%      13.2%
 avg. AIC | 707                  | 124
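
A minimal sketch of how weights enter the fit, assuming the weighted
log-likelihood sum_i w_i * [y_i*log(p_i) + (1-y_i)*log(1-p_i)] with a single
predictor.  The data, the two weight vectors, and the function names are
hypothetical and are not the ones used for the table above.  It illustrates
one generic point: rescaling all weights by a constant leaves the fitted
coefficients unchanged but rescales the log-likelihood, so AIC values computed
under different total weights are not on the same scale.  Actual software may
compute the AIC differently when weights are non-integer.

```python
import math

def fit_weighted_logistic(x, y, w, iters=50):
    """Newton-Raphson for logit(p) = b0 + b1*x, maximising the weighted log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = wi * (yi - p)          # weighted score contribution
            v = wi * p * (1.0 - p)     # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton system
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

def weighted_loglik(x, y, w, b0, b1):
    """Weighted Bernoulli log-likelihood; defined for any non-negative real weights."""
    total = 0.0
    for xi, yi, wi in zip(x, y, w):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += wi * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return total

def weighted_aic(x, y, w, b0, b1, n_params=2):
    return -2.0 * weighted_loglik(x, y, w, b0, b1) + 2.0 * n_params

# Hypothetical toy data: 8 cases of class 0 (populous), 4 of class 1 (rare).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
y = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1]

w_int = [2.0 if yi == 1 else 1.0 for yi in y]    # up-weight the rare class
w_frac = [1.0 if yi == 1 else 0.5 for yi in y]   # down-weight the populous class

coef_int = fit_weighted_logistic(x, y, w_int)
coef_frac = fit_weighted_logistic(x, y, w_frac)
aic_int = weighted_aic(x, y, w_int, *coef_int)
aic_frac = weighted_aic(x, y, w_frac, *coef_frac)
print(coef_int, coef_frac)
print(aic_int, aic_frac)
```

In this toy case the two weight vectors are proportional, so the coefficients
(and hence the fitted probabilities) agree while the AIC values differ.  When
the two weightings are not proportional, the fits themselves can differ too.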

 As I understand it, instead of augmenting the successes on the rare class, 
which I did not observe, I am simply down-weighting the successes on the 
populous class.  The populations can be thought of as equal, and only the 
sample sizes are unbalanced.
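
The down-weighting described above can be sketched as follows, assuming the
goal is for every class to carry the same total weight.  The helper name
`balancing_weights` and the labels are hypothetical.  The rare class keeps
weight 1 and each case in a more populous class gets n_rare / n_class.

```python
from collections import Counter

def balancing_weights(labels):
    """Per-case weights such that each class sums to the size of the rarest class."""
    counts = Counter(labels)
    n_min = min(counts.values())
    return [n_min / counts[lab] for lab in labels]

# Hypothetical sample: class A is populous (8 cases), class B is rare (4 cases).
labels = ["A"] * 8 + ["B"] * 4
weights = balancing_weights(labels)

total_a = sum(w for w, lab in zip(weights, labels) if lab == "A")
total_b = sum(w for w, lab in zip(weights, labels) if lab == "B")
print(weights[0], weights[-1])   # weight of one A case, weight of one B case
print(total_a, total_b)          # the class totals match
```
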
 I was hoping that the continuity of the Binomial for N in [0,1] and X in 
[0,1] could guarantee that my results still make sense, but I am not 
sure.  Any thoughts?  Thanks

Edo