(no subject)

1 message · Nicholas Lewin-Koh

Hi Manuel,
I am guessing the problem is that, because you have categorical
predictors, you are getting empty cells in your cross-validation sets,
and hence infinite coefficients.
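
To see how an empty cell can arise in a resampled training set even when the full data has none, here is a minimal sketch. The toy data, the level names, and the choice of leave-one-out splits are all assumptions made for illustration; they are not taken from your data or your resampling scheme.

```python
from collections import Counter

# Hypothetical toy data: (level of a categorical predictor, binary outcome).
# Level "c" is rare: three observations, only one of them with outcome 0.
data = [("a", 0), ("a", 1), ("a", 0), ("a", 1),
        ("b", 1), ("b", 0), ("b", 1), ("b", 0),
        ("c", 1), ("c", 1), ("c", 0)]

# All levels are taken from the full data, so a level or outcome that
# vanishes from a training subset is still reported as an empty cell.
LEVELS = sorted({level for level, _ in data})

def empty_cells(rows):
    """Return the (level, outcome) combinations that never occur in rows."""
    counts = Counter(rows)
    return [(lev, y) for lev in LEVELS for y in (0, 1)
            if counts[(lev, y)] == 0]

def leave_one_out_report(rows):
    """Map each held-out row index to the empty cells of its training set."""
    report = {}
    for i in range(len(rows)):
        train = rows[:i] + rows[i + 1:]
        missing = empty_cells(train)
        if missing:
            report[i] = missing
    return report

full_data_gaps = empty_cells(data)
fold_gaps = leave_one_out_report(data)
print("empty cells in full data:", full_data_gaps)
print("training sets with an empty cell:", fold_gaps)
```

Any training set flagged here has a level in which only one outcome occurs, which is the situation in which the maximum likelihood estimate for that level's coefficient does not exist as a finite number.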
Unfortunately, you are now in a very tricky situation: to get at the
generalization error of your model, you need a sampling scheme that
approximates the population distribution of your predictors. One way to
get at this might be to use Bayesian logistic regression, with very
diffuse priors on the coefficients. This will serve to moderate the
problem of zero cells in your resampling scheme, and will probably
increase your prediction error, which in this case may be a good thing.
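
As a rough illustration of why a prior helps, the sketch below fits a one-predictor logistic regression to a hypothetical 2x2 table with one empty cell. It uses plain gradient ascent and reports the posterior mode under independent Normal(0, 10^2) priors, not a full posterior. The table, the prior scale, the step size, and the function names are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical 2x2 table as (x, y, count); the cell x=1, y=0 is empty.
cells = [(0, 0, 6), (0, 1, 4), (1, 1, 10)]
n_obs = sum(count for _, _, count in cells)

def fit(steps, prior_sd=None, rate=1.0):
    """Gradient ascent for the model logit P(y=1) = a + b*x.

    prior_sd=None maximizes the likelihood. Otherwise independent
    Normal(0, prior_sd**2) priors are placed on a and b (an assumption
    for this sketch) and the posterior mode is approximated.
    """
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y, count in cells:
            resid = count * (y - sigmoid(a + b * x))
            grad_a += resid
            grad_b += resid * x
        if prior_sd is not None:
            # Gradient of the log prior pulls both coefficients toward 0.
            grad_a -= a / prior_sd ** 2
            grad_b -= b / prior_sd ** 2
        a += rate * grad_a / n_obs
        b += rate * grad_b / n_obs
    return a, b

# Maximum likelihood: the slope keeps growing as iterations are added.
_, ml_short = fit(2000)
_, ml_long = fit(20000)

# With the prior: the slope settles at a finite value.
_, map_short = fit(20000, prior_sd=10.0)
_, map_long = fit(40000, prior_sd=10.0)

print("ML slope after 2000 and 20000 steps:", ml_short, ml_long)
print("prior-moderated slope after 20000 and 40000 steps:", map_short, map_long)
```

Without the prior, the slope never converges, because the likelihood keeps improving as the coefficient grows. With the prior, the estimate stabilizes, and the fitted probability for the x=1 group stays strictly below 1.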

Hope this helps

Nicholas