
Analyzing Poor Performance Using naiveBayes()

I think you have been hit by the problem of high variance (overfitting).

You could try feature selection, for example using the chi-squared
ranking (chi.squared) from FSelector.

Then train the Naive Bayes on the top n features (n = 1 to 200) as
ranked by chisq, and plot the AUC or F1 score from both the training
set and the cross-validation set against n. From the graph, you can
select the optimal n: the point where the cross-validation score peaks
before the two curves start to diverge.
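
Here is a rough sketch of that loop, in case it helps. The data are
synthetic and made up purely for illustration (20 features instead of
200, only the first three carrying signal), and the names f1_score,
scores and best_n are my own helpers, not part of either package. It
assumes FSelector and e1071 are installed, and it uses a single held-out
split rather than full cross-validation to keep it short.

```r
# Sketch only: synthetic data and helper names are illustrative assumptions.
library(FSelector)  # chi.squared(), cutoff.k(), as.simple.formula()
library(e1071)      # naiveBayes()

set.seed(1)
n_obs  <- 300
n_feat <- 20

# Binary class label plus mostly-noise numeric features
y <- factor(sample(c("neg", "pos"), n_obs, replace = TRUE))
X <- as.data.frame(matrix(rnorm(n_obs * n_feat), n_obs, n_feat))
names(X) <- paste0("f", seq_len(n_feat))
# Make the first three features informative
X$f1 <- X$f1 + 1.5 * (y == "pos")
X$f2 <- X$f2 - 1.0 * (y == "pos")
X$f3 <- X$f3 + 0.8 * (y == "pos")
dat <- cbind(X, y = y)

# Hold out a validation set
train_idx <- sample(n_obs, 200)
train <- dat[train_idx, ]
valid <- dat[-train_idx, ]

# Rank features on the training data only, to avoid leaking the validation set
weights <- chi.squared(y ~ ., train)

# F1 for the "pos" class; returns 0 when there are no true positives
f1_score <- function(truth, pred, positive = "pos") {
  tp <- sum(pred == positive & truth == positive)
  fp <- sum(pred == positive & truth != positive)
  fn <- sum(pred != positive & truth == positive)
  if (tp == 0) return(0)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Fit Naive Bayes on the top k features for each k and score both sets
ns <- seq_len(n_feat)
scores <- t(sapply(ns, function(k) {
  top <- cutoff.k(weights, k)
  fit <- naiveBayes(as.simple.formula(top, "y"), data = train)
  c(train = f1_score(train$y, predict(fit, train)),
    valid = f1_score(valid$y, predict(fit, valid)))
}))

# Plot both curves against n
matplot(ns, scores, type = "l", lty = 1, col = c("black", "red"),
        xlab = "n (top features by chisq)", ylab = "F1")
legend("bottomright", legend = c("training", "validation"),
       lty = 1, col = c("black", "red"))

# n with the best validation score
best_n <- ns[which.max(scores[, "valid"])]
```

If the training curve keeps climbing while the validation curve
flattens or drops as n grows, that gap is the overfitting I mentioned.
With real data you would probably want to repeat this over several
folds and average the validation scores before picking n.
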
On Fri, Aug 10, 2012 at 6:40 AM, Kirk Fleming <kirkrfleming at hotmail.com> wrote: