Training with very few positives

Ben Bolker · 2013-02-11T14:19:53Z

James Jong <ribonucleico <at> gmail.com> writes:

> I have a binary classification problem where the fraction of positives is
> very low, e.g. 20 positives in 10,000 examples (0.2%)
>
> What is an appropriate cross validation scheme for training a classifier
> with very few positives?

[snip]

> but I am not getting good performance (my ROC values are classifiers above). Any thoughts?

My thought is that there probably just isn't any way to get
good performance from this data set.  The effective size of your
data set is 20 (the number of positives), which is very small, so
you may simply have reached the limits of your predictive power ...

  Ben Bolker
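
One rough way to see why the effective sample size is about 20 is to look at how uncertain an estimated ROC area would be. The sketch below is not from the thread: it uses the Hanley and McNeil (1982) approximation for the standard error of the AUC, and the function name auc_standard_error and the true AUC of 0.8 are assumptions made purely for illustration.

```python
import math


def auc_standard_error(auc, n_pos, n_neg):
    """Approximate standard error of an AUC estimate (Hanley & McNeil, 1982).

    auc   : assumed true area under the ROC curve
    n_pos : number of positive examples
    n_neg : number of negative examples
    """
    # Q1: chance that two random positives both outrank one random negative.
    q1 = auc / (2.0 - auc)
    # Q2: chance that one random positive outranks two random negatives.
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


if __name__ == "__main__":
    assumed_auc = 0.8  # hypothetical value, not taken from the thread
    # Same 20 positives with more negatives, then ten times the positives.
    for n_pos, n_neg in [(20, 9980), (20, 99800), (200, 9800)]:
        se = auc_standard_error(assumed_auc, n_pos, n_neg)
        print(f"positives={n_pos:4d} negatives={n_neg:6d} SE(AUC) ~ {se:.3f}")
```

Under these assumptions the standard error with 20 positives is about 0.06 and barely moves when the number of negatives grows tenfold, whereas ten times as many positives shrinks it considerably. In other words, the rare class, not the 10,000 total examples, sets the precision.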
