
Modeling precision and recall with GLMMs

Dear All,

I am examining the performance of a couple of classification-like methods
under different scenarios. Two of the metrics I am using are precision and
recall (TP/(TP + FP) and TP/(TP + FN), where TP, FP, and FN are "true
positives", "false positives", and "false negatives" in a simple two-way
confusion matrix). Some of the combinations of methods have been used on
exactly the same data sets. So it is easy to set up a binomial model (or
multinomial2 if using MCMCglmm) such as


cbind(TP, FP) ~ fixed effects + (1|dataset) 



However, the left-hand side seems questionable, especially for precision:
the denominator of TP/(TP + FP) is TP + FP [the number of results
returned, or retrieved instances, etc.], which can itself be highly
method-dependent (i.e., affected by the fixed effects). So rather than a
true proportion, this seems more like a ratio in which TP and FP each have
their own variance, plus a covariance, etc., and thus the error
distribution is a mess (not the tidy binomial).
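
A minimal sketch of the concern above, using made-up numbers: the data set,
the two methods ("conservative" and "liberal"), and every name in the code
are hypothetical and only illustrate the arithmetic. It shows that, on one
and the same data set, TP + FN (the recall denominator) is fixed by the data,
whereas TP + FP (the precision denominator) changes with the method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a two-way confusion matrix (TN is not needed here)."""
    tp: int
    fp: int
    fn: int

    @property
    def precision(self) -> float:
        retrieved = self.tp + self.fp  # depends on what the method returns
        return self.tp / retrieved if retrieved else float("nan")

    @property
    def recall(self) -> float:
        relevant = self.tp + self.fn  # fixed by the data set
        return self.tp / relevant if relevant else float("nan")


def confusion(truth: set, retrieved: set) -> Confusion:
    """Build the counts from the true positives and the returned items."""
    return Confusion(
        tp=len(truth & retrieved),
        fp=len(retrieved - truth),
        fn=len(truth - retrieved),
    )


# Hypothetical data set: items 0..99, of which items 0..19 are real positives.
truth = set(range(20))

# Two hypothetical methods applied to that same data set.
retrieved_by = {
    "conservative": set(range(0, 10)) | {50},            # returns 11 items
    "liberal": set(range(0, 18)) | set(range(40, 70)),   # returns 48 items
}

results = {name: confusion(truth, r) for name, r in retrieved_by.items()}

for name, c in results.items():
    print(
        f"{name}: TP+FP={c.tp + c.fp} TP+FN={c.tp + c.fn} "
        f"precision={c.precision:.3f} recall={c.recall:.3f}"
    )
```

In a model such as cbind(TP, FP) ~ ..., the number of binomial "trials" per
row is TP + FP, so in this toy example it would be 11 for one method and 48
for the other, even though both rows come from the same data set.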


I've looked around in the literature and have not found much (maybe the
problem is my searching skills :-). Most people use rankings of methods,
not directly modeling precision or recall in the left-hand side of a
(generalized) linear model. A couple of papers use a linear model on the
log-transformed response (which I think is even worse than the above
binomial model, especially with lots of 0s or 1s). Some other people use a
single measure, such as the F-measure or Matthews correlation coefficient,
and I am using something similar too, but I specifically wanted to also
model precision and recall.


An option would be a multi-response model with MCMCglmm, but I am not sure
this is appropriate either (the sum of FP and TP would still depend on the
fixed effects).


Best,


R.