
Naive Bayes Classifier

3 messages · Huntsinger, Reid, Johann Petrak, Brian Ripley

#
The "naive Bayes" classifier I've seen discussed in various machine-learning
papers and books is as described by David Meyer in his posting, except that
class (mixture component) membership is known in the training data. So it's
"supervised"--classes aren't "latent". The estimation is usually just via
"plug-in":

1. Compute marginal frequencies within class.

2. Multiply these together, as if the variables (say x) were independent
within class, to get an "estimate" of the class-conditional probabilities
p(x | c).

3. Via Bayes' rule, get the (x-)conditional probabilities over class
(posterior class probabilities) p(c | x). (Actually you don't need to divide
here, since the denominator is a common factor in the quantities compared to
form the classifier...)

4. To classify x find the class c maximizing p(c | x) (or minimizing the sum
of L(c,i)*p(i|x) over i if L(,) is a given loss function).

Often step 1 is replaced by Bayesian estimates of the marginal probabilities
to prevent 0 estimates and reduce variance. 
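[Editor's note: the four plug-in steps above can be sketched in base R for
categorical predictors. The function names and toy data below are purely
illustrative, not from any package; smoothing follows the remark about
Bayesian/Laplace estimates of the marginals.]

```r
## Plug-in naive Bayes for categorical predictors (illustrative sketch).
## x: data frame of categorical columns; y: factor of class labels.
naive_bayes_train <- function(x, y, laplace = 1) {
  prior <- table(y) / length(y)          # class frequencies
  ## Step 1: within-class marginal frequencies, with Laplace smoothing
  ## to prevent zero estimates (as noted above)
  tables <- lapply(x, function(col) {
    t <- table(y, col) + laplace
    t / rowSums(t)                       # rows: classes; cols: levels
  })
  list(prior = prior, tables = tables, classes = levels(y))
}

## Steps 2-4: multiply the marginals (sum of logs) as if independent,
## add the log prior, and pick the maximizing class; dividing by p(x)
## is skipped since it is common to all classes.
## newx: a list of values in the same column order as the training x.
naive_bayes_predict <- function(model, newx) {
  scores <- sapply(model$classes, function(cl) {
    logp <- log(model$prior[[cl]])
    for (j in seq_along(newx)) {
      logp <- logp + log(model$tables[[j]][cl, as.character(newx[[j]])])
    }
    logp
  })
  model$classes[which.max(scores)]
}

## Toy usage
x <- data.frame(outlook = c("sunny", "sunny", "rain", "rain"),
                windy   = c("yes", "no", "no", "yes"))
y <- factor(c("no", "yes", "yes", "no"))
m <- naive_bayes_train(x, y)
naive_bayes_predict(m, list(outlook = "rain", windy = "no"))  # -> "yes"
```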

In case you don't find an R implementation I hope the above is helpful.

A final remark: while the expression for the posterior probabilities is the
same as for logistic regression (as Brian Ripley pointed out), the
estimation is different--even in large samples--when the model is incorrect
(as it is anticipated to be by the "naive" qualifier). Tom Mitchell's talk
at the SIAM Data Mining conference had an example of this, citing large
gains in performance by switching from the naive Bayes approach to
maximizing the logistic regression likelihood.
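[Editor's note: the discriminative alternative mentioned here can be
illustrated in a few lines of R with glm(); the data are a made-up toy
example, not from Mitchell's talk.]

```r
## Fit the logistic posterior form directly by maximizing the
## logistic-regression likelihood (toy data for illustration only)
df <- data.frame(x1 = c(0, 0, 1, 1, 0, 1),
                 x2 = c(0, 1, 0, 1, 1, 1),
                 y  = factor(c("a", "a", "a", "b", "b", "b")))
fit <- glm(y ~ x1 + x2, data = df, family = binomial)
p <- predict(fit, type = "response")  # estimated P(y = "b" | x1, x2)
```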

Reid Huntsinger

-----Original Message-----
From: David Meyer [mailto:david.meyer at ci.tuwien.ac.at]
Sent: Thursday, May 17, 2001 5:32 AM
To: Murray Jorgensen
Cc: Ursula Sondhauss; r-help at stat.math.ethz.ch
Subject: Re: [R] Naive Bayes Classifier
Murray Jorgensen wrote:
[...]

You could also try lca() in package e1071.

-d
#
Are there hashes in R? Or a package that implements hashes?

I am also looking for multivariate gaussian random numbers.

That brings me to: is there a Cholesky decomposition of a matrix?

Johann
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Fri, 18 May 2001, Johann Petrak wrote:
[...]
mvrnorm in package MASS
Well, there is a Choleski decomposition (as he apparently spelt it):
?chol.
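[Editor's note: both answers above can be checked in a few lines; MASS ships
with R as a recommended package, and the covariance matrix here is just an
example.]

```r
library(MASS)   # provides mvrnorm()

Sigma <- matrix(c(2, 1,
                  1, 2), nrow = 2)   # an example positive-definite covariance

## multivariate Gaussian random numbers
x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)

## Choleski decomposition: chol() returns the upper-triangular R
## with t(R) %*% R equal to Sigma
R <- chol(Sigma)
all.equal(t(R) %*% R, Sigma)   # TRUE
```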