log(0) problem in max likelihood estimation
On Tue, 9 Jan 2001, Bill Simpson wrote:
This practical problem in maximum likelihood estimation must be encountered quite a bit. What do you do when a data point has a probability that comes out in numerical evaluation to zero? In calculating the log likelihood you then have a log(0) problem. Here is a simple example (probit) which illustrates the problem: x<-c(1,2,3,4,100) ntrials<-100 yes<-round(ntrials*pnorm((x-3)/1)) #points fall on normal CDF mean=3, sd=1 no<-ntrials-yes pyes<-yes/ntrials py<-function(b,x) pnorm((x-b[1])/b[2]) loglike<-function(b) -sum( yes*log(py(b,x)) + no*log(1-py(b,x)) ) out<-nlm(loglike,p=c(3,1),hessian=TRUE) In this example the right-most point gives a p(yes) of 1; 1-1=0; log(0) gives "NA/Inf replaced by maximum positive value" Please tell me how to deal with this problem. Thanks very much for any help.
Calculate your log-likelihood correctly! 0*log(0) is 0, not NaN. It is an example of good S programming on page 94 of V&R3. More generally, I think you should be using the full 5-arg form of pnorm, since pnorm(x, 3, 1, FALSE, TRUE) is a lot more accurate than your calculation.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._