nlm() problem or MLE problem?

Thu, Dec 9, 1999 5:03 AM #

I am trying to do a MLE fit of the weibull to some data, which I attach.

fitweibull<-function()
{
rt<-scan("r/rt/data2/triam1.dat")
rt<-sort(rt)
plot(rt,ppoints(rt))
a<-9
b<-.27
fn<-function(p) -sum( log(dweibull(rt,p[1],p[2])) )
cat("starting -log like=",fn(c(a,b)),"\n")
out<-nlm(fn,p=c(a,b), hessian=TRUE)
xfit<-seq(min(rt),max(rt),(max(rt)-min(rt))/100)
yfit<-pweibull(xfit,out$estimate[1], out$estimate[2])
lines(xfit,yfit,lty=2)
yfit2<-pweibull(xfit,a, b)
lines(xfit,yfit2)
list(out=out)
}

I got the starting values a=9, b=.27 from fitting the Weibull CDF by eye
to a quantile plot of the data. The final values fitted by nlm() are a=
4.8299357, b= 0.2753897

I plotted both CDFs against the quantile plot of the data. I would have
expected the MLE fit to be the one that lies closer to the data. NO.
The MLE solution (dashed) seems to fit quite badly in comparison with my
starting values (solid).

I wonder if this is just the way MLE is, or is there a problem with nlm()
here? There are numerous warnings from nlm(). But the starting values are
said to give a terrible -log likelihood, which is hard to believe. I am
using R 65.1 under Linux.

Thanks for any help!

Bill
PS when I do  dweibull(rt,9,.27)
the last value is 1.003383e-173
I guess this  one observation in the right-hand tail is dominating the
fit?! It contributes 173 to the -log likelihood
The nlm() fit to this last point gives 1.550882e-10.
What to do?
-------------- next part --------------
0.526431
0.310086
0.222007
0.209077
0.211198
0.222049
0.189770
0.218770
0.209489
0.205987
0.215240
0.273280
0.216011
0.266921
0.221901
0.207630
0.302419
0.271111
0.267090
0.206637
0.257495
0.221931
0.213982
0.249320
0.212692
0.277495
0.266773
0.254769
0.288441
0.281049
0.258157
0.237112
0.286718
0.304359
0.241937
0.247500
0.262666
0.210689
0.279578
0.301593
0.227285
0.244661
0.256363
0.274192
0.290597
0.270520
0.284376
0.259045
0.215456
0.207124
0.241684
0.282749
0.286396
0.277762
0.209074
0.232699
0.286557
0.258815
0.320231
0.266717
0.293170
0.233128
0.232281
0.227134
0.249905
0.258367
0.268534
0.254488
0.265343
0.227485
0.269131
0.273510
0.261441
0.289957
0.274722
0.278432
0.236914
0.234670
0.312916
0.239800
0.247947
0.254326
0.198332
0.232380
0.234817
0.253268
0.260428
0.282009
0.255082
0.287075
0.275064
0.278829
0.281103
0.266035
0.257373
0.300321
0.284854
0.253519
0.277225
0.297887

Thomas Lumley

Thu, Dec 9, 1999 8:22 AM #

On Thu, 9 Dec 1999, Bill Simpson wrote:

Well, it looks as though the data don't come from a Weibull distribution
(at least that last point).

You can also fit a Weibull with survreg() in survival5, which gives the
same results as nlm without any complaints.

If you drop the last point survreg() gives 9.66 and 0.27 for the MLEs, so
the last point really is the problem.  Either the model is badly wrong or
the point is badly wrong, but nlm() is doing fine.


	-thomas

Thomas Lumley
Assistant Professor, Biostatistics
University of Washington, Seattle


	

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._