Dear all,
I ran into problems with the function "optim" when I tried to do an mle estimation of a simple lognormal regression. Some warning message poped up saying NANs have been produced in the optimization process. But I could not figure out which part of my code has caused this. I wonder if anybody would help. The code is in the following and the data is in the attachment.
da <- read.table("da.txt",header=TRUE)
# fit with linear regression using log transformation of the response variable
fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da)
# define the log likelihood to be maximized over
llk.mar <- function(parm,y,x){
# parm is the vector of parameters
# the last element is sigma
# y is the response
# x is the design matrix
l <- length(parm)
beta <- parm[-l]
sigma <- parm[l]
x <- as.matrix(x)
mu <- x %*% beta
llk <- sum(dnorm(y, mu, sigma,log=TRUE))
return(llk)
}
# initial values
parm <- c(as.vector(coef(fit)),summary(fit)$sigma)
y <- log(da$yp)
x <- model.matrix(fit)
op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000))
After running the above code, I got the warning message:
Warning messages:
1: In dnorm(x, mean, sd, log) : NaNs produced
2: In dnorm(x, mean, sd, log) : NaNs produced
I would really appreciate if anybody would help to point out the problem with this code or tell me how to trace it down (using "trace"?)?
Many thanks in advance.
Wayne (Yanwei) Zhang
Statistical Research
CNA
NOTICE: This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information.
If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited.
If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: da.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100909/9427d224/attachment.txt>
Help on simple problem with optim
6 messages · Zhang,Yanwei, Thomas Stewart, Peng, C +3 more
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100909/9def444b/attachment.pl>
Yanwei!!!!!!!!!!!!!!! Have you tried to write the likelihood function using log-normal directly? if you haven't so, you may want to check ?rlnorm
View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533487.html Sent from the R help mailing list archive at Nabble.com.
You can record all arguments and return values of the
calls that optim(par,fn) makes to fn with a function
like the following. It takes your function and makes
a new function that returns the same thing but also
records information it its environment. Thus, after
optim is done you can see its path to the optimum.
trackFn <- function (fn)
{
X <- NULL
VALUE <- NULL
force(fn)
function(x) {
X <<- rbind(X, x)
val <- fn(x)
VALUE <<- c(VALUE, val)
val
}
}
E.g.,
fn <- function(x)sum( (sin(x)-c(-1,0,1)) ^ 2 ) optim(c(0,0,0), tfn <- trackFn(fn))
$par
[1] -1.583466e+00 6.726235e-05 1.558809e+00
$value
[1] 1.612853e-08
$counts
function gradient
146 NA
$convergence
[1] 0
$message
NULL
with(environment(tfn), length(VALUE)) # agrees w/ counts above
[1] 146
with(environment(tfn), plot(VALUE, log="y")) with(environment(tfn), matplot(X, VALUE, log="y", type="l"))
You could also record warning messages by including a call to withCallingHandlers() that stashed the warning in a list. Recall trackFn each time you call optim(). Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Zhang,Yanwei
Sent: Thursday, September 09, 2010 11:54 AM
To: r-help at r-project.org
Subject: [R] Help on simple problem with optim
Dear all,
I ran into problems with the function "optim" when I tried to
do an mle estimation of a simple lognormal regression. Some
warning message poped up saying NANs have been produced in
the optimization process. But I could not figure out which
part of my code has caused this. I wonder if anybody would
help. The code is in the following and the data is in the attachment.
da <- read.table("da.txt",header=TRUE)
# fit with linear regression using log transformation of the
response variable
fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da)
# define the log likelihood to be maximized over
llk.mar <- function(parm,y,x){
# parm is the vector of parameters
# the last element is sigma
# y is the response
# x is the design matrix
l <- length(parm)
beta <- parm[-l]
sigma <- parm[l]
x <- as.matrix(x)
mu <- x %*% beta
llk <- sum(dnorm(y, mu, sigma,log=TRUE))
return(llk)
}
# initial values
parm <- c(as.vector(coef(fit)),summary(fit)$sigma)
y <- log(da$yp)
x <- model.matrix(fit)
op <- optim(parm, llk.mar,
y=y,x=x,control=list(fnscale=-1,maxit=100000))
After running the above code, I got the warning message:
Warning messages:
1: In dnorm(x, mean, sd, log) : NaNs produced
2: In dnorm(x, mean, sd, log) : NaNs produced
I would really appreciate if anybody would help to point out
the problem with this code or tell me how to trace it down
(using "trace"?)?
Many thanks in advance.
Wayne (Yanwei) Zhang
Statistical Research
CNA
NOTICE: This e-mail message, including any attachments and
appended messages, is for the sole use of the intended
recipients and may contain confidential and legally
privileged information.
If you are not the intended recipient, any review,
dissemination, distribution, copying, storage or other use of
all or any portion of this message is strictly prohibited.
If you received this message in error, please immediately
notify the sender by reply e-mail and delete this message in
its entirety.
It is indeed a negative value for sigma that causes the issue.
You can check this by inserting this line
if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n")
after the line
mu <- x %*% beta
in function llk.mar
Negative values for sigma can be avoided with the use of a transformation
for sigma, forcing it to be always positive.
Make optim use log(sigma) as parameter and transform this to sigma with
sigma <- exp(parm[l]) in llk.mar.
Like this
# define the log likelihood to be maximized over
llk.mar <- function(parm,y,x){
# parm is the vector of parameters
# the last element is sigma
# y is the response
# x is the design matrix
l <- length(parm)
beta <- parm[-l]
sigma <- exp(parm[l]) # <=== transform
x <- as.matrix(x)
mu <- x %*% beta
if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n")
llk <- sum(dnorm(y, mu, sigma,log=TRUE))
return(llk)
}
# initial values
parm <- c(as.vector(coef(fit)),log(summary(fit)$sigma)) # use log(sigma) as
independent parameter
Caveat: transformations often help in situations like this but can lead to
badly scaled problems and are not a universal remedy.
/Berend
View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533939.html Sent from the R help mailing list archive at Nabble.com.
3 days later
Did you check if the data in "da" has any NA in the dependent or the independent data?
Remember that your function llk.mar is going to evaluate dnorm for each pair. If any of those
pairs has an NA value, your function will return an NA at the end (sum(c(NA,1,2,3)) = NA)
I would check if the llk.mar function is fine for the whole domain of your data. My suggestion
is to add an if right before llk<-sum(dnorm....), where you evaluate for NAs in that vector.
If you find one, get rid of it before returning the function!. Try that before optim, then
let it do the solving.
Cheers,
Cristi?n Montes.
-----Mensaje original-----
De: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] En nombre de Zhang,Yanwei
Enviado el: Jueves, 09 de Septiembre de 2010 02:54 p.m.
Para: r-help at r-project.org
Asunto: [R] Help on simple problem with optim
Dear all,
I ran into problems with the function "optim" when I tried to do an mle estimation of a simple lognormal regression. Some warning message poped up saying NANs have been produced in the optimization process. But I could not figure out which part of my code has caused this. I wonder if anybody would help. The code is in the following and the data is in the attachment.
da <- read.table("da.txt",header=TRUE)
# fit with linear regression using log transformation of the response variable fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da)
# define the log likelihood to be maximized over llk.mar <- function(parm,y,x){
# parm is the vector of parameters
# the last element is sigma
# y is the response
# x is the design matrix
l <- length(parm)
beta <- parm[-l]
sigma <- parm[l]
x <- as.matrix(x)
mu <- x %*% beta
llk <- sum(dnorm(y, mu, sigma,log=TRUE))
return(llk)
}
# initial values
parm <- c(as.vector(coef(fit)),summary(fit)$sigma)
y <- log(da$yp)
x <- model.matrix(fit)
op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000))
After running the above code, I got the warning message:
Warning messages:
1: In dnorm(x, mean, sd, log) : NaNs produced
2: In dnorm(x, mean, sd, log) : NaNs produced
I would really appreciate if anybody would help to point out the problem with this code or tell me how to trace it down (using "trace"?)?
Many thanks in advance.
Wayne (Yanwei) Zhang
Statistical Research
CNA
NOTICE: This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information.
If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited.
If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety.