Help on simple problem with optim - R-help

Thu, Sep 9, 2010 11:54 AM #

Dear all,

I ran into problems with the function "optim" when I tried to do an MLE estimation of a simple lognormal regression. A warning message popped up saying that NaNs had been produced during the optimization, but I could not figure out which part of my code caused it. I wonder if anybody could help. The code follows, and the data is in the attachment.


da <- read.table("da.txt",header=TRUE)

# fit with linear regression using log transformation of the response variable
fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da)

# define the log likelihood to be maximized over
llk.mar <- function(parm,y,x){
        # parm is the vector of parameters
        # the last element is sigma
        # y is the response
        # x is the design matrix
        l <- length(parm)
        beta <- parm[-l]
        sigma <- parm[l]
        x <- as.matrix(x)
        mu <- x %*% beta
        llk <- sum(dnorm(y, mu, sigma,log=TRUE))
        return(llk)
}

# initial values
parm <- c(as.vector(coef(fit)),summary(fit)$sigma)
y <- log(da$yp)
x <- model.matrix(fit)

op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000))


After running the above code, I got the warning message:
Warning messages:
1: In dnorm(x, mean, sd, log) : NaNs produced
2: In dnorm(x, mean, sd, log) : NaNs produced


I would really appreciate it if anybody could point out the problem with this code or tell me how to trace it down (using "trace"?).
Many thanks in advance.







Wayne (Yanwei) Zhang
Statistical Research
CNA





NOTICE:  This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information.
If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited.
If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: da.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100909/9427d224/attachment.txt>

Thomas Stewart

Thu, Sep 9, 2010 12:38 PM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100909/9def444b/attachment.pl>

Peng, C

Thu, Sep 9, 2010 12:40 PM #

Yanwei!!!!!!!!!!!!!!!

Have you tried writing the likelihood function using the log-normal distribution directly?
If you haven't, you may want to check ?rlnorm
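
One way to read this suggestion is sketched below. The help page opened by ?rlnorm also documents the density dlnorm, whose meanlog and sdlog arguments are the mean and standard deviation on the log scale. The simulated data and the name llk.lnorm are assumptions made for illustration; they are not part of the original post.

```r
# Sketch: log-normal log-likelihood written with dlnorm() directly.
# Simulated data stand in for da.txt, which is not reproduced here.
set.seed(1)
n  <- 200
x  <- cbind(1, rnorm(n))                               # hypothetical design matrix
yp <- exp(drop(x %*% c(0.5, 2)) + rnorm(n, sd = 0.3))  # positive response

llk.lnorm <- function(parm, yp, x) {
    l     <- length(parm)
    beta  <- parm[-l]
    sigma <- parm[l]
    mu    <- drop(x %*% beta)
    sum(dlnorm(yp, meanlog = mu, sdlog = sigma, log = TRUE))
}

# The result differs from the normal log-likelihood of log(yp) only by
# the term sum(log(yp)), which does not depend on the parameters.
p0 <- c(0.5, 2, 0.3)
a  <- llk.lnorm(p0, yp, x)
b  <- sum(dnorm(log(yp), drop(x %*% p0[1:2]), p0[3], log = TRUE)) - sum(log(yp))
all.equal(a, b)
```

Because the two forms differ only by a constant, they have the same maximizer. A negative sdlog is still not a valid parameter value, so this rewrite alone does not constrain the optimizer.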

View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533487.html
Sent from the R help mailing list archive at Nabble.com.

William Dunlap

Thu, Sep 9, 2010 1:34 PM #

You can record all arguments and return values of the
calls that optim(par,fn) makes to fn with a function
like the following.  It takes your function and makes
a new function that returns the same thing but also
records information in its environment.  Thus, after
optim is done you can see its path to the optimum.

  trackFn <- function (fn) 
  {
      X <- NULL
      VALUE <- NULL
      force(fn)
      function(x) {
          X <<- rbind(X, x)
          val <- fn(x)
          VALUE <<- c(VALUE, val)
          val
      }
  }

E.g.,

$par
[1] -1.583466e+00  6.726235e-05  1.558809e+00

$value
[1] 1.612853e-08

$counts
function gradient 
     146       NA 

$convergence
[1] 0

$message
NULL

[1] 146
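
The commands that produced the output above are not shown. As a rough sketch of how trackFn could be used, the following applies it to a hypothetical quadratic objective fn, which is an assumption for illustration and not the function behind the output above. The recorded history is read back from the environment of the wrapped function.

```r
# trackFn as defined above, repeated so the example is self-contained.
trackFn <- function (fn)
{
    X <- NULL
    VALUE <- NULL
    force(fn)
    function(x) {
        X <<- rbind(X, x)
        val <- fn(x)
        VALUE <<- c(VALUE, val)
        val
    }
}

fn  <- function(p) sum((p - c(1, 2))^2)   # hypothetical objective, minimum at (1, 2)
tfn <- trackFn(fn)
op  <- optim(c(0, 0), tfn)

# The history lives in the environment of the wrapper.
hist <- environment(tfn)
nrow(hist$X)         # one row of parameter values per call to fn
length(hist$VALUE)   # one function value per call to fn
```

Plotting hist$VALUE against the call index is one simple way to look at the path the optimizer took.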

You could also record warning messages by including a call
to withCallingHandlers() that stashed the warning in a list.
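
A minimal sketch of that idea follows. The name trackWarnings and the test function f are assumptions for illustration. Each warning is stored together with the parameter values that triggered it and is then muffled so that it is not printed.

```r
# Sketch: wrap a function so that warnings raised inside it are stored
# together with the parameter values that triggered them.
trackWarnings <- function(fn) {
    WARNINGS <- list()
    force(fn)
    function(x, ...) {
        withCallingHandlers(
            fn(x, ...),
            warning = function(w) {
                WARNINGS[[length(WARNINGS) + 1]] <<-
                    list(par = x, message = conditionMessage(w))
                invokeRestart("muffleWarning")   # do not print the warning
            })
    }
}

f  <- function(p) dnorm(0, mean = 0, sd = p, log = TRUE)   # NaN when p < 0
wf <- trackWarnings(f)
wf(1)     # valid sd, no warning
wf(-1)    # NaN; the warning is recorded instead of printed
environment(wf)$WARNINGS
```

Extra arguments such as y and x are passed through `...`, so a wrapper of this shape could be handed to optim() in place of llk.mar.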

Call trackFn again each time you call optim(), so that each run starts with an empty record.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Berend Hasselman

Thu, Sep 9, 2010 11:37 PM #

It is indeed a negative value for sigma that causes the issue.
You can check this by inserting this line

        if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n")

after the line

        mu <- x %*% beta 

in function llk.mar
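
The effect can be reproduced in isolation. The default method of optim, Nelder-Mead, places no bounds on the parameters, so a trial point with a negative last element is possible, and dnorm returns NaN for a negative standard deviation. A small check, with illustrative values only:

```r
# A negative standard deviation makes dnorm() return NaN with a warning.
ok  <- dnorm(0, mean = 0, sd = 1, log = TRUE)
bad <- suppressWarnings(dnorm(0, mean = 0, sd = -1, log = TRUE))
is.finite(ok)
is.nan(bad)
```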

Negative values for sigma can be avoided with the use of a transformation
for sigma, forcing it to be always positive.

Make optim use log(sigma) as parameter and transform this to sigma with
sigma <- exp(parm[l]) in llk.mar.
Like this

# define the log likelihood to be maximized over 
llk.mar <- function(parm,y,x){ 
        # parm is the vector of parameters 
        # the last element is log(sigma)
        # y is the response 
        # x is the design matrix 
        l <- length(parm) 
        beta <- parm[-l] 
        sigma <- exp(parm[l])  # <=== transform
        x <- as.matrix(x) 
        mu <- x %*% beta 
        if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n")
        llk <- sum(dnorm(y, mu, sigma,log=TRUE)) 
        return(llk) 
} 

# initial values 
# use log(sigma) as independent parameter
parm <- c(as.vector(coef(fit)),log(summary(fit)$sigma))

Caveat: transformations often help in situations like this but can lead to
badly scaled problems and are not a universal remedy.
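
For completeness, here is a sketch of the whole fit with the transformed parameter, including the back-transformation of the estimate. The simulated data frame and the factor g are assumptions standing in for da.txt, which is not reproduced here.

```r
# Simulated stand-in for the original data (hypothetical values).
set.seed(42)
n  <- 100
g  <- factor(rep(1:4, each = 25))
da <- data.frame(g = g,
                 yp = exp(1 + 0.5 * as.integer(g) + rnorm(n, sd = 0.4)))
fit <- lm(log(yp) ~ g, data = da)

llk.mar <- function(parm, y, x) {
    l     <- length(parm)
    beta  <- parm[-l]
    sigma <- exp(parm[l])              # parm[l] is log(sigma)
    mu    <- drop(x %*% beta)
    sum(dnorm(y, mu, sigma, log = TRUE))
}

parm <- c(as.vector(coef(fit)), log(summary(fit)$sigma))
y <- log(da$yp)
x <- model.matrix(fit)
op <- optim(parm, llk.mar, y = y, x = x,
            control = list(fnscale = -1, maxit = 100000))

sigma.hat <- exp(op$par[length(op$par)])   # back-transform to sigma
# The ML estimate of sigma divides the residual sum of squares by n,
# not by the residual degrees of freedom used in summary(fit).
sigma.ml <- sqrt(sum(residuals(fit)^2) / n)
c(optim = sigma.hat, closed.form = sigma.ml)
```

The two values should agree closely, and both are slightly smaller than summary(fit)$sigma because of the different divisor.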

/Berend

View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533939.html
Sent from the R help mailing list archive at Nabble.com.

Cristian Montes

Mon, Sep 13, 2010 7:11 AM #

Did you check whether the data in "da" has any NA in the dependent or the independent variables?
Remember that your function llk.mar evaluates dnorm for each pair.  If any of those
pairs has an NA value, your function will return NA at the end (sum(c(NA,1,2,3)) is NA).

I would check that the llk.mar function is fine over the whole domain of your data.  My suggestion
is to add an if right before llk <- sum(dnorm(...)) that tests that vector for NAs.

If you find one, get rid of it before the function returns.  Try that before calling optim, then
let it do the solving.
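
A short sketch of such a check, run on the data before optim is called. The small data frame below is a hypothetical stand-in for "da":

```r
# Hypothetical data frame with one missing response.
da <- data.frame(yp = c(1.2, NA, 3.4, 0.8), ay = c(1, 1, 2, 2))

anyNA(da)                          # TRUE if any value is missing
colSums(is.na(da))                 # number of missing values per column
da.ok <- da[complete.cases(da), ]  # keep only the complete rows

sum(c(NA, 1, 2, 3))                # NA propagates through sum()
```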

Cheers,

Cristián Montes.
