Problem with Newton_Raphson - R-help

Thu, Sep 20, 2012 4:46 AM #

Hello,
I have being trying to estimate the parameters of the?generalized?exponential distribution. The random number generation for the GE distribution is?x<-(-log(1-U^(1/p1))/b), where U stands for uniform dist. The data i have generated to estimate the parameters is right censored and the code is given below; The problem is that, the newton-Raphson approach isnt working and i do not know what is wrong.?Can somebody suggest something or help in identifying what the problem might be.?

p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){

x<-(-log(1-U^(1/p1))/b)

?meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)

? ? z<-function(data, p){?
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(p[1])+ n*sum(s)*log(p[2])+(p[1]-1)*sum(s)*log(1-((exp(-(p[2])*sum(x)))))
-(p[2])*sum(t) + (p[1])*log((exp(-(p[2])*sum(x))))-
(p[1])*sum(s)*log((exp(-(p[2])*sum(x))))
? return(-log1)
? }
}
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz
? m1<-zz$par[2]
? p<-zz$par[1]?


Thank you
Chris Kelvin
INSPEM. UPM

Berend Hasselman

Thu, Sep 20, 2012 5:52 AM #

On 20-09-2012, at 13:46, Christopher Kelvin wrote:

Newton-Raphson? is not a method for optim.

Running your code given above gives an error message:

Error in sum(t) : invalid 'type' (closure) of argument

Where is object 't'?

Why are you defining function z within the rr loop? Only the last definition is given to optim.
Why use p[1] and p[2] explicitly in the calculation of log1 in the body of function z when you can use shape and scale defined in the lines before log1 <-.

Berend

Christopher Kelvin

Thu, Sep 20, 2012 6:17 AM #

By your suggestion if i understand you, i have changed (p[1]) and (p[2]) and have also corrected the error sum(t), but if i run it, the parameters estimates are very large.
Can you please run it and help me out? The code is given below.


p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){

x<-(-log(1-U^(1/p1))/b)

?meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)
}
? ? z<-function(data, p){?
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(shape)+ n*sum(s)*log(scale)+(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x)))))
-(scale)*sum(x) + (shape)*log((exp(-(scale)*sum(x))))-
(shape)*sum(s)*log((exp(-(scale)*sum(x))))
? return(-log1)
? }
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz



Thank you
Chris Kelvin








----- Original Message -----
From: Berend Hasselman <bhh at xs4all.nl>
To: Christopher Kelvin <chris_kelvin2001 at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, September 20, 2012 8:52 PM
Subject: Re: [R] Problem with Newton_Raphson

On 20-09-2012, at 13:46, Christopher Kelvin wrote:

Newton-Raphson? is not a method for optim.

Running your code given above gives an error message:

Error in sum(t) : invalid 'type' (closure) of argument

Where is object 't'?

Why are you defining function z within the rr loop? Only the last definition is given to optim.
Why use p[1] and p[2] explicitly in the calculation of log1 in the body of function z when you can use shape and scale defined in the lines before log1 <-.

Berend

Berend Hasselman

Thu, Sep 20, 2012 7:06 AM #

On 20-09-2012, at 15:17, Christopher Kelvin wrote:

1. You think you are using a (quasi) Newton method. But the default method is "Nelder-Mead". You should specify method explicitly for example method="BFGS". When you do that you will see that the results are completely different. You should also carefully inspect the return value of optim. In your case zz$convergence is 1 which means "indicates that the iteration limit maxit had been reached.".
When you use method="BFGS" you will get zz$ convergence is 0.

This may do what you want

zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
zz

2. when providing examples such as yours it is a good idea to issue set.seed(<number>) at the start of your script to ensure reproducibility.

3. The function z does not calculate what you want: since fully formed expressions are terminated by a newline and since the remainder of the line after log1<- is a complete expression, log1 does include the value of the two following  lines.
See the difference between 

a <- 1
b <- 11
a
-b

and

a-
b

So you could write this

    log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
            (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) + 
            (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))

and you will see rather different results.
You will have to determine if they are what you expect/desire.

A final remark on function z:

- do not calculate things like n*sum(s) repeatedly: doing something like A<-n*sum(s) and reusing A is more efficient.
- same thing for log((exp(-(scale)*sum(x)))) (recalculated four times)
- same thing for sum(x)

See below for a partly rewritten function z and results with method="BFGS".
I have put a set.seed(1) at the start of your code.

good luck,

Berend

z<-function(p,data){ 
    shape<-p[1]
    scale<-p[2]
    log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
            (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) + 
            (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
    return(-log1)
}

start <- c(1,1)

[1] -138.7918

There were 34 warnings (use warnings() to see them)

$par
[1] 1.009614e+11 8.568498e+01

$value
[1] -1.27583e+15

$counts
function gradient 
     270       87 

$convergence
[1] 0

$message
NULL

$hessian
       [,1] [,2]
[1,] -62500    0
[2,]      0    0

Berend Hasselman

Thu, Sep 20, 2012 7:18 AM #

On 20-09-2012, at 16:06, Berend Hasselman wrote:

Oops!
No not four times. Twice.
But (exp(-(scale)*sum(x))) is calculated three times.

Berend

Christopher Kelvin

Thu, Sep 20, 2012 8:31 AM #

Thank you very much for everything. Your suggestions were very?helpful.?

Chris


----- Original Message -----
From: Berend Hasselman <bhh at xs4all.nl>
To: Christopher Kelvin <chris_kelvin2001 at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, September 20, 2012 10:06 PM
Subject: Re: [R] Problem with Newton_Raphson

On 20-09-2012, at 15:17, Christopher Kelvin wrote:

1. You think you are using a (quasi) Newton method. But the default method is "Nelder-Mead". You should specify method explicitly for example method="BFGS". When you do that you will see that the results are completely different. You should also carefully inspect the return value of optim. In your case zz$convergence is 1 which means "indicates that the iteration limit maxit had been reached.".
When you use method="BFGS" you will get zz$ convergence is 0.

This may do what you want

zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
zz

2. when providing examples such as yours it is a good idea to issue set.seed(<number>) at the start of your script to ensure reproducibility.

3. The function z does not calculate what you want: since fully formed expressions are terminated by a newline and since the remainder of the line after log1<- is a complete expression, log1 does include the value of the two following? lines.
See the difference between 

a <- 1
b <- 11
a
-b

and

a-
b

So you could write this

? ? log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
? ? ? ? ? ? (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) + 
? ? ? ? ? ? (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))

and you will see rather different results.
You will have to determine if they are what you expect/desire.

A final remark on function z:

- do not calculate things like n*sum(s) repeatedly: doing something like A<-n*sum(s) and reusing A is more efficient.
- same thing for log((exp(-(scale)*sum(x)))) (recalculated four times)
- same thing for sum(x)

See below for a partly rewritten function z and results with method="BFGS".
I have put a set.seed(1) at the start of your code.

good luck,

Berend

z<-function(p,data){ 
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
? ? ? ? ? ? (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) + 
? ? ? ? ? ? (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
? ? return(-log1)
}

start <- c(1,1)

[1] -138.7918

There were 34 warnings (use warnings() to see them)

$par
[1] 1.009614e+11 8.568498e+01

$value
[1] -1.27583e+15

$counts
function gradient 
? ?  270? ? ?  87 

$convergence
[1] 0

$message
NULL

$hessian
? ? ?  [,1] [,2]
[1,] -62500? ? 0
[2,]? ? ? 0? ? 0