Hello,
I have being trying to estimate the parameters of the?generalized?exponential distribution. The random number generation for the GE distribution is?x<-(-log(1-U^(1/p1))/b), where U stands for uniform dist. The data i have generated to estimate the parameters is right censored and the code is given below; The problem is that, the newton-Raphson approach isnt working and i do not know what is wrong.?Can somebody suggest something or help in identifying what the problem might be.?
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
?meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)
? ? z<-function(data, p){?
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(p[1])+ n*sum(s)*log(p[2])+(p[1]-1)*sum(s)*log(1-((exp(-(p[2])*sum(x)))))
-(p[2])*sum(t) + (p[1])*log((exp(-(p[2])*sum(x))))-
(p[1])*sum(s)*log((exp(-(p[2])*sum(x))))
? return(-log1)
? }
}
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz
? m1<-zz$par[2]
? p<-zz$par[1]?
Thank you
Chris Kelvin
INSPEM. UPM
Problem with Newton_Raphson
6 messages · Berend Hasselman, Christopher Kelvin
On 20-09-2012, at 13:46, Christopher Kelvin wrote:
Hello, I have being trying to estimate the parameters of the generalized exponential distribution. The random number generation for the GE distribution is x<-(-log(1-U^(1/p1))/b), where U stands for uniform dist. The data i have generated to estimate the parameters is right censored and the code is given below; The problem is that, the newton-Raphson approach isnt working and i do not know what is wrong. Can somebody suggest something or help in identifying what the problem might be.
Newton-Raphson? is not a method for optim.
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
meantrue<-gamma(1+(1/p1))*b
meantrue
d<-meantrue/0.30
cen<- runif(n,min=0,max=d)
s<-ifelse(x<=cen,1,0)
q<-c(x,cen)
z<-function(data, p){
shape<-p[1]
scale<-p[2]
log1<-n*sum(s)*log(p[1])+ n*sum(s)*log(p[2])+(p[1]-1)*sum(s)*log(1-((exp(-(p[2])*sum(x)))))
-(p[2])*sum(t) + (p[1])*log((exp(-(p[2])*sum(x))))-
(p[1])*sum(s)*log((exp(-(p[2])*sum(x))))
return(-log1)
}
}
start <- c(1,1)
zz<-optim(start,fn=z,data=q,hessian=T)
zz
m1<-zz$par[2]
p<-zz$par[1]
Running your code given above gives an error message: Error in sum(t) : invalid 'type' (closure) of argument Where is object 't'? Why are you defining function z within the rr loop? Only the last definition is given to optim. Why use p[1] and p[2] explicitly in the calculation of log1 in the body of function z when you can use shape and scale defined in the lines before log1 <-. Berend
By your suggestion if i understand you, i have changed (p[1]) and (p[2]) and have also corrected the error sum(t), but if i run it, the parameters estimates are very large.
Can you please run it and help me out? The code is given below.
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
?meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)
}
? ? z<-function(data, p){?
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(shape)+ n*sum(s)*log(scale)+(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x)))))
-(scale)*sum(x) + (shape)*log((exp(-(scale)*sum(x))))-
(shape)*sum(s)*log((exp(-(scale)*sum(x))))
? return(-log1)
? }
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz
Thank you
Chris Kelvin
----- Original Message -----
From: Berend Hasselman <bhh at xs4all.nl>
To: Christopher Kelvin <chris_kelvin2001 at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, September 20, 2012 8:52 PM
Subject: Re: [R] Problem with Newton_Raphson
On 20-09-2012, at 13:46, Christopher Kelvin wrote:
Hello, I have being trying to estimate the parameters of the generalized exponential distribution. The random number generation for the GE distribution is x<-(-log(1-U^(1/p1))/b), where U stands for uniform dist. The data i have generated to estimate the parameters is right censored and the code is given below; The problem is that, the newton-Raphson approach isnt working and i do not know what is wrong. Can somebody suggest something or help in identifying what the problem might be.
Newton-Raphson? is not a method for optim.
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
? meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)
? ? z<-function(data, p){
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(p[1])+ n*sum(s)*log(p[2])+(p[1]-1)*sum(s)*log(1-((exp(-(p[2])*sum(x)))))
-(p[2])*sum(t) + (p[1])*log((exp(-(p[2])*sum(x))))-
(p[1])*sum(s)*log((exp(-(p[2])*sum(x))))
? return(-log1)
? }
}
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz
? m1<-zz$par[2]
? p<-zz$par[1]
Running your code given above gives an error message: Error in sum(t) : invalid 'type' (closure) of argument Where is object 't'? Why are you defining function z within the rr loop? Only the last definition is given to optim. Why use p[1] and p[2] explicitly in the calculation of log1 in the body of function z when you can use shape and scale defined in the lines before log1 <-. Berend
On 20-09-2012, at 15:17, Christopher Kelvin wrote:
By your suggestion if i understand you, i have changed (p[1]) and (p[2]) and have also corrected the error sum(t), but if i run it, the parameters estimates are very large.
Can you please run it and help me out? The code is given below.
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
meantrue<-gamma(1+(1/p1))*b
meantrue
d<-meantrue/0.30
cen<- runif(n,min=0,max=d)
s<-ifelse(x<=cen,1,0)
q<-c(x,cen)
}
z<-function(data, p){
shape<-p[1]
scale<-p[2]
log1<-n*sum(s)*log(shape)+ n*sum(s)*log(scale)+(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x)))))
-(scale)*sum(x) + (shape)*log((exp(-(scale)*sum(x))))-
(shape)*sum(s)*log((exp(-(scale)*sum(x))))
return(-log1)
}
start <- c(1,1)
zz<-optim(start,fn=z,data=q,hessian=T)
zz
1. You think you are using a (quasi) Newton method. But the default method is "Nelder-Mead". You should specify method explicitly for example method="BFGS". When you do that you will see that the results are completely different. You should also carefully inspect the return value of optim. In your case zz$convergence is 1 which means "indicates that the iteration limit maxit had been reached.".
When you use method="BFGS" you will get zz$ convergence is 0.
This may do what you want
zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
zz
2. when providing examples such as yours it is a good idea to issue set.seed(<number>) at the start of your script to ensure reproducibility.
3. The function z does not calculate what you want: since fully formed expressions are terminated by a newline and since the remainder of the line after log1<- is a complete expression, log1 does include the value of the two following lines.
See the difference between
a <- 1
b <- 11
a
-b
and
a-
b
So you could write this
log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) +
(shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
and you will see rather different results.
You will have to determine if they are what you expect/desire.
A final remark on function z:
- do not calculate things like n*sum(s) repeatedly: doing something like A<-n*sum(s) and reusing A is more efficient.
- same thing for log((exp(-(scale)*sum(x)))) (recalculated four times)
- same thing for sum(x)
See below for a partly rewritten function z and results with method="BFGS".
I have put a set.seed(1) at the start of your code.
good luck,
Berend
z<-function(p,data){
shape<-p[1]
scale<-p[2]
log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) +
(shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
return(-log1)
}
start <- c(1,1)
z(start, data=q)
[1] -138.7918
zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
There were 34 warnings (use warnings() to see them)
zz
$par
[1] 1.009614e+11 8.568498e+01
$value
[1] -1.27583e+15
$counts
function gradient
270 87
$convergence
[1] 0
$message
NULL
$hessian
[,1] [,2]
[1,] -62500 0
[2,] 0 0
On 20-09-2012, at 16:06, Berend Hasselman wrote:
On 20-09-2012, at 15:17, Christopher Kelvin wrote: ..... A final remark on function z: - do not calculate things like n*sum(s) repeatedly: doing something like A<-n*sum(s) and reusing A is more efficient. - same thing for log((exp(-(scale)*sum(x)))) (recalculated four times) - same thing for sum(x)
Oops! No not four times. Twice. But (exp(-(scale)*sum(x))) is calculated three times. Berend
Thank you very much for everything. Your suggestions were very?helpful.? Chris ----- Original Message ----- From: Berend Hasselman <bhh at xs4all.nl> To: Christopher Kelvin <chris_kelvin2001 at yahoo.com> Cc: "r-help at r-project.org" <r-help at r-project.org> Sent: Thursday, September 20, 2012 10:06 PM Subject: Re: [R] Problem with Newton_Raphson
On 20-09-2012, at 15:17, Christopher Kelvin wrote:
By your suggestion if i understand you, i have changed (p[1]) and (p[2]) and have also corrected the error sum(t), but if i run it, the parameters estimates are very large.
Can you please run it and help me out? The code is given below.
p1<-0.6;b<-2
n=20;rr=5000
U<-runif(n,0,1)
for (i in 1:rr){
x<-(-log(1-U^(1/p1))/b)
? meantrue<-gamma(1+(1/p1))*b
? meantrue
? d<-meantrue/0.30
? cen<- runif(n,min=0,max=d)
? s<-ifelse(x<=cen,1,0)
? q<-c(x,cen)
}
? ? z<-function(data, p){
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1<-n*sum(s)*log(shape)+ n*sum(s)*log(scale)+(shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x)))))
-(scale)*sum(x) + (shape)*log((exp(-(scale)*sum(x))))-
(shape)*sum(s)*log((exp(-(scale)*sum(x))))
? return(-log1)
? }
? start <- c(1,1)
? zz<-optim(start,fn=z,data=q,hessian=T)
? zz
1. You think you are using a (quasi) Newton method. But the default method is "Nelder-Mead". You should specify method explicitly for example method="BFGS". When you do that you will see that the results are completely different. You should also carefully inspect the return value of optim. In your case zz$convergence is 1 which means "indicates that the iteration limit maxit had been reached.".
When you use method="BFGS" you will get zz$ convergence is 0.
This may do what you want
zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
zz
2. when providing examples such as yours it is a good idea to issue set.seed(<number>) at the start of your script to ensure reproducibility.
3. The function z does not calculate what you want: since fully formed expressions are terminated by a newline and since the remainder of the line after log1<- is a complete expression, log1 does include the value of the two following? lines.
See the difference between
a <- 1
b <- 11
a
-b
and
a-
b
So you could write this
? ? log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
? ? ? ? ? ? (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) +
? ? ? ? ? ? (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
and you will see rather different results.
You will have to determine if they are what you expect/desire.
A final remark on function z:
- do not calculate things like n*sum(s) repeatedly: doing something like A<-n*sum(s) and reusing A is more efficient.
- same thing for log((exp(-(scale)*sum(x)))) (recalculated four times)
- same thing for sum(x)
See below for a partly rewritten function z and results with method="BFGS".
I have put a set.seed(1) at the start of your code.
good luck,
Berend
z<-function(p,data){
? ? shape<-p[1]
? ? scale<-p[2]
? ? log1 <- n*sum(s)*log(shape)+ n*sum(s)*log(scale)+
? ? ? ? ? ? (shape-1)*sum(s)*log(1-((exp(-(scale)*sum(x))))) -(scale)*sum(x) +
? ? ? ? ? ? (shape)*log((exp(-(scale)*sum(x)))) - (shape)*sum(s)*log((exp(-(scale)*sum(x))))
? ? return(-log1)
}
start <- c(1,1)
z(start, data=q)
[1] -138.7918
zz<-optim(start, fn=z, data=q, method="BFGS", hessian=T)
There were 34 warnings (use warnings() to see them)
zz
$par [1] 1.009614e+11 8.568498e+01 $value [1] -1.27583e+15 $counts function gradient ? ? 270? ? ? 87 $convergence [1] 0 $message NULL $hessian ? ? ? [,1] [,2] [1,] -62500? ? 0 [2,]? ? ? 0? ? 0