
Spline GARCH

6 messages · Paul Gilbert, Alexios Ghalanos, Bastian Offermann

#
Hi all,

I am currently implementing the Engle & Rangel (2008) Spline GARCH 
model. I use the nlminb optimizer, which unfortunately does not provide 
a hessian for obtaining the standard errors of the coefficients. I can 
work around this using the 'hessian' function in numDeriv, but usually 
get NaN values for the omega parameter.

Can anybody recommend additional optimizers that directly return a 
hessian? How sensitive are the coefficients to the initial starting values?

Thanks in advance!
#
On 02/07/2014 08:19 AM, Bastian Offermann wrote:
Do you know why this happens, or can you provide a simple example? A 
NaN value from hessian() is often because the function fails to evaluate
in a small neighbourhood of the point where it is being calculated, that 
is, at your parameter estimate. Are you on the boundary of the feasible 
region?
A hessian returned by an optimizer is usually one that is built up by 
some approximation during the optimization process. One of the original 
purposes of hessian() was to try to do something that is usually better 
than that, specifically because you want a good approximation if you are 
going to use it to calculate standard errors. (And, of course, you want 
the conditions to hold for the hessian to be an approximation of the 
variance.)  Just because an optimizer returns something for the hessian, 
it is not clear that you would want to use it to calculate standard
errors. The purpose of the hessian built up by an optimizer is to speed 
the optimization, not necessarily to provide a good approximation to the 
hessian.  In the case where hessian() is returning NaNs I would be 
concerned that anything returned by an optimizer could be simply bogus.
This depends on a number of things, the optimizer you use being one of 
them. Most optimizers have some mechanism to specify something different 
from the default for the stopping criteria and you can, for a problem 
without local optimum issues (e.g. convex level sets), reduce 
sensitivity to the starting value by tightening the stopping criteria. 
The more serious problem is when you have local optimum issues. Then you 
will get false convergence and thus extreme sensitivity to starting 
values. Even for a parameter space that is generally good, there are 
often parameter values for which the optimization is a bit sensitive. 
And, of course, all this also depends on your dataset. Generally, the 
sensitivity will increase with short datasets.

The previous paragraph is about the coefficient estimate. At the same 
coefficient estimate hessian() will return the same thing, but a hessian 
built up by an optimizer will depend on the path, and generally needs a 
fairly large number of final steps in the vicinity of the optimum to 
give a good approximation. Thus, somewhat counterintuitively, if you do
an optimization starting with values for the coefficients that are very 
close to the optimum you will get quick convergence but often a bad 
hessian approximation from the optimizer.
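
To make the distinction concrete, here is a minimal sketch on a toy normal likelihood (the data and objective below are invented for illustration, not the GARCH model): optim() with hessian = TRUE returns the finite-difference hessian the optimizer builds, while numDeriv's hessian() recomputes a careful approximation at the same point, and its inverse gives the usual standard errors.

```r
# Toy example: optimizer-supplied hessian vs. numDeriv::hessian().
# 'x' and 'nll' are made up purely for illustration.
library(numDeriv)
x   <- c(4.1, 5.2, 3.9, 4.8, 5.0, 4.4)
nll <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))
fit <- optim(c(0, 0), nll, hessian = TRUE)  # hessian built up by the optimizer
H1  <- fit$hessian
H2  <- hessian(nll, fit$par)                # Richardson approximation at the optimum
se  <- sqrt(diag(solve(H2)))                # hessian-based standard errors
```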

Paul
#
You might also want to consider the issue of scaling when it comes to the intercepts (mean or variance), as well as your positivity and stationarity conditions/constraints.

-Alexios
#
Thanks a lot, Paul. Here is some reproducible code. Do you know of any 
good reference on how to build up these models from scratch, e.g. 
calculating standard errors of estimates coming from a numerical 
optimizer, information criteria, etc.?



library('BB')
library('alabama')
library('nloptr')
library('numDeriv')    # provides hessian()

### Spline GARCH ###

r = rnorm(13551, 0.0004, 0.016)    # pseudo returns; 13551 is the number of obs. of the real market data the model is applied to, inputs arbitrary


### Specifying time trend


k = 3                                        # number of knots
bounds = floor(1:k * 13551/k)                # partition of time horizon
bounds = c(0, bounds[1:(k-1)])
time.lin = 0:13550                           # linear time trend
time.nonlin <- matrix(rep(time.lin, k), length(time.lin), k)  # quadratic time trend

for(i in 1:k) {
  time.nonlin[,i] <- time.nonlin[,i] - bounds[i]    # shift by knot location
  time.nonlin[which(time.nonlin[,i] < 0), i] <- 0   # zero out before the knot
  time.nonlin[, i] <- time.nonlin[, i]^2            # square for the quadratic spline
}

time.trend = cbind(time.lin, time.nonlin)

for(i in 1:dim(time.trend)[2])
  time.trend[,i] <- time.trend[,i]/time.trend[dim(time.trend)[1], i]  # normalize each column to [0, 1]

head(time.trend)
tail(time.trend)




### Spline function

splgarch <- function(para) {
  mu    <- para[1]
  omega <- para[2]
  alpha <- para[3]
  beta  <- para[4]
  cc    <- para[5]
  w     <- para[6:(k+6)]
  Tau <- cc * exp( apply(t(diag(w) %*% t(time.trend)), 1, sum) )
  e2  <- (r-mu)^2
  e2t <- omega + alpha * c(mean(e2), e2[-length(r)]) / Tau^2
  s2  <- filter(e2t, beta, "recursive", init = mean(e2))
  sig2 <- s2 * Tau
  0.5*sum( log(2*pi) + log(sig2) + e2/sig2 )  # negative log-likelihood
}



### Spline parameter initialization
mu <- mean(r)
small <- 1e-6
alpha <- 0.1
beta <- 0.8
omega <- (1-alpha-beta)
para <- c(mu, omega, alpha, beta, 1, rep(small, length(4:(k+4))))

lo <- c(-10*abs(mu), small, small, small, rep(-10, length(3:(k+4))))
hi <- c(10*abs(mu), 100*abs(mu), 1-small, 1-small, rep(10, length(3:(k+4))))



### Spline optimization

fit <- nlminb(start = para, objective = splgarch, lower = lo, upper = hi,
              control = list(x.tol = 1e-8, trace = 0))
names(fit$par) <- c('alpha', 'beta', 'c', paste('w', sep = '', 0:k))
round(fit$par, 6)

fit.hessian = hessian(splgarch, fit$par, method="complex")
fit.hessian
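
On the standard-errors / information-criteria part of the question: when the objective returns the negative log-likelihood (as splgarch does), one common recipe is to invert the numerical hessian for standard errors and to combine the minimized objective with the parameter count for AIC/BIC. A self-contained sketch on a toy normal model (not the GARCH fit itself; all names here are illustrative):

```r
# Toy model: standard errors and information criteria from a numerically
# minimized negative log-likelihood. Stand-in for the spline GARCH fit.
library(numDeriv)
set.seed(1)
y    <- rnorm(500, 0.0004, 0.016)
nll  <- function(p) -sum(dnorm(y, mean = p[1], sd = exp(p[2]), log = TRUE))
fit  <- nlminb(c(0, log(sd(y))), nll)
H    <- hessian(nll, fit$par)
se   <- sqrt(diag(solve(H)))                 # inverse-hessian standard errors
tval <- fit$par / se                         # t-statistics
npar <- length(fit$par); n <- length(y)
aic  <- 2 * fit$objective + 2 * npar         # AIC = -2*logLik + 2*k
bic  <- 2 * fit$objective + log(n) * npar    # BIC = -2*logLik + log(n)*k
```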

R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods base

other attached packages:
[1] nloptr_0.9.6      alabama_2011.9-1  numDeriv_2012.9-1 BB_2014.1-1



On 2/7/2014 10:28 AM, Paul Gilbert wrote:
#
There was an error in the previous code, this should be correct...



library('BB')
library('alabama')
library('nloptr')
library('numDeriv')    # provides hessian()

r = rnorm(13551, 0.0004, 0.016)     # pseudo returns


### Spline GARCH ###

### Specifying time trend


k = 3                                        # number of knots
bounds = floor(1:k * 13551/k)                # partition of time horizon
bounds = c(0, bounds[1:(k-1)])
time.lin = 0:13550                           # linear time trend
time.nonlin <- matrix(rep(time.lin, k), length(time.lin), k)  # quadratic time trend

for(i in 1:k) {
  time.nonlin[,i] <- time.nonlin[,i] - bounds[i]    # shift by knot location
  time.nonlin[which(time.nonlin[,i] < 0), i] <- 0   # zero out before the knot
  time.nonlin[, i] <- time.nonlin[, i]^2            # square for the quadratic spline
}

time.trend = cbind(time.lin, time.nonlin)

for(i in 1:dim(time.trend)[2])
  time.trend[,i] <- time.trend[,i]/time.trend[dim(time.trend)[1], i]  # normalize each column to [0, 1]

head(time.trend)
tail(time.trend)


### Spline function

splgarch <- function(para) {
  mu    <- para[1]
  omega <- para[2]
  alpha <- para[3]
  beta  <- para[4]
  cc    <- para[5]
  w     <- para[6:(k+6)]
  Tau <- cc * exp( apply(t(diag(w) %*% t(time.trend)), 1, sum) )
  e2  <- (r-mu)^2
  e2t <- omega + alpha * c(mean(e2), e2[-length(r)]) / Tau^2
  s2  <- filter(e2t, beta, "recursive", init = mean(e2))
  sig2 <- s2 * Tau
  0.5*sum( log(2*pi) + log(sig2) + e2/sig2 )  # negative log-likelihood
}



### Spline parameter initialization
mu <- mean(r)
small <- 1e-6
alpha <- 0.1
beta <- 0.8
omega <- (1-alpha-beta)
para <- c(mu, omega, alpha, beta, 1, rep(small, length(4:(k+4))))

lo <- c(-10*abs(mu), small, small, small, rep(-10, length(3:(k+4))))
hi <- c(10*abs(mu), 100*abs(mu), 1-small, 1-small, rep(10, length(3:(k+4))))


### Spline optimization

fit <- nlminb(start = para, objective = splgarch, lower = lo, upper = hi,
              hessian = TRUE, control = list(x.tol = 1e-8, trace = 0))
names(fit$par) <- c('mu', 'omega', 'alpha', 'beta', 'c', paste('w', sep = '', 0:k))
round(fit$par, 6)

fit.hessian = hessian(splgarch, fit$par, method="complex")
fit.hessian



On 2/7/2014 10:28 AM, Paul Gilbert wrote:
#
Sorry for the delay, I'm travelling and not always on my email.
(See below.)
On 02/07/2014 02:40 PM, Bastian Offermann wrote:
This gives (at least with my working copy)

Error in filter(e2t, beta, "recursive", init = mean(e2)) : invalid input
In addition: Warning messages:
1: In filter(e2t, beta, "recursive", init = mean(e2)) :
   imaginary parts discarded in coercion
2: In filter(e2t, beta, "recursive", init = mean(e2)) :
   imaginary parts discarded in coercion
Error in grad.default(func = fn, x = x, method = "complex", method.args 
= list(eps = .Machine$double.eps),  :
   function does not accept complex argument as required by method 
'complex'.
Error: object 'fit.hessian' not found

The "complex" method is truly powerful in cases where it can be used, 
but the requirement on the function (complex analytic, even though you 
may be only interested in it as a real valued function with real 
arguments) is non-trivial. I clearly need to beef up the warnings in the 
documentation. They are there, but perhaps you did not follow the 
suggestion to see ?grad, and even there I may be assuming people might 
actually look at the reference. I will work on the documentation. Your
function does not even handle a complex argument, so it is definitely 
not suitable for this method.
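
The distinction is easy to check on a function that is complex analytic: exp() accepts complex arguments, so the complex-step method applies to it, unlike the filter()-based likelihood above.

```r
# exp() is complex analytic, so both methods work and agree closely.
library(numDeriv)
g.complex    <- grad(exp, 1, method = "complex")     # complex-step derivative
g.richardson <- grad(exp, 1, method = "Richardson")  # finite-difference alternative
# both should be very close to exp(1)
```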

Try

fit.hessian = hessian(splgarch, fit$par, method="Richardson")

fit.hessian
                [,1]          [,2]         [,3]          [,4]         [,5]
  [1,] 54159760.8324     -391558.2 3.018716e+05     -4467.786 1.101539e+02
  [2,]  -391558.2296 90404150206.1 2.220223e+09 224552735.372 2.183666e+08
  [3,]   301871.5678  2220223211.2 6.070358e+07   5516188.460 5.352351e+06
  [4,]    -4467.7858   224552735.4 5.516188e+06    558096.767 5.425405e+05
  [5,]      110.1539   218366571.2 5.352351e+06    542540.452 5.277932e+05
  [6,]    -1624.6496    12373678.7 2.983966e+05     30769.665 2.989512e+04
  [7,]    -2190.8683     8249523.2 1.948851e+05     20514.137 1.993109e+04
  [8,]    -2569.8729     5499872.3 1.264006e+05     13676.544 1.328788e+04
  [9,]    -2991.0444     2750144.2 6.090795e+04      6838.777 6.644464e+03
                [,6]         [,7]         [,8]         [,9]
  [1,]    -1624.6496   -2190.8683   -2569.8729   -2991.0444
  [2,] 12373678.7285 8249523.2423 5499872.3398 2750144.2107
  [3,]   298396.5695  194885.1344  126400.6003   60907.9503
  [4,]    30769.6650   20514.1374   13676.5444    6838.7768
  [5,]    29895.1161   19931.0875   13287.8823    6644.4640
  [6,]     2258.4950    1693.8479    1254.6936     690.5527
  [7,]     1693.8479    1355.1300    1070.7734     636.5426
  [8,]     1254.6936    1070.7734     903.6127     584.6161
  [9,]      690.5527     636.5426     584.6161     453.8935

I think this may be what you are looking for, but beware that your 
alpha parameter is on the boundary. The fact that hessian() works on the 
boundary suggests that the function actually evaluates beyond the 
boundary without returning an error, but you need to be sure the 
function is not doing something strange outside the boundary. I'm not 
familiar with how to interpret the hessian as an approximation of the 
variance at a boundary point. Do you think of the se as symmetric in 
this case?
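
One practical check before trusting inverse-hessian standard errors, especially near a boundary, is positive definiteness. A sketch (the small matrix here is a hypothetical stand-in for fit.hessian):

```r
# Sanity check before inverting a numerical hessian for standard errors.
# 'H' is a made-up stand-in for fit.hessian.
H  <- matrix(c(4, 1, 1, 3), 2, 2)
ev <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
if (all(ev > 0)) {
  se <- sqrt(diag(solve(H)))   # safe: hessian is positive definite
} else {
  warning("hessian not positive definite; standard errors unreliable")
}
```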

Paul