
strucchange Nyblom-Hansen Test?

6 messages · buehlerman, Achim Zeileis

#
I want to apply the Nyblom-Hansen test with the strucchange package, but I
don't know what the correct way is, or what the difference is between the
following two approaches (leading to different results):


library("strucchange")
data("longley")

# 1. Approach:
sctest(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley,
type = "Nyblom-Hansen")

#results in:
#        Score-based CUSUM test with mean L2 norm
#
#data:  Employed ~ Year + GNP.deflator + GNP + Armed.Forces 
#f(efp) = 0.8916, p-value = 0.4395

#2. Approach:
sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley), functional = meanL2BB)

#results in:
#        M-fluctuation test
#
#data:  gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces,
#       data = longley)
#f(efp) = 0.8165, p-value = 0.3924


I could not find any examples or further remarks on the first approach with
sctest(..., type = "Nyblom-Hansen").
Maybe the first approach, unlike the second, is not a joint test for all
coefficients?

Thank you in advance for your help!

--
View this message in context: http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3887208.html
Sent from the R help mailing list archive at Nabble.com.
#
On Sun, 9 Oct 2011, buehlerman wrote:

            
The difference is that sctest(formula, type = "Nyblom-Hansen") applies the 
Nyblom-Hansen test statistic to a model which assesses both coefficients 
_and_ error variance.

The approach via functional = meanL2BB, on the other hand, allows you to
apply the same type of test statistic to the score functions of any model. In 
your case, where you used the default fit = glm in gefp(), a linear 
regression model is used where the error variance is _not_ included as a 
full model parameter but only as a nuisance parameter. Hence, the 
difference.

Of course, one may also add another score function for the error variance. 
In example("DJIA", package = "strucchange") I provide a function normlm() 
with a corresponding estfun() method. If you load these, you can do:

R> sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = 
longley, fit = normlm), functional = meanL2BB)

         M-fluctuation test

data:  gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley,
       fit = normlm)
f(efp) = 0.8916, p-value = 0.4395

which leads to the same output as sctest(formula, type = "Nyblom-Hansen").
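
For readers without the example code at hand, here is a minimal sketch of 
what such a normlm() wrapper and its estfun() method might look like. It is 
an assumption modeled on the description above (an lm fit plus one extra 
score column for the error variance), not necessarily the exact code shipped 
with example("DJIA"); the column name "(Variance)" is purely illustrative.

```r
library("strucchange")  # also attaches sandwich, which provides estfun()

## Hypothetical wrapper: fit by lm(), but tag the result with an extra
## class so that a dedicated estfun() method is dispatched.
normlm <- function(formula, data = list()) {
  rval <- lm(formula, data = data)
  class(rval) <- c("normlm", "lm")
  rval
}

## Scores for the coefficients (taken from the lm method) plus one extra
## column for the error variance, based on the ML estimate mean(res^2).
estfun.normlm <- function(x, ...) {
  res <- residuals(x)
  ef <- NextMethod()
  sigma2 <- mean(res^2)
  rval <- cbind(ef, res^2 - sigma2)
  colnames(rval) <- c(colnames(ef), "(Variance)")
  rval
}

data("longley")
fm <- normlm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces,
  data = longley)

## one column per coefficient, plus one for the variance
colnames(estfun(fm))
```

With these two definitions in the workspace, gefp(..., fit = normlm) picks up 
the extended score matrix, so the variance is assessed along with the 
coefficients.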

Finally, instead of using gefp(..., fit = normlm), you could have also 
used efp(..., type = "Score-CUSUM"):

R> sctest(efp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = 
longley, type = "Score-CUSUM"), functional = "meanL2")

         Score-based CUSUM test with mean L2 norm

data:  efp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = 
longley,      type = "Score-CUSUM")
f(efp) = 0.8916, p-value = 0.4395

I hope that this clarifies more than adding to the confusion ;-)

The reason for the various approaches is that efp() was always confined to 
the linear model and gefp() then extended it to arbitrary estimating 
function-based models. And for the linear model this provides the option 
of treating the variance either as a nuisance parameter or as a full model 
parameter.

Hope that helps,
Z
5 days later
#
Thanks a lot for your immediate help and detailed explanation!

There is one thing I'm still not quite clear about:

When gefp() is used with a plain linear model fit (the default fit = glm, or
here fit = lm):
sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley, fit = lm), functional = meanL2BB)

is this then the original Nyblom's Parameter Stability Test (1989) or is it
the joint (Nyblom-)Hansen test with theta = (beta) constant instead of theta
= (beta, sigma^2) constant ?

#
On Fri, 14 Oct 2011, buehlerman wrote:

            
To understand what exactly is tested, it may help to decompose this into 
the steps (1) model fitting, (2) process setup, (3) test statistic 
computation.

m <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces,
   data = longley)
p <- gefp(m, fit = NULL)
sctest(p, functional = meanL2BB)

To see which parameters are assessed you can look at

estfun(m)

or the corresponding fluctuation process

plot(p, aggregate = FALSE)

both of which have one column per parameter. The visualization for the 
Nyblom-Hansen test statistic

plot(p, functional = meanL2BB)

does not show the individual processes because you first aggregate across 
parameters using the squared Euclidean norm (before taking its mean).

The original Nyblom suggestion was for ML estimation of a distribution 
with one parameter (without covariates). Hansen extended this to the 
linear regression model and proposed to either compute one test statistic 
per parameter (which you can do with the "parm" argument of gefp) or a 
joint statistic for all parameters. Hansen included in "all" parameters 
also the variance, but the idea can be directly modified to any other 
model with a vector of parameters.

The reason that estfun(lm_object) does not include the variance is that 
coef(lm_object) and vcov(lm_object) etc. also do not include it. So it is 
a conceptual question whether lm() computes the OLS estimator or the 
full ML estimator. The normlm() wrapper on the ?DJIA man page is one 
possibility to switch to the ML view for the purposes of using 
"strucchange".

hth,
Z
11 days later
#
Thank you, things seem to be clearer :-)
The "parm" argument of gefp is a nice feature, but what about the
significance level in the test statistic computation (sctest)? Is a multiple
testing correction applied, or should I rather use the double max statistic
in this case, as recommended below?

An excerpt from page 5 of the paper "A Unified Approach to Structural Change
Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):

Hansen (1992) suggests to compute this statistic for the full process efp(t)
to test all coefficients simultaneously and also for each component of the
process (efp(t))j (denoting the j-th component of the process efp(t),
j = 1, . . . , k) individually to assess which parameter causes the
instability. *Note, that this approach leads to a violation of the
significance level of the procedure if no multiple testing correction is
applied.* This can be avoided if a functional is applied to the empirical
fluctuation process which aggregates over time first yielding k independent
test statistics (see Zeileis and Hornik 2003, for more details).

#
On Wed, 26 Oct 2011, buehlerman wrote:

            
Great.
By applying the functional in sctest(), you implicitly correct for the 
number of parameters tested. Thus, you don't need to apply another 
correction for multiple testing. (The only caveat with the p-values from 
sctest() is that these are always asymptotic p-values and may not be exact 
in finite samples. And for many functionals these have been determined by 
simulation.)

This is discussed in a little bit more detail in

      Zeileis A. (2006), Implementing a Class of Structural Change
      Tests: An Econometric Computing Approach. _Computational
      Statistics & Data Analysis_, *50*, 2987-3008.
      doi:10.1016/j.csda.2005.07.001.

The comment you quoted pertains to the fact that Hansen (1992) suggested 
computing one p-value for each individual parameter as well as another 
p-value for all parameters jointly. In such a situation, you would have to 
apply some multiple testing procedure. The meanL2BB functional in 
strucchange only computes the joint p-value.
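
If one did want per-parameter p-values in the spirit of Hansen (1992), a 
rough sketch could look as follows. It assumes that the "parm" argument of 
gefp() accepts a single column index, and the Holm adjustment via p.adjust() 
is only an illustrative choice of multiple testing procedure, not something 
prescribed by strucchange.

```r
library("strucchange")
data("longley")

m <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces,
  data = longley)
parms <- names(coef(m))

## One meanL2BB test per coefficient, selecting a single component of
## the fluctuation process via "parm" (assumed to take a column index).
pvals <- sapply(seq_along(parms), function(j) {
  sctest(gefp(m, fit = NULL, parm = j), functional = meanL2BB)$p.value
})
names(pvals) <- parms

## Unadjusted per-parameter p-values ...
pvals

## ... and a Holm adjustment across the individual tests (illustrative)
p.adjust(pvals, method = "holm")
```

The joint test from sctest(gefp(m, fit = NULL), functional = meanL2BB) needs 
no such adjustment, because it yields a single p-value for all parameters.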

hth,
Z