I want to apply the Nyblom-Hansen test with the strucchange package, but I
don't know which is the correct way, or what the difference is between the
following two approaches (leading to different results):
data("longley")
# 1. Approach:
sctest(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley,
type = "Nyblom-Hansen")
#results in:
# Score-based CUSUM test with mean L2 norm
#
#data: Employed ~ Year + GNP.deflator + GNP + Armed.Forces
#f(efp) = 0.8916, p-value = 0.4395
#2. Approach:
sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley), functional = meanL2BB)
#results in:
# M-fluctuation test
#
#data: gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley)
#f(efp) = 0.8165, p-value = 0.3924
I could not find any examples or further remarks on the first approach with
sctest(..., type = "Nyblom-Hansen").
Is the first approach, unlike the second, not a joint test for all
coefficients?
Thank you in advance for your help!
--
View this message in context: http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3887208.html
Sent from the R help mailing list archive at Nabble.com.
strucchange Nyblom-Hansen Test?
6 messages · buehlerman, Achim Zeileis
On Sun, 9 Oct 2011, buehlerman wrote:
I want to apply the Nyblom-Hansen test with the strucchange package, but I don't know which is the correct way, or what the difference is between the following two approaches (leading to different results):
The difference is that sctest(formula, type = "Nyblom-Hansen") applies the
Nyblom-Hansen test statistic to a model which assesses both coefficients
_and_ error variance.
The approach via functional = meanL2BB, on the other hand, makes it possible
to apply the same type of test statistic to the score functions of any model.
In your case, where you used the default fit = glm in gefp(), a linear
regression model is used in which the error variance is _not_ included as a
full model parameter but only as a nuisance parameter. Hence the
difference.
Of course, one may also add another score function for the error variance.
In example("DJIA", package = "strucchange") I provide a function normlm()
with a corresponding estfun() method. If you load these, you can do:
R> sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley, fit = normlm), functional = meanL2BB)
M-fluctuation test
data: gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley,
fit = normlm)
f(efp) = 0.8916, p-value = 0.4395
which leads to the same output as sctest(formula, type = "Nyblom-Hansen").
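To make the idea concrete, the variance-augmented score matrix that a normlm()-style wrapper provides can be sketched in base R. This is a simplified illustration of the principle, not the actual code shipped with the DJIA example; the function name ml_scores() is mine:

```r
## Sketch (assumption: NOT the actual normlm()/estfun() code from
## example("DJIA", package = "strucchange")): augment the usual lm()
## coefficient scores with a score column for the error variance,
## thereby treating sigma^2 as a full ML parameter.
data("longley")
m <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley)

ml_scores <- function(object) {
  X <- model.matrix(object)
  res <- residuals(object)
  sigma2 <- mean(res^2)                  # ML estimate of the error variance
  cbind(res * X,                         # coefficient scores (OLS normal equations)
        "(Variance)" = res^2 - sigma2)   # score contribution for sigma^2
}

## Scores evaluated at the estimates sum to zero column-wise:
colSums(ml_scores(m))
```

Feeding such an augmented score matrix into the fluctuation process is what makes the gefp()-based test match sctest(formula, type = "Nyblom-Hansen"), which assesses the variance as well.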
Finally, instead of using gefp(..., fit = normlm), you could have also
used efp(..., type = "Score-CUSUM"):
R> sctest(efp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley, type = "Score-CUSUM"), functional = "meanL2")
Score-based CUSUM test with mean L2 norm
data: efp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data =
longley, type = "Score-CUSUM")
f(efp) = 0.8916, p-value = 0.4395
I hope that this clarifies more than adding to the confusion ;-)
The reason for the various approaches is that efp() was always confined to
the linear model and gefp() then extended it to arbitrary estimating
function-based models. And for the linear model this provides the option
of treating the variance as a nuisance parameter or as a full model
parameter.
Hope that helps,
Z
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Thanks a lot for your immediate help and detailed explanation! About one thing I'm not quite clear: when the default fit = glm in gefp() is used,
sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley, fit = lm), functional = meanL2BB)
is this then Nyblom's original parameter stability test (1989), or is it the joint (Nyblom-)Hansen test with theta = (beta) constant instead of theta = (beta, sigma^2) constant?
On Fri, 14 Oct 2011, buehlerman wrote:
Thanks a lot for your immediate help and detailed explanation! About one thing I'm not quite clear: when the default fit = glm in gefp() is used,
sctest(gefp(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley, fit = lm), functional = meanL2BB)
To understand what exactly is tested, it may help to decompose this into the steps (1) model fitting, (2) process setup, (3) test statistic computation:
m <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley)
p <- gefp(m, fit = NULL)
sctest(p, functional = meanL2BB)
To see which parameters are assessed, you can look at estfun(m) or the corresponding fluctuation process plot(p, aggregate = FALSE), both of which have one column per parameter. The visualization for the Nyblom-Hansen test statistic, plot(p, functional = meanL2BB), does not show the individual processes because it first aggregates across parameters using the squared Euclidean norm (before taking its mean).
is this then Nyblom's original parameter stability test (1989), or is it the joint (Nyblom-)Hansen test with theta = (beta) constant instead of theta = (beta, sigma^2) constant?
The original Nyblom suggestion was for ML estimation of a distribution with one parameter (without covariates). Hansen extended this to the linear regression model and proposed to either compute one test statistic per parameter (which you can do with the "parm" argument of gefp()) or a joint statistic for all parameters. Hansen included the variance among "all" parameters, but the idea can be directly modified to any other model with a vector of parameters.
The reason that estfun(lm_object) does not include the variance is that coef(lm_object), vcov(lm_object), etc. also do not include it. So it is the conceptual question whether lm() computes the OLS estimator or the full ML estimator. The normlm() wrapper on the ?DJIA man page is one possibility to switch to the ML view for the purposes of using "strucchange".
hth,
Z
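The OLS-vs-ML distinction can be seen directly in base R (a small illustration; the variable names are mine):

```r
## Base-R illustration: coef()/vcov() of an lm fit cover only the regression
## coefficients; the error variance is reported separately, and lm()'s sigma
## is the unbiased OLS version, not the ML one.
data("longley")
m <- lm(Employed ~ Year + GNP.deflator + GNP + Armed.Forces, data = longley)

names(coef(m))                       # no variance parameter among these
sigma2_ols <- summary(m)$sigma^2     # divides RSS by n - k (unbiased)
sigma2_ml  <- mean(residuals(m)^2)   # divides RSS by n (ML estimate)
c(OLS = sigma2_ols, ML = sigma2_ml)
```

The two estimates differ only by the factor n/(n - k), but the conceptual choice determines whether the variance shows up as a tested parameter in the fluctuation process.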
Thank you, things seem to be clearer :-)
Hansen extended this to the linear regression model and proposed to either compute one test statistic per parameter (which you can do with the "parm" argument of gefp) or a joint statistic for all parameters. Hansen included in "all" parameters also the variance,
The "parm" argument of gefp() is a nice feature, but what about the significance level in the test statistic computation (sctest)? Is a multiple testing correction applied, or should I rather use the double max statistic for this case, as recommended below?
An excerpt from page 5 of the paper "A Unified Approach to Structural Change Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):
Hansen (1992) suggests to compute this statistic for the full process efp(t) to test all coefficients simultaneously and also for each component of the process (efp(t))_j (denoting the j-th component of the process efp(t), j = 1, ..., k) individually to assess which parameter causes the instability. *Note, that this approach leads to a violation of the significance level of the procedure if no multiple testing correction is applied.* This can be avoided if a functional is applied to the empirical fluctuation process which aggregates over time first, yielding k independent test statistics (see Zeileis and Hornik 2003, for more details).
On Wed, 26 Oct 2011, buehlerman wrote:
Thank you, things seem to be clearer :-)
Great.
Hansen extended this to the linear regression model and proposed to either compute one test statistic per parameter (which you can do with the "parm" argument of gefp) or a joint statistic for all parameters. Hansen included in "all" parameters also the variance,
The "parm" argument of gefp() is a nice feature, but what about the significance level in the test statistic computation (sctest)? Is a multiple testing correction applied, or should I rather use the double max statistic for this case, as recommended below?
By applying the functional in sctest(), you implicitly correct for the
number of parameters tested. Thus, you don't need to apply another
correction for multiple testing. (The only caveat with the p-values from
sctest() is that these are always asymptotic p-values and may not be exact
in finite samples. And for many functionals these have been determined by
simulation.)
This is discussed in a little bit more detail in
Zeileis A. (2006), Implementing a Class of Structural Change
Tests: An Econometric Computing Approach. _Computational
Statistics & Data Analysis_, *50*, 2987-3008.
doi:10.1016/j.csda.2005.07.001.
The comment quoted below pertains to the fact that Hansen (1992) suggested
to compute one p-value for each individual parameter as well as another
p-value for all parameters jointly. In such a situation, you would have to
apply some multiple testing procedure. The meanL2BB functional in
strucchange only computes the joint p-value.
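If one did follow Hansen's per-parameter route, a standard family-wise correction could be applied in base R; the p-values below are made up purely for illustration:

```r
## Hypothetical per-parameter p-values, e.g. one sctest() per column of the
## fluctuation process (the numbers here are invented for illustration):
p_individual <- c(0.03, 0.20, 0.04, 0.60)

## Without a correction the family-wise significance level is violated;
## a simple Bonferroni adjustment restores it:
p.adjust(p_individual, method = "bonferroni")
# 0.12 0.80 0.16 1.00
```

Using a joint functional such as meanL2BB sidesteps this entirely, since only one aggregated test statistic (and one p-value) is computed.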
hth,
Z
An excerpt from page 5 of the paper "A Unified Approach to Structural Change Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis): Hansen (1992) suggests to compute this statistic for the full process efp(t) to test all coefficients simultaneously and also for each component of the process (efp(t))_j (denoting the j-th component of the process efp(t), j = 1, ..., k) individually to assess which parameter causes the instability. *Note, that this approach leads to a violation of the significance level of the procedure if no multiple testing correction is applied.* This can be avoided if a functional is applied to the empirical fluctuation process which aggregates over time first, yielding k independent test statistics (see Zeileis and Hornik 2003, for more details).