
ljung-box tests in arma and garch models

4 messages · michal miklovic, Spencer Graves, John C Frain +1 more

Hi, Michal and Patrick: 


PATRICK: 

      In your 2002 paper on the "Robustness of the Ljung-Box Test and 
its Rank Equivalent" 
(http://www.burns-stat.com/pages/Working/ljungbox.pdf), do you consider 
using m-g degrees of freedom, where  m = number of lags and g = number 
of parameters estimated (ignoring an intercept)?  I didn't read every 
word, but I only saw you using 'm' degrees of freedom, and I did not 
notice a comment on this issue. 

      Your Exhibit 3 (p. 7) presents a histogram of the "Distribution of 
the 50-lag Ljung-Box p-value under the Gaussian distribution with 100 
observations".  It looks to me like a Beta(a, b) distribution, with a < 
b < 1 but with both a and b fairly close to 1.  The excess of p-values 
in the lower tail suggests to me that the real degrees of freedom for a 
reference chi-square should in this case be slightly greater than 50.  
Your Exhibit 10 shows a comparable histogram for the "Distribution of 
the Ljung-Box 15 lag p-value for the square of a t with 4 degrees of 
freedom with 10,000 observations."  This looks to me like a Beta(a, b) 
distribution with b < a < 1 but with many fewer p-values near 0 than 
near 1.  This in turn suggests to me that the degrees of freedom of the 
reference chi-square test would be less than 15 in this case.  Apart 
from this question, your power curves (Exhibits 14-22) provide rather 
persuasive support for your recommended use of the rank equivalent to 
the traditional Ljung-Box. 
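
(Those histogram shapes are easy to probe directly by simulation.  The 
sketch below is in Python with a hand-rolled Ljung-Box helper -- it is 
not the paper's code -- and generates null p-values for Gaussian white 
noise with 100 observations and 50 lags, the setting of Exhibit 3:)

```python
# Simulation sketch (not the paper's code): null distribution of
# Ljung-Box p-values for Gaussian white noise, n = 100, m = 50 lags.
import numpy as np
from scipy.stats import chi2

def ljung_box(x, m):
    """Ljung-Box Q statistic and chi-square(m) p-value for series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = np.dot(x, x)
    r = np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, m + 1)])
    q = n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))
    return q, chi2.sf(q, m)

rng = np.random.default_rng(0)
pvals = np.array([ljung_box(rng.standard_normal(100), 50)[1]
                  for _ in range(1000)])

# Under the asymptotics these should look Uniform(0, 1); departures
# from a flat histogram are the finite-sample effect discussed above.
hist, _ = np.histogram(pvals, bins=10, range=(0.0, 1.0))
print(hist)
```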


MICHAL: 

      Thanks very much for your further comments on this.  The standard 
asymptotic theory would support Enders' and Tsay's usage of m-g degrees 
of freedom, with m = number of lags and g = number of parameters 
estimated, apart from an intercept -- PROVIDED the parameters were 
estimated to minimize the Ljung-Box statistic.  However, the 
parameters are typically estimated to maximize a likelihood.  The effect 
of this would likely be to understate the p-value, which we generally 
want to avoid. 
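
(To make the direction of the effect concrete, here is a toy Python 
illustration with made-up numbers: for a fixed Q statistic, a 
chi-square reference with m - g degrees of freedom gives a smaller 
p-value than one with m degrees of freedom, so the df convention moves 
the apparent significance.)

```python
# Toy illustration (made-up numbers): how the choice of reference
# degrees of freedom moves the Ljung-Box p-value.
from scipy.stats import chi2

q = 25.0      # hypothetical Ljung-Box Q statistic
m, g = 20, 2  # m lags; g estimated parameters, e.g. ARMA(1,1)

p_m = chi2.sf(q, m)       # reference chi-square with m df
p_mg = chi2.sf(q, m - g)  # reference chi-square with m - g df

# For any fixed q, fewer degrees of freedom give a smaller p-value,
# i.e. the same Q looks more significant under the m - g convention.
print(p_m > p_mg)
```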

      However, we never use these statistics with infinite sample 
sizes and degrees of freedom.  The asymptotic theory is therefore only 
a guideline, preferably with some adjustment for finite sample sizes 
and degrees of freedom, and it is wise to evaluate the adequacy of the 
asymptotics with appropriate simulations.  These may have been 
done;  I have not researched the literature on this, apart from Burns 
(2002).  If anyone knows of other relevant simulations, I'd like to hear 
about them.

      By the way, Tsay's second edition (2005, p. 44) includes a similar 
comment:  "For an AR(p) model, the Ljung-Box statistic Q(m) follows 
asymptotically a chi-square distribution with m-g degrees of freedom, 
where g denotes the number of AR coefficients used in the model."  This 
is similar to but different from your quote from the first edition. 


      Best Wishes,
      Spencer Graves
michal miklovic wrote:
For a proof that the appropriate degrees of freedom are s-p-q, see
Brockwell and Davis (1990), Time Series: Theory and Methods, 2nd
Edition, Springer, page 310.

John Frain
On 30/12/2007, michal miklovic <mmiklovic at yahoo.com> wrote:
I thought I'd start off with some background for those who
don't know what we are talking about.

The Ljung-Box test in this context is used to see if the model
that is fit has captured all of the signal.  So in hypothesis testing
terms, we have things backwards -- we are satisfied when we
see large p-values rather than wanting to see small p-values.

The working paper referred to below shows that the Ljung-Box
test is fantastically robust to the data being non-Gaussian.  However,
there is a practical setting in which it is not robust enough.  That is
when testing if a garch model captures all of the variation in variance
by squaring the residuals (which will themselves be long-tailed in
practice).

One symptom is seeing p-values for the Ljung-Box test that are very
close to 1, such as .998.  (This is essentially saying that the model has
overfit the data, but overfitting a couple thousand observations with a
handful of parameters is unlikely.)

A good remedy is to use the ranks of the squared residuals rather than
the actual squared residuals in the Ljung-Box test.
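
A minimal sketch of that remedy in Python (hand-rolled, not the code 
behind the paper): compute the usual Ljung-Box p-value, but feed it 
the ranks of the squared residuals.

```python
# Rank-remedy sketch: Ljung-Box on ranks of squared residuals.
import numpy as np
from scipy.stats import chi2, rankdata

def ljung_box_p(x, m):
    """Ljung-Box p-value (reference chi-square with m df) for series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    r = np.array([np.dot(x[:-k], x[k:]) / np.dot(x, x)
                  for k in range(1, m + 1)])
    q = n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))
    return chi2.sf(q, m)

rng = np.random.default_rng(0)
resid = rng.standard_t(df=4, size=2000)  # long-tailed stand-in residuals

p_raw = ljung_box_p(resid**2, 15)             # classical test on squares
p_rank = ljung_box_p(rankdata(resid**2), 15)  # rank equivalent
print(p_raw, p_rank)
```

With t(4) noise the raw squares hand the test a few huge values, which 
is what drives the distorted p-values described above; the ranks are 
bounded, so the rank version holds its nominal behavior much better.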

This thread is really about the degrees of freedom to use to get the
p-value from the test statistic.  In the big picture I regard this as
rather unimportant -- it doesn't matter much if the p-value is 3.3% or
3.4%.  However, I do believe in doing things as well as possible.

The asymptotics seem to be saying to use 'm - g' degrees of freedom rather
than 'm'.  Asymptotics are nice but the real question is what happens in
a finite sample with a long-tailed distribution.

Spencer, no I didn't look at degrees of freedom when I was doing the
simulations for the paper.

Pat
Spencer Graves wrote: