
Time series misalignment

5 messages · Fernando Saldanha, Achim Zeileis

#
This may be a basic question, but I have spent several hours
researching and I could not get an answer, so please bear with me. The
problem is with time series in the package tseries. As the example
below shows, the time series can get misaligned, so that bad results
are obtained when doing regressions. I found a way to do this
correctly, but I find it rather cumbersome. My question is: is there a
better way to do it?

Thanks for any help.

Suppose I define:
Then I get:
Time Series:
Start = 2 
End = 5 
Frequency = 1 
[1] 1 1 2 2
Time Series:
Start = 1 
End = 4 
Frequency = 1 
[1] 1 1 2 2

Notice that the Start values for x1 and z1 are different.

However, if I regress z1 on x1 I get:
Call:
lm(formula = z1 ~ x1, na.action = NULL)

Coefficients:
(Intercept)           x1  
          0            1  
          
But this is the wrong answer. The time series z1 and x1 are
misaligned. lm is ignoring the fact that Start = 2 for x1 and Start =
1 for z1.
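
A minimal sketch of the misalignment described above. The series values are an assumption, picked only so that the printed Start/End values and data match the output shown; they are not necessarily the original definitions.

```r
# Assumed definitions, chosen only to reproduce the printed output:
z1 <- ts(c(1, 1, 2, 2), start = 1)  # observed at times 1..4
x1 <- ts(c(1, 1, 2, 2), start = 2)  # observed at times 2..5

# The time indices differ ...
time(z1)
time(x1)

# ... but lm() pairs the values by position, not by time, so the two
# identical data vectors give a perfect (and misleading) fit.
naive <- lm(z1 ~ x1, na.action = NULL)
coef(naive)
```

The fit compares z1 at time 1 with x1 at time 2, and so on, which is why the intercept is 0 and the slope is 1.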

To fix this problem I did the following:
Time Series:
Start = 2 
End = 4 
Frequency = 1 
  y1 x1 z1
2  2  1  1
3  3  1  2
4  5  2  2

These versions of z1 and x1 are correctly aligned.

Now I can do:
Call:
lm(formula = tsf[, 3] ~ tsf[, 2])

Coefficients:
(Intercept)     tsf[, 2]  
        1.0          0.5 
        
This is the correct answer. However, it is rather cumbersome to refer
to the aligned variables as columns of the time series object tsf.

As an observation, I also called ts.intersect with the option dframe =
TRUE and got exactly the same results.

So my question is: is there a less cumbersome way to keep these time
series aligned?

Thanks again for any help.
#
Fernando:
BTW: the `tseries' package is not involved here.
lm() per se has only very limited support for time series regression.
Therefore, there are currently several tools under development for
addressing this issue. In particular, Gabor Grothendieck and I are
working on different approaches to this problem.

<snip>
It is probably simpler to just do
  lm1 <- lm(z1 ~ x1, data = tsf)

Another approach is implemented in the zoo package. This implements a
formula dispatch, so you can do
  lm1 <- lm(I(z1 ~ x1))
*without* computing tsf first.

Depending on what you want to do with the fitted models, one of the two
approaches might be easier, currently. In particular, if you want to fit
and compare several models, then I would compute the intersection first
and fit all models of interest on this data set.
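
The "intersect first, then fit everything on that data set" workflow could look roughly like the sketch below. The series values (in particular the first value of y1) are assumptions made to match the output earlier in the thread, and the second model is purely illustrative.

```r
# Assumed series, matching the output shown earlier in the thread:
z1 <- ts(c(1, 1, 2, 2), start = 1)
x1 <- ts(c(1, 1, 2, 2), start = 2)
y1 <- ts(c(1, 2, 3, 5), start = 1)  # value at time 1 is a guess

# Align once; dframe = TRUE returns a plain data frame with columns
# named after the arguments (y1, x1, z1).
tsf <- ts.intersect(y1, x1, z1, dframe = TRUE)

# Fit every model of interest on the same aligned data set.
m1 <- lm(z1 ~ x1, data = tsf)
m2 <- lm(y1 ~ x1, data = tsf)  # hypothetical second model

coef(m1)
```

Because the variables are looked up in tsf by name, the formulas stay readable and there is no need to index columns as tsf[, 2] and tsf[, 3].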

Furthermore, note that the dispatch implementation via I() in zoo is
still under development and likely to change in future versions. (But
this mainly means that improved implementations will become available
soon, stay tuned :-)
Z
#
Can one also predetermine a set and then estimate all the models one
wants to compare using the zoo package? Or can that be done only with
the tseries package?

Thanks.

FS
On 4/12/05, Achim Zeileis <Achim.Zeileis at wu-wien.ac.at> wrote:
#
On Tue, 12 Apr 2005 18:47:21 -0400 Fernando Saldanha wrote:

            
Sure, you can merge() several series first and then pass this as the
data argument to lm(). See the vignette of the zoo package for more
examples.
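
As a rough sketch of the merge-then-lm idea with zoo (assuming the zoo package is installed; the series values are the same assumed ones used to reproduce the output earlier in the thread, and the second model is only illustrative):

```r
library(zoo)  # assumption: zoo is installed

# Assumed series, indexed by integer time:
z1 <- zoo(c(1, 1, 2, 2), order.by = 1:4)
x1 <- zoo(c(1, 1, 2, 2), order.by = 2:5)
y1 <- zoo(c(1, 2, 3, 5), order.by = 1:4)  # value at time 1 is a guess

# all = FALSE keeps only the time points common to every series.
m <- merge(y1 = y1, x1 = x1, z1 = z1, all = FALSE)

# Convert explicitly so lm() sees an ordinary data frame.
dat <- as.data.frame(m)

fit_zx <- lm(z1 ~ x1, data = dat)
fit_yx <- lm(y1 ~ x1, data = dat)  # hypothetical second model
coef(fit_zx)
```

The explicit as.data.frame() step is a conservative choice here; it keeps the zoo index as the row names of dat.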
Really, you are *not* using the tseries package here!

(In the old days, the class "ts" and its methods used to be in the
package ts, but this was merged into stats long ago. tseries is an
entirely different package.)
Z
#
Thanks, Achim,

I managed to do what I wanted, thanks to your suggestion, except for
one thing. When I called ts.intersect I could only provide numerical
arguments (more precisely, objects that can be coerced into time
series, I guess). That means I was not able to pass the original
row.names that I had read into a data frame (those were character
strings). At that point those row.names became misaligned with the
data frame created by the ts.intersect call, whose row names were just
1, 2, 3, .... Is there a way to avoid this problem?
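
One possible workaround, sketched under the assumption that row i of the original data frame corresponds to time i (Start = 1, Frequency = 1): keep the matrix form of ts.intersect, which retains the time index, and use that index to pick out the matching labels. The labels and series values below are hypothetical.

```r
# Hypothetical row names of the original data frame, one per time 1..5:
labels <- c("a", "b", "c", "d", "e")

# Assumed series, as in the example earlier in the thread:
z1 <- ts(c(1, 1, 2, 2), start = 1)
x1 <- ts(c(1, 1, 2, 2), start = 2)
y1 <- ts(c(1, 2, 3, 5), start = 1)

# Without dframe = TRUE the result is a multivariate ts that still
# knows which time points survived the intersection.
tsf <- ts.intersect(y1, x1, z1)
idx <- as.integer(round(time(tsf)))  # surviving time points: 2, 3, 4

# Convert to a data frame and re-attach the matching labels.
df <- as.data.frame(tsf)
rownames(df) <- labels[idx]
df
```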

I will check the zoo package, but I started with R three days ago, so
it's a bit of information overload right now.

Thanks for the help.

FS