How to use "lag"?

6 messages · Spencer Graves, Dirk Eddelbuettel, Remigijus Lapinskas +1 more

#
Is it possible to fit a lagged regression, "y[t]=b0+b1*x[t-1]+e", 
using the function "lag"?  If so, how?  If not, of what use is the 
function "lag"?  I get the same answer from y~x as from y~lag(x), whether 
using lm or arima.  I do get the lagged fit using y~c(NA, x[-length(x)]).  
Consider the following: 

 > set.seed(1)
 > x <- rep(c(rep(0, 4), 9), len=9)
 > y <- (rep(c(rep(0, 5), 9), len=9)+rnorm(9)) # y[t] = x[t-1]+e
 >
 > lm(y~x)
(Intercept)            x 
     1.2872      -0.1064 
 > lm(y~lag(x))
(Intercept)       lag(x) 
     1.2872      -0.1064 
 > arima(y, xreg=x)
      intercept        x
         1.2872  -0.1064
s.e.     0.9009   0.3003
sigma^2 estimated as 6.492:  log likelihood = -21.19,  aic = 48.38
 > arima(y, xreg=lag(x))
      intercept   lag(x)
         1.2872  -0.1064
s.e.     0.9009   0.3003
 > arima(y, xreg=c(NA, x[-9]))
      intercept  c(NA, x[-9])
         0.4392        0.8600
s.e.     0.2372        0.0745
sigma^2 estimated as 0.3937:  log likelihood = -7.62,  aic = 21.25
 > arima(ts(y), xreg=lag(ts(x)))
arima(x = ts(y), xreg = lag(ts(x)))
      intercept  lag(ts(x))
         1.2872     -0.1064
s.e.     0.9009      0.3003
sigma^2 estimated as 6.492:  log likelihood = -21.19,  aic = 48.38
 
      Thanks for your help. 
      Spencer Graves
#
Spencer,

You may want to peruse the list archive for posts that match 'ts' and are
written by Brian Ripley -- these issues have come up before. 

The ts class is designed for arima and friends (like Kalman filtering), and is
very useful in that context, but possibly not so much anywhere else.  lag()
only shifts the _reference dates_ (the time index) attached to the object; the
data values themselves stay in the same positions.  So in a data.frame context
(as for lm()), where rows are matched by position, nothing happens.
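
To see what "only shifts the reference dates" means in practice, here is a
minimal sketch in base R (the names lx and aligned are only for illustration).
The lagged series holds exactly the same values; only its time attributes
move, and the shift becomes visible once the two series are aligned on time
with cbind():

```r
# x as in the original post, stored as a ts object
x  <- ts(rep(c(rep(0, 4), 9), len = 9))
lx <- lag(x, -1)                  # shift the time index forward by one period

# The data values are untouched ...
identical(as.numeric(x), as.numeric(lx))   # TRUE

# ... only the time attributes (start, end, frequency) differ
tsp(x)    # 1  9 1
tsp(lx)   # 2 10 1

# cbind() on ts objects aligns by time, padding with NA,
# so x[t-1] now sits in the same row as time t
aligned <- cbind(x = x, lx = lx)
aligned
```

This is why passing lag(x) directly inside a formula changes nothing, while
binding the series together first gives the lagged fit.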

Personally, I use its as my main container for daily or weekly data. There is
also zoo, which I have been meaning to examine more closely for a while now.
You may want to use one of those for shifting, matching, intersecting, ... and
then use one of the exporter functions (core() for its) to pass to lm(), say.

Hope this helps,  Dirk
#
Spencer Graves <spencer.graves <at> pdf.com> writes:

: 
: Is it possible to fit a lagged regression, "y[t]=b0+b1*x[t-1]+e", 
: using the function "lag"?  If so, how?  [...]

Here is some sample code:

R> # following 3 lines are from your post
R> set.seed(1)
R> x <- rep(c(rep(0, 4), 9), len=9)
R> y <- (rep(c(rep(0, 5), 9), len=9)+rnorm(9)) # y[t] = x[t-1]+e
R> 
R> # here are some examples using ts class - first one uses no lag
R> lm(y ~ x, cbind(y = ts(y), x = ts(x)))

Call:
lm(formula = y ~ x, data = cbind(y = ts(y), x = ts(x)))

Coefficients:
(Intercept)            x  
     1.2872      -0.1064  

R> # now let's redo it with a lag: lag(ts(x), -1) aligns x[t-1] with y[t]
R> lm(y ~ lagx, cbind(y = ts(y), lagx = lag(ts(x), -1)) )

Call:
lm(formula = y ~ lagx, data = cbind(y = ts(y), lagx = lag(ts(x),     -1)))

Coefficients:
(Intercept)         lagx  
     0.4392       0.8600  

R> # here is arima without a lag
R> b <- cbind(ts(y), ts(x))
R> arima(b[,1], order = c(1,1,1), xreg = b[,2])

Call:
arima(x = b[, 1], order = c(1, 1, 1), xreg = b[, 2])

Coefficients:
         ar1      ma1   b[, 2]
      0.3906  -1.0000  -0.3803
s.e.  0.4890   0.4119   0.3753

sigma^2 estimated as 7.565:  log likelihood = -20.2,  aic = 48.4

R> # and now we redo arima with a lag
R> bb <- cbind(ts(y), lag(ts(x),-1))
R> arima(bb[,1], order = c(1,1,1), xreg = bb[,2])

Call:
arima(x = bb[, 1], order = c(1, 1, 1), xreg = bb[, 2])

Coefficients:
          ar1      ma1  bb[, 2]
      -0.2991  -0.8252   0.8537
s.e.   0.4516   1.0009   0.0838

sigma^2 estimated as 0.444:  log likelihood = -7.9,  aic = 23.8

R> # you can alternately use the I notation with lm and ts objects
R> # if you load zoo first
R> library(zoo)
R> yt <- ts(y); xt <- ts(x)
R> lm(I(yt ~ xt))

Call:
lm(formula = I(yt ~ xt))

Coefficients:
(Intercept)           xt  
     1.2872      -0.1064  

R> lm(I(yt ~ lag(xt, -1)))

Call:
lm(formula = I(yt ~ lag(xt, -1)))

Coefficients:
(Intercept)  lag(xt, -1)  
     0.4392       0.8600
#
Dirk Eddelbuettel <edd <at> debian.org> writes:
Here is the example redone using zoo:

R> # here it is redone using zoo objects
R> 
R> # following 3 lines are from the original post
R> set.seed(1)
R> x <- rep(c(rep(0, 4), 9), len=9)
R> y <- (rep(c(rep(0, 5), 9), len=9)+rnorm(9)) # y[t] = x[t-1]+e
R> 
R> library(zoo)
R> 
R> yz <- zoo(y); xz <- zoo(x)
R> lm(I(yz ~ xz))

Call:
lm(formula = I(yz ~ xz))

Coefficients:
(Intercept)           xz  
     1.2872      -0.1064  

R> lm(I(yz ~ lag(xz, -1)))

Call:
lm(formula = I(yz ~ lag(xz, -1)))

Coefficients:
(Intercept)  lag(xz, -1)  
     0.4392       0.8600  

R> 
R> z <- merge(yz, xz)
R> arima(coredata(z[,1]), order = c(1,1,1), xreg = coredata(z[,2]))

Call:
arima(x = coredata(z[, 1]), order = c(1, 1, 1), xreg = coredata(z[, 2]))

Coefficients:
         ar1      ma1  coredata(z[, 2])
      0.3906  -1.0000           -0.3803
s.e.  0.4890   0.4119            0.3753

sigma^2 estimated as 7.565:  log likelihood = -20.2,  aic = 48.4
 
R> zz <- merge(yz, lag(xz, -1))
R> arima(coredata(zz[,1]), order = c(1,1,1), xreg = coredata(zz[,2]))

Call:
arima(x = coredata(zz[, 1]), order = c(1, 1, 1), xreg = coredata(zz[, 2]))

Coefficients:
          ar1      ma1  coredata(zz[, 2])
      -0.2991  -0.8252             0.8537
s.e.   0.4516   1.0009             0.0838

sigma^2 estimated as 0.444:  log likelihood = -7.9,  aic = 23.8
#
I use the following two functions for a lagged regression:

lm.lag <- function(y, lag = 1)
  summary(lm(embed(y, lag + 1)[, 1] ~ embed(y, lag + 1)[, 2:(lag + 1)]))
lm.lag.x <- function(y, x, lag = 1)
  summary(lm(embed(y, lag + 1)[, 1] ~ embed(x, lag + 1)[, 2:(lag + 1)]))

for, respectively,

y[t] = a + b_1*y[t-1] + ... + b_lag*y[t-lag]
y[t] = a + b_1*x[t-1] + ... + b_lag*x[t-lag]

I am not quite sure whether this is an answer to your question, but here
are two examples:

set.seed(7)
ar1=arima.sim(n=300,list(ar=0.8))
lm.lag(ar1)
lm.lag.x(ar1,ar1)

set.seed(8)
ar3=arima.sim(n = 200, list(ar = c(0.4, -0.5, 0.7)))
lm.lag(ar3,3) 
lm.lag.x(ar3,ar3,3)
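
For readers unfamiliar with embed(), a small sketch of what it returns may
help (the names e2 and e3 are only for illustration). Each row holds the
current value in column 1 and successively older values in the following
columns, which is why column 1 serves as the response and the remaining
columns as the lagged regressors:

```r
x <- 1:5

# dimension = 2: column 1 is x[t], column 2 is x[t-1]
e2 <- embed(x, 2)
e2
#      [,1] [,2]
# [1,]    2    1
# [2,]    3    2
# [3,]    4    3
# [4,]    5    4

# dimension = 3 adds x[t-2] as a third column; the first two
# observations are dropped because they have no complete history
e3 <- embed(x, 3)
e3
#      [,1] [,2] [,3]
# [1,]    3    2    1
# [2,]    4    3    2
# [3,]    5    4    3
```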

Best wishes,
Rem
Saturday, March 5, 2005, 6:14:15 PM, you wrote:
SG>       Is it possible to fit a lagged regression, "y[t]=b0+b1*x[t-1]+e",
SG> using the function "lag"?  If so, how?  [...]
#
Remigijus Lapinskas <remigijus.lapinskas <at> maf.vu.lt> writes:

: 
: I use the following two functions for a lagged regression:
: 
: lm.lag=function(y,lag=1) summary(lm(embed(y,lag+1)[,1]~embed(y,lag+1)[,2:(lag+1)]))
: lm.lag.x=function(y,x,lag=1) summary(lm(embed(y,lag+1)[,1]~embed(x,lag+1)[,2:(lag+1)]))
: 
: for, respectively,
: 
: y[t] = a + b_1*y[t-1] + ... + b_lag*y[t-lag]
: y[t] = a + b_1*x[t-1] + ... + b_lag*x[t-lag]


One could combine these into one function like this (x defaults to y, which
gives the autoregressive case):

lm.lag <- function(y, x = y, lag = 1) 
   summary( lm( embed(y, lag+1)[,1] ~ embed(x, lag+1)[,-1] ) )
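
As a quick check, the embed() approach can be compared with the manual shift
c(NA, x[-length(x)]) from the original post. The sketch below uses a
hypothetical helper, lag_fit, which follows the same idea as the combined
function but returns the fitted model rather than its summary, so that the
coefficients are easy to compare:

```r
# lag_fit is a hypothetical name, not from the thread; it returns the lm
# fit itself instead of summary(lm(...))
lag_fit <- function(y, x = y, lag = 1) {
  ey <- embed(y, lag + 1)   # column 1 is y[t]
  ex <- embed(x, lag + 1)   # columns 2.. are x[t-1], ..., x[t-lag]
  lm(ey[, 1] ~ ex[, -1])
}

# data from the original post
set.seed(1)
x <- rep(c(rep(0, 4), 9), len = 9)
y <- rep(c(rep(0, 5), 9), len = 9) + rnorm(9)   # y[t] = x[t-1] + e

fit_embed  <- lag_fit(y, x, lag = 1)
fit_manual <- lm(y ~ c(NA, x[-length(x)]))   # lm() drops the NA row by default

# Both fits regress y[2:9] on x[1:8], so the coefficients should agree
all.equal(unname(coef(fit_embed)), unname(coef(fit_manual)))
```

Both routes discard the first observation, since x[0] is unavailable; they
differ only in how the shifted regressor is built.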