
multilevel time series?

3 messages · Malcolm Fairbrother, ONKELINX, Thierry, Douglas Bates

#
Dear all,

In macro-social science, it's become fairly conventional to analyse repeated cross-sectional survey data using three-level models. Individual survey respondents (level-1) are nested in state-years (level-2), which are in turn nested within states (level-3). One big pay-off is the ability to examine how time-constant or time-varying state-level variables affect level-1 outcomes.

A co-author and I recently had a reviewer question whether this approach is adequate, however. He/she suggested that this approach could generate very misleading results, if the data are nonstationary. (We just included a linear time effect in our models.) So I'm thinking about how to proceed (and I'm not particularly knowledgeable about time series analysis). Any advice would be much appreciated. We used lme4 to fit the models in our paper, and we have several tens of thousands of respondents nested in 48 states, each observed about 15 or 16 times over about a 30-year period.
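For readers who want the model written out, the three-level specification with a linear time effect could look roughly like the following. This is a sketch under assumptions: the indices (respondent i, state s, year t), the single state-level covariate z, and all symbol names are illustrative and not taken from the paper. In lme4 formula terms the random part would correspond to something like `(1 | state) + (1 | state:year)`, with hypothetical column names.

```latex
y_{ist} = \beta_0 + \beta_1\,\mathrm{year}_{t} + \gamma\, z_{st} + u_s + v_{st} + e_{ist},
\qquad
u_s \sim N(0, \sigma_u^2),\quad
v_{st} \sim N(0, \sigma_v^2),\quad
e_{ist} \sim N(0, \sigma_e^2)
```

Here u_s is the state effect, v_{st} the state-year effect, and e_{ist} the respondent-level residual. The reviewer's concern amounts to asking whether the v_{st} within a state can reasonably be treated as independent over time once the linear trend is removed.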

(1) Is the reviewer's query valid? Is he/she right to question this approach?

(2) How might we test for nonstationarity? The reviewer mentioned differencing the outcome variable, but in a multilevel context I'm not sure how to do that... Perhaps we could calculate an *aggregate* value for every state-year, and check the aggregated data for autocorrelation? My understanding is that autocorrelation across multiple lags is a strong indicator of nonstationarity (while, conversely, the absence of multiple-lag autocorrelation is almost a guarantee of stationarity). I believe this can be done with nlme, as a two-level model, with state-years nested within states.
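The aggregate check described in (2) could be sketched roughly as follows. This is an illustration in plain Python (standard library only), not a formal stationarity test: the function names and the `(state, year, outcome)` row layout are assumptions, and the lag counts survey waves rather than calendar years, which matters when waves are unevenly spaced.

```python
from collections import defaultdict
from statistics import fmean


def state_year_means(rows):
    """Collapse (state, year, outcome) rows to one mean per state-year.

    Returns {state: [(year, mean_outcome), ...]} with each list sorted by year.
    """
    cells = defaultdict(list)
    for state, year, y in rows:
        cells[(state, year)].append(y)
    series = defaultdict(list)
    for (state, year), ys in cells.items():
        series[state].append((year, fmean(ys)))
    return {state: sorted(points) for state, points in series.items()}


def autocorr(values, lag):
    """Sample autocorrelation of a sequence at the given lag (in waves)."""
    n = len(values)
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0 or lag >= n:
        return float("nan")
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / denom


def first_difference(values):
    """Wave-to-wave changes, i.e. the differenced series."""
    return [b - a for a, b in zip(values, values[1:])]


# Toy data for one hypothetical state; real data would have ~48 states.
rows = [("A", 1990, 1.0), ("A", 1990, 3.0), ("A", 1992, 4.0),
        ("A", 1994, 6.0), ("A", 1996, 8.0)]
for state, points in state_year_means(rows).items():
    means = [m for _, m in points]
    print(state, [autocorr(means, k) for k in (1, 2)], first_difference(means))
```

With only 15 or 16 waves per state, per-state autocorrelations will be noisy, so it is probably more informative to look at their distribution across states than at any single series.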

(3) However, that approach would seem to throw away a lot of level-1 information (about individual respondents), and I'm not sure about the implications for any significance tests. An alternative approach would seem to be "multilevel time series", where autocorrelation at the *group* rather than individual/first level is specifically allowed for in the model. However, I can't find any references to R packages (or other software) that allow for the specification of, for example, AR1 processes at anything other than level-1 in multilevel models.

In short, I'd be curious to hear what people think... (especially if anyone out there happens to be a whiz at both multilevel and time series analysis). I hope I've been clear about the problem, but I'm happy to elaborate. Thanks in advance for any help.

Cheers,
Malcolm


Dr Malcolm Fairbrother
Lecturer
School of Geographical Sciences
University of Bristol
#
Dear Malcolm,

Your design requires IMHO crossed random effects instead of nested
random effects. Individual is clearly crossed with year. Each individual
can be surveyed in more than one year and vice versa. If they were
nested, all data from a specific individual would come from only one
specific year. The same goes for state and year: they are crossed
rather than nested.
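Whether two grouping factors are nested or crossed can be checked directly from the data. The following is a minimal sketch in plain Python; the helper name `is_nested` and the toy labels are hypothetical. A factor is nested in another if each of its levels occurs with exactly one level of the other.

```python
from collections import defaultdict


def is_nested(pairs):
    """Return True if every inner level occurs with exactly one outer level.

    pairs: iterable of (inner_label, outer_label) tuples, one per observation.
    """
    outers_seen = defaultdict(set)
    for inner, outer in pairs:
        outers_seen[inner].add(outer)
    return all(len(outers) == 1 for outers in outers_seen.values())


# Toy observations: (state, year)
obs = [("TX", 1990), ("TX", 1992), ("NY", 1990), ("NY", 1992)]

# A state-year label is nested within state by construction.
print(is_nested((f"{s}-{y}", s) for s, y in obs))  # True

# State is not nested within year: each state appears in several years.
print(is_nested((s, y) for s, y in obs))  # False
```

Running the same check on (respondent, year) pairs would show whether respondents really recur across years in a given data set.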

Fitting year as a crossed random effect will take nonstationarity over
time into account. The size of the variance of this random effect will
indicate how strong this nonstationarity is.

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
3 days later
#
On Mon, Sep 27, 2010 at 3:34 AM, ONKELINX, Thierry
<Thierry.ONKELINX at inbo.be> wrote:
Malcolm's original description mentions modeling a linear trend in
time, which would make sense to me.  Even taking into account the fact
that a person can move from one state to another (hence you don't have
strict nesting of the person and state factors) such data can still be
analyzed using lme4.  Before doing so I would want to plot response
versus time for several individuals, just to see if a linear trend
looks adequate.  Having 15 to 20 different time points per subject
would allow you to model more than a linear trend within subject.

Sometimes people will approach such a case using time series methods,
even though the series are rather short.  Simple relationships like an
AR1 (first-order autoregressive) model generate marginal covariance
patterns that are very similar to those generated by a model with
per-subject random effects for the intercept and the slope with
respect to time.  This is why I don't usually combine these terms.  It
is hard to separate out the effect of each.
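To see how the two structures compare, one can write down the correlation matrix each implies for the repeated measures on one subject. The sketch below does this in plain Python. The function names and every parameter value are arbitrary assumptions chosen for illustration, so how close the two matrices come out depends entirely on the values tried.

```python
import math


def ar1_corr(times, rho):
    """Correlation matrix implied by an AR1 process: rho ** |s - t|."""
    return [[rho ** abs(s - t) for t in times] for s in times]


def random_slope_corr(times, var_int, var_slope, cov_int_slope, var_resid):
    """Marginal correlation matrix implied by a random intercept and a
    random slope in time, plus independent residual noise.

    Cov(y_s, y_t) = var_int + cov_int_slope * (s + t) + var_slope * s * t,
    with var_resid added on the diagonal.
    """
    n = len(times)
    cov = [[var_int + cov_int_slope * (s + t) + var_slope * s * t
            + (var_resid if i == j else 0.0)
            for j, t in enumerate(times)]
           for i, s in enumerate(times)]
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


def show(label, matrix):
    """Print a matrix with two decimals per entry."""
    print(label)
    for row in matrix:
        print("  " + " ".join(f"{value:5.2f}" for value in row))


times = [-2, -1, 0, 1, 2]  # five waves, centred at zero
show("AR1, rho = 0.7", ar1_corr(times, 0.7))
show("random intercept + slope",
     random_slope_corr(times, var_int=1.0, var_slope=0.1,
                       cov_int_slope=0.0, var_resid=0.5))
```

In both matrices the correlation falls off as observations get further apart in time, which is the overlap that makes the two sets of terms hard to separate when fitted together.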

Your suggestion is somewhat different.  It is more like a panel data
type of model and could definitely be appropriate if the effect of a
particular year was more-or-less common across subjects.  This type of
model is applied to data like the quarterly profits of several
companies.  Macro-economic forces can (and did) have industry-wide
effects on the Q1 results in 2009 so it makes sense to regard each
time period as distinct.

If, on the other hand, you had time trends within individuals but not
synchronized across time periods then I would set up a model for the
within-subject time trends and try to incorporate random effects in
that model, as Malcolm seems to indicate they have done.