Repeated Observations Linear Mixed Model with Outside-Group Spatial Correlation

Wed, Mar 22, 2017 9:51 AM

Thanks for the responses everyone.

David and Ben - I am in the midst of trying the hack right now. I created a
single 'group' with all observations. Then I constructed the correlation
distance matrix using three variables latitude, longitude, and *time*. This
way I can capture the correlations from repeated observations (they have
the same latitude and longitude) as a function of their temporal distance
(time) and the correlations across stations as a function of their spatial
and temporal distance. I think this structure is actually better and more
flexible. I am still trying to figure out how to determine the tradeoff
between spatial distance and temporal distance that I want. Also, as Ben
suggested this method appears to be quite inefficient. The model has been
running for 11 hours and is using around 12gb of RAM. I think I might
create groups by clustering the stations geographically to reduce the
computational burden of what is now 12,271x12,271 correlation matrix.

Dexter -  lm::lm(), nlme::lme(), and lme4::lmer() all have the function
resid(model) that provides the residual for each observation in the model.
I used the package "ape" to calculate the Moran I.

model_spatial = lme(log(counts) ~ log(Population)
                 +Drive +Med_Income + Buff2 +Rain + Temperature
                 , data = Data, random = ~1|id, method = "ML" )

model_All = lmer(log(counts) ~ log(Population)
                 +Drive +Med_Income + Buff2 +Rain + Temperature
                 + (1|id)
                  , data = Data )
Data$resid_spat = resid(model_spatial )
Data$resid_all = resid(model_All )

Thierry - I assume you meant "nlme *cant *do what you would like to do"?

I will check out the INLA, spaMM, and HSAR packages if the nlme workaround
does not actually work.

On Wed, Mar 22, 2017 at 11:02 AM, Ben Bolker <bbolker at gmail.com> wrote:

  The spaMM package is also worth a try.

  A hack suggested a while ago by Dormann et al. is to make a single
group that includes all observations, but (1) it's pretty inefficient
and (2) I don't think you can have that *and* another, non-trivial
grouping factor.

On 17-03-22 10:46 AM, Dexter Locke wrote:

Dear list,

You may consider also the citations below and HSAR package:
https://cran.r-project.org/web/packages/HSAR/vignettes/hsar.html

Dear Michael,

Do you have comparable code for extracting the residuals of the model and
testing for spatial autocorrelation that works with your second model
"model_All" created with lme4::lmer ?  What package contains the

"Moran.I"

function?

- Dexter

Dong, G., & Harris, R. (2014). Spatial Autoregressive Models for
Geographically Hierarchical Data Structures. *Geographical Analysis*,
n/a-n/a. http://doi.org/10.1111/gean.12049

Dong, G., Harris, R., Jones, K., & Yu, J. (2015). Multilevel Modelling

with

Spatial Interaction Effects with Application to an Emerging Land Market

in

Beijing, China. *PloS One*, *10*(6), e0130761.
http://doi.org/10.1371/journal.pone.0130761

Dong, G., Ma, J., Harris, R., & Pryce, G. (2016). Spatial Random Slope
Multilevel Modeling Using Multivariate Conditional Autoregressive

Models: A

Case Study of Subjective Travel Satisfaction in Beijing. *Annals of the
American Association of Geographers*, *106*(1), 19?35.
http://doi.org/10.1080/00045608.2015.1094388

On Wed, Mar 22, 2017 at 5:07 AM, Thierry Onkelinx <

thierry.onkelinx at inbo.be>

wrote:

Dear Michael,

The correlation structures in nlme assume correlation among the
residuals **within** the most detail level of the random effects.
Residuals of observations originating from different levels of the
random effects are assumed to be uncorrelated. So nlme can do what you
would like to do.

As Ben already mentioned, INLA is useful as it allows for spatially
correlated random effects. You can find information on the INLA
website (www.r-inla.org) and in a few books. e.g.
- Blangiardo & Cameletti (2015) Spatial and Spatio-temporal Bayesian
Model with R - INLA
- Zuur et al (in press) Beginner's Guide to Spatial, Temporal and
Spatial-Temporal Ecological Data Analysis with R-INLA: Using GLM and
GLMM

Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey


2017-03-21 22:19 GMT+01:00 Michael Hyland <mhyland at u.northwestern.edu>:

Thanks for the quick response.

This is a subset. Full dataset is 12,266 observations across 530 groups

(or

Bikeshare stations).

for
model_spatial.gau <- update(model_spatial, correlation = corGaus(form

= ~

latitude +longitude ), method = "ML")
and
model_spatial.gau <- update(model_spatial, correlation = corGaus(form

= ~

latitude +longitude| id ), method = "ML")
the error is "cannot have zero distances in "corSpatial" which I assume

is

due to the repeated observations having the same exact lat and lon;
therefore zero distance

Moreover, when I do anything with '| id' I think the model only

accounts

for 'within-group' correlations, not across stations


On Tue, Mar 21, 2017 at 4:12 PM, Ben Bolker <bbolker at gmail.com> wrote:

Your approach seems about right.

- What precisely does "unsuccessful" mean?  warnings, errors,
ridiculous answers?
- Is this your whole data set or a subset?
- centering and scaling predictors is always worth a shot to fix
numeric problems
- INLA is more powerful than lme for fitting spatial correlations,
although it's a *steep* learning curve ...


On Tue, Mar 21, 2017 at 5:08 PM, Michael Hyland
<mhyland at u.northwestern.edu> wrote:

Hi,

I'm new to the listserv.

A shortened version of my dataset is below. I am developing a model

to

forecast monthly ridership at Bikeshare stations. I want to predict

'Cnts'

as a function of 'Population' - 'Temperature'. The dataset is

unbalanced

(unequal number of observations for each station) and most of

covariates

do

not vary over time, but a few do.

I have successfully used lmer() and lme() in R to capture the

dependency

between the error terms for repeated observations from a given

station

('id').

model_spatial = lme(log(counts) ~ log(Population)
                 +Drive +Med_Income + Buff2 +Rain + Temperature
                 , data = Data, random = ~1|id, method = "ML" )

model_All = lmer(log(counts) ~ log(Population)
                 +Drive +Med_Income + Buff2 +Rain + Temperature
                 + (1|id)
                  , data = Data )

However, a Moran's I test of the residuals suggests that the

residuals

are

spatially correlated.
station.dists <- as.matrix(dist(cbind(Data$longitude,

Data$latitude)))

station.dists.inv <- 1/station.dists
station.dists.inv[is.infinite(station.dists.inv)] <- 0   #Distance

value is

inf for repeated observations from the same station
Data$resid_all = resid(model_spatial)
Moran.I(Data$resid_all, station.dists.inv)


Hence, I need to develop a model in R that accounts for spatial

correlation

across stations, while simultaneously capturing correlations between
observations from a single station.  I've tried the following updates

to

the lme() model, but have been unsuccessful.
model_spatial.gau <- update(model_spatial, correlation = corGaus(form

= ~

latitude +longitude ), method = "ML")
model_spatial.gau <- update(model_spatial, correlation = corGaus(form

= ~

latitude +longitude| id ), method = "ML")

Is there a way to formulate the correlation matrix in lme() or lmer()

such

that the correlation between repeated obvservations of a given

station

*and*

the spatial autocorrelation between stations is accounted for?


year month id Cnts latitude longitude Population Drive Med_Income

Buff2

Rain

Temperature
2015 8 2 2597 41.87264 -87.62398 4256 0.3418054 76857 127 0.07 71.8
2015 9 2 2772 41.87264 -87.62398 4256 0.3418054 76857 128 0.15 69
2015 10 2 684 41.87264 -87.62398 4256 0.3418054 76857 128 0.07 54.7
2015 11 2 394 41.87264 -87.62398 4256 0.3418054 76857 128 0.15 44.6
2015 12 2 148 41.87264 -87.62398 4256 0.3418054 76857 129 0.16 39
2016 1 2 44 41.87264 -87.62398 4256 0.3418054 76857 129 0.03 24.7
2015 5 3 2303 41.86723 -87.61536 16735 0.4312349 75227 90 0.15 60.4
2015 6 3 3323 41.86723 -87.61536 16735 0.4312349 75227 98 0.24 67.4
2015 7 3 5920 41.86723 -87.61536 16735 0.4312349 75227 98 0.09 72.3
2015 8 3 4405 41.86723 -87.61536 16735 0.4312349 75227 98 0.07 71.8
2015 9 3 3638 41.86723 -87.61536 16735 0.4312349 75227 99 0.15 69
2015 10 3 2061 41.86723 -87.61536 16735 0.4312349 75227 99 0.07 54.7
2015 11 3 1074 41.86723 -87.61536 16735 0.4312349 75227 99 0.15 44.6
2015 12 3 374 41.86723 -87.61536 16735 0.4312349 75227 100 0.16 39
2016 1 3 188 41.86723 -87.61536 16735 0.4312349 75227 100 0.03 24.7
2016 2 3 474 41.86723 -87.61536 16735 0.4312349 75227 100 0.04 30.4
2015 6 4 2968 41.85627 -87.61335 16735 0.4312349 75227 68 0.24 67.4
2015 7 4 4266 41.85627 -87.61335 16735 0.4312349 75227 68 0.09 72.3
2015 8 4 3442 41.85627 -87.61335 16735 0.4312349 75227 68 0.07 71.8
2015 9 4 2552 41.85627 -87.61335 16735 0.4312349 75227 69 0.15 69
2015 10 4 1301 41.85627 -87.61335 16735 0.4312349 75227 69 0.07 54.7


Thanks,
Mike Hyland

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Repeated Observations Linear Mixed Model with Outside-Group Spatial Correlation

Thread (8 messages)