Repeated Observations Linear Mixed Model with Outside-Group Spatial Correlation
Thanks for the responses everyone.
David and Ben - I am in the midst of trying the hack right now. I created a
single 'group' with all observations. Then I constructed the correlation
distance matrix using three variables latitude, longitude, and *time*. This
way I can capture the correlations from repeated observations (they have
the same latitude and longitude) as a function of their temporal distance
(time) and the correlations across stations as a function of their spatial
and temporal distance. I think this structure is actually better and more
flexible. I am still trying to figure out how to determine the tradeoff
between spatial distance and temporal distance that I want. Also, as Ben
suggested this method appears to be quite inefficient. The model has been
running for 11 hours and is using around 12gb of RAM. I think I might
create groups by clustering the stations geographically to reduce the
computational burden of what is now 12,271x12,271 correlation matrix.
Dexter - lm::lm(), nlme::lme(), and lme4::lmer() all have the function
resid(model) that provides the residual for each observation in the model.
I used the package "ape" to calculate the Moran I.
model_spatial = lme(log(counts) ~ log(Population)
+Drive +Med_Income + Buff2 +Rain + Temperature
, data = Data, random = ~1|id, method = "ML" )
model_All = lmer(log(counts) ~ log(Population)
+Drive +Med_Income + Buff2 +Rain + Temperature
+ (1|id)
, data = Data )
Data$resid_spat = resid(model_spatial )
Data$resid_all = resid(model_All )
Thierry - I assume you meant "nlme *cant *do what you would like to do"?
I will check out the INLA, spaMM, and HSAR packages if the nlme workaround
does not actually work.
On Wed, Mar 22, 2017 at 11:02 AM, Ben Bolker <bbolker at gmail.com> wrote:
The spaMM package is also worth a try. A hack suggested a while ago by Dormann et al. is to make a single group that includes all observations, but (1) it's pretty inefficient and (2) I don't think you can have that *and* another, non-trivial grouping factor. On 17-03-22 10:46 AM, Dexter Locke wrote:
Dear list, You may consider also the citations below and HSAR package: https://cran.r-project.org/web/packages/HSAR/vignettes/hsar.html Dear Michael, Do you have comparable code for extracting the residuals of the model and testing for spatial autocorrelation that works with your second model "model_All" created with lme4::lmer ? What package contains the
"Moran.I"
function? - Dexter Dong, G., & Harris, R. (2014). Spatial Autoregressive Models for Geographically Hierarchical Data Structures. *Geographical Analysis*, n/a-n/a. http://doi.org/10.1111/gean.12049 Dong, G., Harris, R., Jones, K., & Yu, J. (2015). Multilevel Modelling
with
Spatial Interaction Effects with Application to an Emerging Land Market
in
Beijing, China. *PloS One*, *10*(6), e0130761. http://doi.org/10.1371/journal.pone.0130761 Dong, G., Ma, J., Harris, R., & Pryce, G. (2016). Spatial Random Slope Multilevel Modeling Using Multivariate Conditional Autoregressive
Models: A
Case Study of Subjective Travel Satisfaction in Beijing. *Annals of the American Association of Geographers*, *106*(1), 19?35. http://doi.org/10.1080/00045608.2015.1094388 On Wed, Mar 22, 2017 at 5:07 AM, Thierry Onkelinx <
thierry.onkelinx at inbo.be>
wrote:
Dear Michael, The correlation structures in nlme assume correlation among the residuals **within** the most detail level of the random effects. Residuals of observations originating from different levels of the random effects are assumed to be uncorrelated. So nlme can do what you would like to do. As Ben already mentioned, INLA is useful as it allows for spatially correlated random effects. You can find information on the INLA website (www.r-inla.org) and in a few books. e.g. - Blangiardo & Cameletti (2015) Spatial and Spatio-temporal Bayesian Model with R - INLA - Zuur et al (in press) Beginner's Guide to Spatial, Temporal and Spatial-Temporal Ecological Data Analysis with R-INLA: Using GLM and GLMM Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-03-21 22:19 GMT+01:00 Michael Hyland <mhyland at u.northwestern.edu>:
Thanks for the quick response. This is a subset. Full dataset is 12,266 observations across 530 groups
(or
Bikeshare stations). for model_spatial.gau <- update(model_spatial, correlation = corGaus(form
= ~
latitude +longitude ), method = "ML") and model_spatial.gau <- update(model_spatial, correlation = corGaus(form
= ~
latitude +longitude| id ), method = "ML") the error is "cannot have zero distances in "corSpatial" which I assume
is
due to the repeated observations having the same exact lat and lon; therefore zero distance Moreover, when I do anything with '| id' I think the model only
accounts
for 'within-group' correlations, not across stations On Tue, Mar 21, 2017 at 4:12 PM, Ben Bolker <bbolker at gmail.com> wrote:
Your approach seems about right. - What precisely does "unsuccessful" mean? warnings, errors, ridiculous answers? - Is this your whole data set or a subset? - centering and scaling predictors is always worth a shot to fix numeric problems - INLA is more powerful than lme for fitting spatial correlations, although it's a *steep* learning curve ... On Tue, Mar 21, 2017 at 5:08 PM, Michael Hyland <mhyland at u.northwestern.edu> wrote:
Hi, I'm new to the listserv. A shortened version of my dataset is below. I am developing a model
to
forecast monthly ridership at Bikeshare stations. I want to predict
'Cnts'
as a function of 'Population' - 'Temperature'. The dataset is
unbalanced
(unequal number of observations for each station) and most of
covariates
do
not vary over time, but a few do. I have successfully used lmer() and lme() in R to capture the
dependency
between the error terms for repeated observations from a given
station
('id').
model_spatial = lme(log(counts) ~ log(Population)
+Drive +Med_Income + Buff2 +Rain + Temperature
, data = Data, random = ~1|id, method = "ML" )
model_All = lmer(log(counts) ~ log(Population)
+Drive +Med_Income + Buff2 +Rain + Temperature
+ (1|id)
, data = Data )
However, a Moran's I test of the residuals suggests that the
residuals
are
spatially correlated. station.dists <- as.matrix(dist(cbind(Data$longitude,
Data$latitude)))
station.dists.inv <- 1/station.dists station.dists.inv[is.infinite(station.dists.inv)] <- 0 #Distance
value is
inf for repeated observations from the same station Data$resid_all = resid(model_spatial) Moran.I(Data$resid_all, station.dists.inv) Hence, I need to develop a model in R that accounts for spatial
correlation
across stations, while simultaneously capturing correlations between observations from a single station. I've tried the following updates
to
the lme() model, but have been unsuccessful. model_spatial.gau <- update(model_spatial, correlation = corGaus(form
= ~
latitude +longitude ), method = "ML") model_spatial.gau <- update(model_spatial, correlation = corGaus(form
= ~
latitude +longitude| id ), method = "ML") Is there a way to formulate the correlation matrix in lme() or lmer()
such
that the correlation between repeated obvservations of a given
station
*and*
the spatial autocorrelation between stations is accounted for? year month id Cnts latitude longitude Population Drive Med_Income
Buff2
Rain
Temperature
2015 8 2 2597 41.87264 -87.62398 4256 0.3418054 76857 127 0.07 71.8
2015 9 2 2772 41.87264 -87.62398 4256 0.3418054 76857 128 0.15 69
2015 10 2 684 41.87264 -87.62398 4256 0.3418054 76857 128 0.07 54.7
2015 11 2 394 41.87264 -87.62398 4256 0.3418054 76857 128 0.15 44.6
2015 12 2 148 41.87264 -87.62398 4256 0.3418054 76857 129 0.16 39
2016 1 2 44 41.87264 -87.62398 4256 0.3418054 76857 129 0.03 24.7
2015 5 3 2303 41.86723 -87.61536 16735 0.4312349 75227 90 0.15 60.4
2015 6 3 3323 41.86723 -87.61536 16735 0.4312349 75227 98 0.24 67.4
2015 7 3 5920 41.86723 -87.61536 16735 0.4312349 75227 98 0.09 72.3
2015 8 3 4405 41.86723 -87.61536 16735 0.4312349 75227 98 0.07 71.8
2015 9 3 3638 41.86723 -87.61536 16735 0.4312349 75227 99 0.15 69
2015 10 3 2061 41.86723 -87.61536 16735 0.4312349 75227 99 0.07 54.7
2015 11 3 1074 41.86723 -87.61536 16735 0.4312349 75227 99 0.15 44.6
2015 12 3 374 41.86723 -87.61536 16735 0.4312349 75227 100 0.16 39
2016 1 3 188 41.86723 -87.61536 16735 0.4312349 75227 100 0.03 24.7
2016 2 3 474 41.86723 -87.61536 16735 0.4312349 75227 100 0.04 30.4
2015 6 4 2968 41.85627 -87.61335 16735 0.4312349 75227 68 0.24 67.4
2015 7 4 4266 41.85627 -87.61335 16735 0.4312349 75227 68 0.09 72.3
2015 8 4 3442 41.85627 -87.61335 16735 0.4312349 75227 68 0.07 71.8
2015 9 4 2552 41.85627 -87.61335 16735 0.4312349 75227 69 0.15 69
2015 10 4 1301 41.85627 -87.61335 16735 0.4312349 75227 69 0.07 54.7
Thanks,
Mike Hyland
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models