spaMM::fitme() - a glmm for longitudinal data that accounts for spatial autocorrelation

Dear Thierry,

Please provide a reproducible example so that we know what you have
actually done.

Best,

F.

On 14/07/2020 at 20:00, Thierry Onkelinx wrote:
Dear François and Sarah,

INLA seems more efficient. I ran a model with a Matérn correlation
structure on 13K locations (1 observation per location) in under 10
minutes on a laptop with 16 GB of RAM.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE 
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be


On Tue, 14 Jul 2020 at 18:22, Francois Rousset
<francois.rousset at umontpellier.fr> wrote:

    Dear Sarah,

    On 14/07/2020 at 16:55, Sarah Chisholm wrote:
    > Hi Mollie, thank you for your suggestion. glmmTMB seems like a good
    > option for my needs as well. In your sample code above, can you
    > explain what the term 'group' does in matern(pos+0|group)? Does this
    > allow the spatial correlation structure to be applied to specific
    > groupings in the data (in my case, for example, by 'continent')?
    >
    > Francois, thank you for this very clear answer. This is a very
    > convenient feature of the function! May I ask you a couple of other
    > questions about some issues that I've had with spaMM::fitme()?
    >
    > In particular, when I try fitting this model to a large data set
    > (~14 000 rows x 7 columns, ~2 MB), the model will run for an extended
    > period of time, to the point where I've had to terminate the
    > computation. I've tried applying the suggestions that are mentioned in
    > the user guide, i.e. setting init=list(lambda=0.1)
    > and init=list(lambda=NaN). Implementing init=list(lambda=0.1) returned
    > an error suggesting that there was a lack of memory, while running the
    > model with init=list(lambda=NaN) also ran for an extended period of
    > time without completing. Is there something else I can do to speed up
    > the fit of these models?
    >
    > I've had a similar problem with an even larger data set (~185 000 rows
    > x 8 columns, ~21 MB), where, when I try running the model, this error
    > is returned immediately:
    >
    > Error in ZA %*% xmatrix : Cholmod error 'problem too large' at file
    > ../Core/cholmod_dense.c, line 105
    >
    > I've tried running this model on two devices, both with a 64-bit OS
    > with Windows 10, one with 32 GB of RAM and the other with 64 GB. I've
    > gotten the same error from both devices. Is there a way that fitme()
    > can accommodate these large data sets?
    spaMM can handle large data sets, but the first issue to consider here
    is the number of distinct locations for the spatial random effect. The
    large correlation matrices of geostatistical models will always be a
    problem, both in terms of memory requirements and of potentially huge
    computation times. My guess from past experiments is that one should
    still be able to fit models with ~10K locations within a few days on a
    computer with <60 GB of RAM (given perhaps some tinkering with the
    arguments), so at least the data set of 14 000 rows should be feasible,
    particularly if the number of locations is smaller.
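
To make the memory point concrete, here is a rough back-of-the-envelope
sketch (plain Python arithmetic, not spaMM code). It assumes that a single
dense n x n correlation matrix is stored as 8-byte doubles, and, as a worst
case, that every row of each data set is a distinct location; the function
name is hypothetical. An actual fit needs more than one such matrix, so
these figures are at best lower bounds.

```python
def dense_corr_matrix_gib(n_locations, bytes_per_value=8):
    """Memory in GiB for ONE dense n x n matrix of doubles (assumed storage)."""
    return n_locations ** 2 * bytes_per_value / 2 ** 30


# Location counts taken from the thread; treating every row as a distinct
# location is an assumption (longitudinal data usually has fewer locations).
for n in (10_000, 14_000, 185_000):
    print(f"{n:>7} locations: {dense_corr_matrix_gib(n):8.2f} GiB")
```

Under these assumptions, one matrix takes well under 2 GiB for 10 000 or
14 000 locations, but about 255 GiB for 185 000 locations, which is far
beyond a 32 GB or 64 GB machine. This is why the number of distinct
locations, rather than the number of rows, is the quantity to check first.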

    Anyone planning to analyze large spatial data sets should anticipate
    these problems and check for themselves whether there is any practical
    alternative suitable for their particular problem. The discussion in
    section 6.2 of the "gentle introduction" to spaMM may then be useful.

    Best,

    F.

    >
    > Thank you,
    >
    > Sarah
