gamm4 error with large dataset

Ben Bolker · 2014-04-30T16:23:30Z

On 14-04-30 12:03 PM, Daniel Hocking wrote:
> I am trying to predict daily water temperature from air temperature
> primarily but ideally would include other factors such as
> precipitation and landscape characteristics. I have paired air and
> water temperatures from 600+ sites over a ~10 year period. Some sites
> have daily temperatures for just a couple months and others for
> years, and some for a couple months sporadically in different years.
> I am trying to use a mixed effects gamm so I c

I can imagine that this problem is caused by the size of the
fixed-effect matrix.  A couple of thoughts (none of them practical, I'm
afraid):

  * I was going to say that it's too bad that we haven't yet managed to
    implement a sparse model matrix structure;
  * then I was going to say that a potential trick/workaround for this
    (for many-level _categorical_ variables) is to treat the factor as a
    random effect, then use devFunOnly/modular structure to fix the theta
    parameter for that variable at a large value, making it a pseudo-fixed
    effect and getting the benefits of (1) a little bit of regularization
    and (2) model matrix sparsity -- but doing this within gamm4 would be
    harder/require more hacking;
  * then I realized that your fixed-effect model matrix probably isn't
    sparse, because it looks like it's made up entirely of continuous
    covariates;
  * that got me thinking about the fact that some of your continuous
    covariates only vary at higher levels (i.e. Lat/Long and presumably
    Forest, Agriculture, elevation, etc.), and wondering whether there
    would be any way to save space by going back to the underlying model
    formulation and writing this out in terms of another multiplication of
    higher-level covariates times an indicator matrix ...
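
To make the last idea concrete, here is a toy sketch of the algebra only, in plain Python with no libraries. It is not something lme4 or gamm4 does, and it does not fix the error in this thread. The sites, covariate values and coefficients are all made up for illustration. If the site-level covariates are stored once per site in a small matrix W, and Z is the observation-by-site indicator matrix, then the full block of the model matrix is X = Z W, so X beta = Z (W beta): the product can be formed at the site level and then looked up per observation.

```python
# Hypothetical toy data: 3 sites, 7 daily observations in total.
site_of_row = [0, 0, 0, 1, 1, 2, 2]   # site index of each observation (the indicator matrix Z, stored as indices)

# W: one row per site, columns are (latitude, forest cover, elevation). Values are invented.
site_covs = [
    [42.1, 0.80, 310.0],
    [43.5, 0.35, 120.0],
    [41.9, 0.60, 540.0],
]

beta = [0.5, -2.0, 0.01]              # made-up coefficients for the three covariates


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Dense route: expand W to one row per observation (X = Z W), then multiply by beta.
X_dense = [site_covs[s] for s in site_of_row]
eta_dense = [dot(row, beta) for row in X_dense]

# Factored route: compute W beta once per site, then index by site (Z applied last).
site_eta = [dot(row, beta) for row in site_covs]
eta_factored = [site_eta[s] for s in site_of_row]

# Numbers that must be stored for this block under each representation.
dense_cells = len(site_of_row) * len(beta)                            # n_obs * p
factored_cells = len(site_covs) * len(beta) + len(site_of_row)        # n_sites * p + n_obs indices

assert eta_dense == eta_factored
print(dense_cells, factored_cells)
```

With 600+ sites and years of daily records, n_obs is far larger than n_sites, so the factored form would store far fewer numbers for these columns. Whether that helps in practice depends on the fitting code being able to work with the factored form, which is exactly the part that would need hacking.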

  ... all of which is fascinating (to me at least) but none of which
actually gets you any farther with your specific problem.  Sorry.

  Ben Bolker
