Hi,
I have a spatial data set with many observations (~50,000) and would like to
keep as much data as possible. There is spatial dependence, so I am
attempting a mixed model in R with a spherical variogram defining the
correlation as a function of distance between points. I have tried nlme,
lme, glmmML, and glmmPQL. In all cases the matrix needed (which seems to be
on the order of (N^2 - N)/2 entries) is too large for my machine to handle,
even with memory.limit and the virtual memory maxed out in Vista. Past the
range of my variogram
(which I have a good estimate of), the matrix that R is trying to allocate
will have 0 values (I believe). Therefore, it seems wasteful to allocate
the full matrix. Is there a way to 'trim' it so that the matrix size (and
hopefully the processing time) is reduced? Further, it seems the matrix is
being filled with double-precision values. Is there a way to use lower
precision and so save memory? If I do find a way (I will probably be forced
to decrease N anyway), for a logistic regression, which of the functions I mentioned
is likely to execute more quickly with usual settings/output? I'm asking
for a rough idea in advance because of processing time limits. I believe
glmmPQL will likely be slower because of its repeated calls to lme.
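For reference, the kind of call I'm attempting looks roughly like this (a
sketch only; the column names y, x, east, north and the single dummy group
are placeholders for my actual data):

library(MASS)   # glmmPQL
library(nlme)   # corSpher

dat$grp <- factor(1)   # a single group, so the spatial correlation spans all points

fit <- glmmPQL(y ~ x,
               random = ~ 1 | grp,
               family = binomial,
               correlation = corSpher(form = ~ east + north, nugget = TRUE),
               data = dat)

Thanks for any advice/insight. -seth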
Update on the above. I sampled my data to create a 10,000-observation data set.
I then tried lme with correlation = corSpher and only one predictor, as a
test. I set my memory.limit to the max allowable. It ran for a while then
returned
Error: cannot allocate vector of size 64.0 Mb.
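(For concreteness, the test was roughly as follows; the column names, the
single dummy group, and the exact memory.limit value are placeholders for my
actual setup.)

library(nlme)
memory.limit(size = 4095)                 # i.e. the max my machine allows
sub <- dat[sample(nrow(dat), 10000), ]    # the 10,000-observation sample
sub$grp <- factor(1)                      # one group, so corSpher spans all points
test <- lme(y ~ x, random = ~ 1 | grp,
            correlation = corSpher(form = ~ east + north),
            data = sub)

My rough arithmetic on the dense correlation matrix, assuming 8-byte doubles:

N <- 50000
N * (N - 1) / 2 * 8 / 2^20                # ~9537 Mb for the lower triangle alone
N <- 10000
N * (N - 1) / 2 * 8 / 2^20                # ~381 Mb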
I can see how 50K obs busted it. But 64 Mb? Perhaps there is another limit
set by the lme function? -seth