
Large dataset

2 messages · KAM Tin Seong, Roger Bivand

On Wed, 11 Jul 2007, KAM Tin Seong wrote:

            
The obvious first question is why? If you take a sample of 10000 points, 
you will get a very good approximation of the khat value at the s-bins - 
so why try all 360K? Either take tiles of your data set, or some other 
partition - since you are assuming anyway that the process is the same 
over the whole area, aren't you? You will certainly find that simulating 
from CSR (complete spatial randomness) with 360K points to put an envelope 
on the result will also be memory intensive.
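
A minimal sketch of the sampling idea, not the original poster's code: the 
360K points are replaced by simulated uniform points, and the unit-square 
polygon and the s-bins are assumptions made up for illustration. It draws 
10000 points at random and, if splancs is installed, passes them to khat().

```r
# Hypothetical stand-in for the real data: 360,000 points in the unit square.
set.seed(1)
n_all <- 360000
xy_all <- cbind(x = runif(n_all), y = runif(n_all))

# Draw 10,000 rows at random, without replacement.
idx <- sample(nrow(xy_all), 10000)
xy_sub <- xy_all[idx, , drop = FALSE]

# Assumed study window (a simple square) and assumed distance bins.
poly <- cbind(c(0, 1, 1, 0), c(0, 0, 1, 1))
s <- seq(0.01, 0.1, by = 0.01)

# Only run the K function estimate if splancs is available.
if (requireNamespace("splancs", quietly = TRUE)) {
  pts <- splancs::as.points(xy_sub[, 1], xy_sub[, 2])
  k_sub <- splancs::khat(pts, poly, s)
  # Under CSR the theoretical value is K(s) = pi * s^2.
  print(cbind(s = s, khat = k_sub, csr = pi * s^2))
}
```

Repeating the draw with a few different seeds gives a rough idea of how 
stable the estimate is across samples.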

By the way, always include your specific code, and the output of 
sessionInfo(). It may be that you have a very complex polygon window, 
and/or too much detail in the s-bins. Do you get the same memory 
constraint in spatstat, or equivalently in spatial?
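
As a rough sketch of what such a report and cross-check could look like, 
assuming simulated points in a unit-square window as a stand-in for the real 
data: sessionInfo() records the R version and loaded packages, and Kest() is 
the spatstat counterpart of the K function estimate.

```r
# Record the R version, platform and attached packages for the report.
si <- sessionInfo()
print(si)

# Hypothetical stand-in data: 10,000 uniform points in the unit square.
set.seed(1)
x <- runif(10000)
y <- runif(10000)

# Only run the comparison if spatstat is installed.
if (requireNamespace("spatstat", quietly = TRUE)) {
  library(spatstat)
  # A rectangular window keeps the geometry as simple as possible.
  X <- ppp(x, y, window = owin(c(0, 1), c(0, 1)))
  # Border correction is cheap to compute for large patterns.
  K <- Kest(X, correction = "border")
  print(K)
}
```

If the simple rectangular window runs without trouble but the real polygon 
does not, that points at the window rather than the number of points.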

I assume that you are looking to see if the point pattern is clustered, is 
that correct? If you really need to use all the points, we'd need to debug 
the command to see where the memory allocation is occurring, since it may 
be an avoidable copying operation somewhere. More RAM is another option.
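
One simple way to see how much memory a single command needs is to reset the 
garbage collector's counters, run the command, and read off the peak. The 
sketch below uses a pairwise distance matrix on 2000 simulated points as a 
hypothetical stand-in for the failing command; it is not a claim about how 
the original command allocates memory.

```r
# Hypothetical stand-in for the memory-hungry step: a full pairwise
# distance matrix, which grows with the square of the number of points.
set.seed(1)
xy <- cbind(runif(2000), runif(2000))

invisible(gc(reset = TRUE))   # reset the "max used" counters
d <- as.matrix(dist(xy))      # the step being examined
mem <- gc()                   # "max used" now reflects the peak since the reset

print(mem)
print(object.size(d), units = "MB")  # size of the result object itself
```

A peak that is several times the size of the result suggests intermediate 
copies are being made along the way.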

Roger