Fellow spatial statisticians:
I am trying to align a number of world-wide datasets at 0.1 degree
resolution. I figured that the easiest way to do that is to create a
base dataset whose rownames form an index to which all my
variable-sized point datasets can be related. I therefore created a
shapefile base dataset containing approximately 6.5 million points and
had hoped to read it into R and then link all my other data to it.
Unfortunately, I ran into memory allocation problems (see the error
messages below).
My hunch is that these can be overcome by changing some settings
(hopefully without having to recompile R from source). Do you have
suggestions as to what these settings are, and what the practical hard
limit is? 6.5 million points is large but not uncommon these days, so I
figured this should be doable without embarking on major efforts.
Alternatively, would it save me a lot of memory if I read the data into
a SpatialPixels or SpatialGrid structure instead?
Cheers,
Jochen
> basepoints = readOGR(".", "basepoints")
OGR data source with driver: ESRI Shapefile
Source: ".", layer: "basepoints"
with 3465355 rows and 7 columns
Feature type: wkbPoint with 2 dimensions
Warning in data.frame(dlist) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in data.frame(dlist) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in data.frame(dlist) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in data.frame(dlist) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in as.data.frame.integer(x[[i]], optional = TRUE) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in as.data.frame.integer(x[[i]], optional = TRUE) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in as.data.frame.integer(x[[i]], optional = TRUE) :
Reached total allocation of 1535Mb: see help(memory.size)
Warning in as.data.frame.integer(x[[i]], optional = TRUE) :
Reached total allocation of 1535Mb: see help(memory.size)
Error: cannot allocate vector of size 13.2 Mb
Thread: where to tweak memory allocation settings
3 messages · Jochen Albrecht, Robert J. Hijmans, Roger Bivand
On Thu, 8 Oct 2009, Jochen Albrecht wrote:
> Fellow spatial statisticians: I am trying to align a bunch of world-wide
> datasets at 0.1 degree resolution. I figured that the easiest way to do
> that is to create a dataset where the rownames form the index that allows
> to relate all my variable-sized point datasets to. I therefore created a
> shapefile base dataset that contains appr. 6.5 million points and had
> hoped that I could read this into R to then link all my other data to.
> Unfortunately, I ran into memory allocation problems (see error messages
> attached to this email).
You have not reported sessionInfo() or your operating system (Windows?).
See the R for Windows FAQ for memory management there. In fact, the
problem isn't the number of points (here 3.5 million, presumably the
raster cells with data, not 6.5 million for global coverage); it is more
likely the combination of character columns and their conversion into
factor form. Do they contain long character strings?
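On Windows, the allocation cap from the warnings (1535 Mb) can be
inspected and raised without recompiling. A minimal sketch; memory.limit()
and memory.size() are Windows-only, and the size requested below is an
arbitrary illustration whose effect depends on the OS and on 32- vs
64-bit R:

```r
# Windows-only: query and adjust R's memory ceiling (see help(memory.size))
sessionInfo()               # report R version, OS, and loaded packages
memory.limit()              # current cap in Mb (1535 Mb in the output above)
memory.limit(size = 4000)   # request a higher cap; actual headroom depends
                            # on the OS address space and 32- vs 64-bit R
```

On a 32-bit Windows build the practical ceiling remains around 2-3 Gb
regardless of the requested size, which is why moving to a more compact
representation may matter more than raising the limit.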
> My hunch is that these can be overcome by just changing some settings
> (hopefully without having to recompile R sources). Do you have
> suggestions what these are and what the practically hard limit is? 6.5
> million points is large but not uncommon these days, so I figured that
> this should be doable without embarking on major efforts. Alternatively,
> would it save me a lot of memory space if I tried to read this into a
> SpatialPixel or SpatialGrid structure?
To read into a SpatialPixelsDataFrame object you still need points, or
you can go through a SpatialGridDataFrame. For a SpatialGridDataFrame,
store the data as a multiband GeoTiff, for example. If, however, there
is something odd (long character strings as attributes), the problem
will be the same. One advantage of going through a SpatialGridDataFrame
is that the 3.5 million row names do not get generated and do not take
up space; readOGR() always generates feature IDs from the input geometry
FIDs. If the support is grid support, use a gridded representation, as
in Robert's suggestion of using the raster package.

Hope this helps,
Roger
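The raster-package route suggested here might look like the sketch
below. The file and column names are hypothetical, and it assumes the
base data have been written out as a multiband GeoTiff; raster keeps
cell values on disk rather than loading all of them into memory:

```r
library(raster)                    # Robert Hijmans' raster package

# A multiband GeoTiff is read lazily: only metadata is held in memory,
# cell values are pulled from disk as needed.
b <- brick("basedata.tif")         # hypothetical file name

# Hypothetical point table with lon/lat columns; extract() looks up the
# grid values at those coordinates without materializing the full grid.
pts  <- read.csv("stations.csv")
vals <- extract(b, pts[, c("lon", "lat")])
```

This sidesteps both the 3.5 million generated row names and the
data.frame construction that triggered the allocation warnings.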
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax: +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no