
memory issue on R with Linux 64

7 messages · Edzer Pebesma, Robert J. Hijmans, Roger Bivand +1 more

#
Hi List,

I get an error using readGDAL{rgdal}: cannot allocate vector of size 3.1 Gb

I am using 64-bit Linux (openSUSE 11) with 4 GB of swap, 4 GB of RAM, and R 2.8.0.

The load monitor shows that most of the RAM is used up, and then, just as swap use starts increasing, R returns the error.

Is there anything I should do within R to circumvent this?


Any help appreciated
Thanks
Herry
#
Well, this doesn't come as a surprise; if it did for you, then you 
haven't read the list archives closely.

R was designed for analysing statistical data, which usually 
doesn't run to billions of observations, and not for 
analysis/processing of large grids/imagery.

rgdal has infrastructure to let you work through huge grids by reading and 
writing only parts at a time; you can find pointers to this in the rgdal 
documentation and in examples on the list. I don't know of functions that do 
this automatically for you; maybe the raster package on R-Forge?
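A minimal sketch of that windowed reading with rgdal's lower-level functions (the file name is hypothetical; offsets follow getRasterData's row/column convention, and nothing beyond one block is ever held in memory):

```r
library(rgdal)

# Read only the metadata, not the raster itself
info  <- GDALinfo("huge_grid.tif")   # hypothetical file
nrows <- info[["rows"]]
ncols <- info[["columns"]]

ds <- GDAL.open("huge_grid.tif")
block <- 1000                        # rows per chunk
for (start in seq(0, nrows - 1, by = block)) {
  n <- min(block, nrows - start)
  # Read 'n' rows starting at row offset 'start'
  chunk <- getRasterData(ds, offset = c(start, 0),
                         region.dim = c(n, ncols))
  # ... process 'chunk' here, e.g. accumulate summary statistics ...
}
GDAL.close(ds)
```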

Another option is to buy more RAM. I am using Debian on a 32 GB 
workstation; I was (genuinely) surprised how little it cost. It saves me 
time.
--
Edzer
Alexander.Herr at csiro.au wrote:

#
On Thu, 29 Jan 2009, Alexander.Herr at csiro.au wrote:

This is a tile of your 73K by 80K raster, right? One possibility is to use 
smaller tiles; another is to get more memory (as Edzer wrote); a third is to use 
lower-level functions in rgdal to avoid duplication (and repeated gc()): 
in readGDAL the data read by getRasterData() are copied, so memory usage is 
at least doubled.

Do you need to read the raster? If this is the overlay problem, you should 
be able to use the output of GDALinfo for your raster to build a 
GridTopology and SpatialGrid, and overlay (tiles of) that on the 
SpatialPolygons (tiles of that because overlay() will generate cell centre 
coordinates and do point in polygon, so you're still stuck with too many 
coordinates). The next issue would be to copy out the polygon IDs, or the 
extracted values, as a raster - here the forthcoming raster package on 
R-Forge may be what you need.
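A sketch of that grid-construction step, building the sp objects from GDALinfo metadata without reading any raster data (the file name is hypothetical; the named entries used are those GDALinfo documents):

```r
library(rgdal)
library(sp)

info  <- GDALinfo("huge_grid.tif")   # hypothetical file; metadata only
nrows <- info[["rows"]]
ncols <- info[["columns"]]

# GridTopology wants the centre of the lower-left cell,
# the cell sizes, and the grid dimensions
gt <- GridTopology(
  cellcentre.offset = c(info[["ll.x"]] + info[["res.x"]] / 2,
                        info[["ll.y"]] + info[["res.y"]] / 2),
  cellsize  = c(info[["res.x"]], info[["res.y"]]),
  cells.dim = c(ncols, nrows))
sg <- SpatialGrid(gt)
# 'sg' (or row-wise tiles of it) can now be overlaid on the
# SpatialPolygons object without the full raster in memory
```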

Roger

#
Herry,

This is how you can do it with the raster package (revision 209; a build
should be available within 24 hours).

Following Edzer's example:

require(raster)
library(maptools)
# read a SpatialPolygonsDataFrame
nc <- readShapePoly(system.file("shapes/sids.shp",
      package="maptools")[1], proj4string=CRS("+proj=longlat +datum=NAD27"))
# create a new RasterLayer object from the polygon bounding box and set rows / cols
rs <- rasterFromBbox(nc, nrows=54, ncols=176)
# transfer polygons to raster, using values of column 13 in the polygon data frame
rs <- polygonsToRaster(nc, rs, field=13)
# plot, either directly
plot(rs)
plot(nc, add=TRUE, border="blue")
# or via sp for a real map
x11()
spplot(asSpGrid(rs), col.regions=bpy.colors())

also see the example in ?polygonsToRaster

The polygonsToRaster function works for very large rasters (row-by-row
processing if you provide an output file name). I have only tested it
for a very limited number of very simple cases, so user beware...
The algorithm needs optimization for speed, so that might be a problem
for your very large grids (particularly if your polygons are also
complex). It also needs some tweaking (and options) for deciding when
a polygon is in a grid cell. As is, the intention is that a polygon
has to overlap with the center of a cell to be considered inside.

Robert
On Thu, Jan 29, 2009 at 3:09 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
#
On Thu, 29 Jan 2009, Robert Hijmans wrote:

It would be very useful to compare the output of this procedure, which 
looks very promising, with Starspan:

http://starspan.casil.ucdavis.edu/doku/

to see which subpixel heuristics may be helpful.

Roger

#
Hi Edzer,

I didn't expect R to do magic (although it does it better than most programs). What surprised me was that under Linux there was no use of swap space; I had hoped memory handling would include swap space.

Cheers
Herry


#
Thanks Robert,

PolygonsToRaster may be a way, but I only want to transfer onto the raster where the raster is not NA (i.e. a subset of the polygon area). So what I would need is a way to exclude all polygon parts where the raster is NA.

In theory I could do the whole transfer and then stamp out all pixels of the new raster where the original raster has NA, or I could delete the polygons where the raster has NA prior to transfer. However, it would be easier to do the transfer in one step.
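The two-step "stamp out" idea can be sketched on plain cell-value vectors (synthetic data standing in for the two rasters; with a real raster layer the same masking would be one assignment on its values):

```r
# Synthetic cell values standing in for the two rasters,
# laid out as plain vectors (one element per cell)
original   <- c(1.5, NA, 2.0, NA, 3.5, 4.0)   # the existing raster
rasterized <- c(10, 11, 12, 13, 14, 15)       # the polygon transfer result

# "Stamp out" every cell where the original raster is NA
rasterized[is.na(original)] <- NA
rasterized
# [1] 10 NA 12 NA 14 15
```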

Cheers
Herry 



Dr Alexander Herr - Herry
CSIRO, Sustainable Ecosystems
Gungahlin Homestead
Bellenden Street
GPO Box 284
Crace, ACT 2601
 
Phone/www 
(02) 6242 1542; 6242 1705(fax)
0408679811 (mob)

home: www.csiro.au/people/Alexander.Herr
Webadmin ABS: http://ausbats.org.au
Sustainable Ecosystems: www.cse.csiro.au
--------------------------------------------

