Hi List,
I get an error using readGDAL{rgdal}: cannot allocate vector of size 3.1 Gb
I am using Linux 64bit (opensuse 11) with 4 gig swap and 4 gig Ram and R 2.8.0.
The load monitor shows that most of Ram is used up and then when Swap use starts increasing, R returns the error.
Is there anything I should do within R to circumvent this?
Any help appreciated
Thanks
Herry
memory issue on R with Linux 64
Well, this doesn't come as a surprise; if it did for you, then you didn't read the list archives well. R was designed for analysing statistical data, which usually doesn't run to billions of observations, not for analysis/processing of large grids/imagery. rgdal has infrastructure to let you work through huge grids by reading and writing only parts at a time; you can find pointers to this in the rgdal documentation and in examples on the list. I don't know of functions that do this automatically for you; maybe the raster package on R-Forge? Another option is to buy more RAM. I am using Debian on a 32 Gb RAM workstation; I was surprised (really) by how little it cost. It saves me time. -- Edzer
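[Editorial note: the part-at-a-time reading Edzer mentions can be sketched with readGDAL's offset and region.dim arguments. This is an illustration added to the archive, not code from the thread; a tiny GeoTIFF written to tempdir() stands in for Herry's huge raster.]

```r
library(sp)
library(rgdal)

# Write a small demo GeoTIFF so the example is self-contained
# ("demo.tif" is a stand-in for the huge raster, not Herry's data).
gt <- GridTopology(c(0.5, 0.5), c(1, 1), c(100, 100))
sg <- SpatialGridDataFrame(gt, data.frame(band1 = runif(10000)))
fn <- file.path(tempdir(), "demo.tif")
writeGDAL(sg, fn, drivername = "GTiff")

# readGDAL can read a rectangular window instead of the whole file:
# offset = c(row, col) and region.dim = c(nrows, ncols), in cells.
chunk <- readGDAL(fn, offset = c(0, 0), region.dim = c(10, 100))

# So a huge raster can be processed in row blocks, never holding
# more than one block in memory at a time:
info <- GDALinfo(fn)
block <- 10
for (start in seq(0, info[["rows"]] - 1, by = block)) {
  nr <- min(block, info[["rows"]] - start)
  g <- readGDAL(fn, offset = c(start, 0),
                region.dim = c(nr, info[["columns"]]))
  # ... process g and write the result out here ...
}
```

Each block is released for garbage collection before the next one is read, which is what keeps peak memory use bounded.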
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Edzer Pebesma, Institute for Geoinformatics (ifgi), University of Münster, Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/ http://www.springer.com/978-0-387-78170-9 e.pebesma at wwu.de
On Thu, 29 Jan 2009, Alexander.Herr at csiro.au wrote:
Hi List,
I get an error using readGDAL{rgdal}: cannot allocate vector of size 3.1 Gb
This is a tile of your 73K by 80K raster, right? One possibility is to use smaller tiles, another to get more memory (as Edzer wrote), a third to use lower level functions in rgdal to avoid duplication (and repeated gc()) - in readGDAL the data read by getRasterData() are copied, so at least doubling memory usage. Do you need to read the raster? If this is the overlay problem, you should be able to use the output of GDALinfo for your raster to build a GridTopology and SpatialGrid, and overlay (tiles of) that on the SpatialPolygons (tiles of that because overlay() will generate cell centre coordinates and do point in polygon, so you're still stuck with too many coordinates). The next issue would be to copy out the polygon IDs, or the extracted values, as a raster - here the forthcoming raster package on R-Forge may be what you need. Roger
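[Editorial note: Roger's suggestion of building the grid from GDALinfo output alone can be sketched as below. This is an added illustration, not code from the thread; a tiny demo file written to tempdir() stands in for the 73K by 80K raster. The point is that the grid geometry is rebuilt from the file header only, without reading any pixel data.]

```r
library(sp)
library(rgdal)

# Write a tiny demo GeoTIFF so the example is self-contained.
gt0 <- GridTopology(c(0.5, 0.5), c(1, 1), c(20, 10))
fn <- file.path(tempdir(), "demo.tif")
writeGDAL(SpatialGridDataFrame(gt0, data.frame(band1 = 1:200)), fn)

# GDALinfo reads the header only: dimensions, origin, cell size.
info <- GDALinfo(fn)
gt <- GridTopology(
  cellcentre.offset = c(info[["ll.x"]] + info[["res.x"]] / 2,
                        info[["ll.y"]] + info[["res.y"]] / 2),
  cellsize  = c(info[["res.x"]], info[["res.y"]]),
  cells.dim = c(info[["columns"]], info[["rows"]]))
sg <- SpatialGrid(gt)
# sg (or tiles of it) can now be overlaid on the SpatialPolygons
# object, and the polygon IDs written back out as a raster.
```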
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
Herry,
This is how you can do it in the raster package (revision 209; a build should be available within 24 hours).
Following Edzer's example:
require(raster)
library(maptools)
# read a SpatialPolygonsDataFrame
nc <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
                    proj4string=CRS("+proj=longlat +datum=NAD27"))
# create a new RasterLayer object from the polygon bounding box and set rows / cols
rs <- rasterFromBbox(nc, nrows=54, ncols=176)
# transfer polygons to raster, use values of column 13 in the polygon dataframe
rs <- polygonsToRaster(nc, rs, field=13)
# plot, either directly
plot(rs)
plot(nc, add=T, border="blue")
# or via sp for a real map
x11()
spplot(asSpGrid(rs), col.regions=bpy.colors())
also see the example in ?polygonsToRaster
The polygonsToRaster function works for very large rasters (row by row processing if you provide an output file name). I have only tested it for a very limited number of very simple cases, so user beware... The algorithm needs optimization for speed, so that might be a problem for your very large grids (particularly if your polygons are also complex). It also needs some tweaking (and options) for deciding when a polygon is in a grid cell. As is, the intention is that a polygon has to overlap with the center of a cell to be considered inside.
Robert
On Thu, 29 Jan 2009, Robert Hijmans wrote:
It would be very useful to compare the output of this procedure, which looks very promising, with Starspan: http://starspan.casil.ucdavis.edu/doku/ to see which subpixel heuristics may be helpful. Roger
Hi Edzer,
I didn't expect R to do magic (although it does it better than most programs). What surprised me was that under Linux there was no use of swap space - I had hoped memory handling would include swap space.
Cheers
Herry
Thanks Robert,
polygonsToRaster may be a way, but I only want to transfer onto the raster where the raster is not NA (i.e. a subset of the polygon area). So what I would need is a way to exclude all polygon parts where the raster is NA. In theory I could do the whole transfer and then stamp out all pixels of the new raster where the original raster has NA, or I could delete the polygons where the raster has NA prior to transfer. However, it would be easier to do the transfer in one step.
Cheers
Herry
Dr Alexander Herr - Herry
CSIRO Sustainable Ecosystems, Gungahlin Homestead, Bellenden Street, GPO Box 284, Crace, ACT 2601
Phone: (02) 6242 1542; fax: (02) 6242 1705; mobile: 0408 679 811
home: www.csiro.au/people/Alexander.Herr
Webadmin ABS: http://ausbats.org.au Sustainable Ecosystems: www.cse.csiro.au
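[Editorial note: the "stamp out afterwards" approach Herry describes can be sketched with sp grids (the early raster API was still in flux at the time). mask_na and the object names below are illustrative, not functions from any package.]

```r
library(sp)

# 'orig' is the original raster (with NAs), 'rast' the rasterized
# polygons, both SpatialGridDataFrame objects on the same grid.
# Masking is then a single vectorized assignment on the data slot:
mask_na <- function(rast, orig) {
  rast@data[[1]][is.na(orig@data[[1]])] <- NA
  rast
}

# Tiny worked example on a 2 x 2 grid:
gt <- GridTopology(c(0.5, 0.5), c(1, 1), c(2, 2))
orig <- SpatialGridDataFrame(gt, data.frame(v = c(1, NA, 3, NA)))
rast <- SpatialGridDataFrame(gt, data.frame(id = c(10, 20, 30, 40)))
masked <- mask_na(rast, orig)
masked@data$id   # 10 NA 30 NA
```

For a raster too large for memory, the same assignment could be applied block by block within a chunked read/write loop rather than on the whole grid at once.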