Memory limit problems in R / import of maps
On Tue, 22 Apr 2008, Roger Bivand wrote:
On Tue, 22 Apr 2008, Edzer Pebesma wrote:
Tomislav Hengl wrote:
Just one last thing,
Two?
if R is reporting an error message, that does not necessarily mean that there is a memory limit problem with the machine
Correct, the error message should give a hint,
- shouldn't there be a way to implement memory handling in R in a more efficient way?
R is open source, so go ahead and modify it. As advice: first consider the resources you have, and consider the other options just kindly provided to you. PCs with 8 GB RAM now start at 500 euros, so why process massive data sets on your 2 GB notebook?
Even on my 2001 1GB desktop (dual Xeon, but hey, not exactly high end
now!), reading the 25 rasters of 1000x1450 cells went like a song:
library(rgdal)
grd <- GridTopology(c(0.5, 0.5), c(1,1), c(1000, 1450))
set.seed(1)
for (i in 1:25) {
  dta <- sample(1:10, prod(slot(grd, "cells.dim")), replace=TRUE)
  SGDF <- SpatialGridDataFrame(grd, data=data.frame(band1=dta))
  fn <- paste("kasc", i, ".tif", sep="")
  writeGDAL(SGDF, fn, drivername="GTiff", type="Byte")
}
gc()
# pattern is a regular expression, not a shell glob
fnames0 <- list.files(pattern="^kasc[0-9]+\\.tif$")
fnames <- gsub("\\.tif$", "", fnames0)
r1 <- readGDAL(fnames0[1], silent=TRUE)
Grd <- slot(r1, "grid")
n <- dim(slot(r1, "data"))[1]
indata <- matrix(0, nrow=n, ncol=length(fnames0))
for (i in seq_along(fnames0)) {
  ingrid <- readGDAL(fnames0[i], silent=TRUE)
  indata[,i] <- ingrid[[1]]
  cat(i, "\n")
  gc()
}
gc()
colnames(indata) <- fnames
str(indata)
df <- as.data.frame(indata)
gc()
rm(indata)
str(df)
gc()
ingrid <- SpatialGridDataFrame(Grd, data=df)
gc()
rm(df)
gc()
library(adehabitat)
outkasc <- spixdf2kasc(ingrid)
...
Your problem is in spixdf2kasc() in adehabitat, which makes many copies of
the input object. It may even be possible to inject the
readGDAL(fnames0[i], silent=TRUE)[[1]]
line into:
lll <- lapply(1:length(uu), function(i) c(as.matrix(sg[i])))
                                                    ^^^^^
in spixdf2kasc(), which is arguably not using the best syntax for just
getting the data out of the columns in its copy sg of ingrid. So
contributing an optimised version of spixdf2kasc would be helpful - but
maybe 2GB would work - I was swapping at 1.9GB, but I only have 1GB, so
maybe you'd get through. It's mostly a matter of watching where copying
may occur and avoiding it.
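To illustrate the point about sg[i], here is a minimal sketch in base R only. It assumes a plain data.frame (sg_data, a made-up stand-in for the attribute table of sg, not the actual adehabitat internals) and compares the matrix route used in the line quoted above with pulling each column out directly with [[ ]]:

```r
# sg_data is a hypothetical stand-in for the data held in sg
set.seed(1)
n <- 1000L
sg_data <- data.frame(band1 = sample(1:10, n, replace = TRUE),
                      band2 = sample(1:10, n, replace = TRUE))

# Route in the quoted line: one-column data frame -> matrix -> plain vector
via_matrix <- lapply(seq_along(sg_data),
                     function(i) c(as.matrix(sg_data[i])))

# Direct route: extract each column as a vector, no intermediate matrix
direct <- lapply(seq_along(sg_data), function(i) sg_data[[i]])

# Both routes should hold the same values
print(identical(via_matrix, direct))
```

The direct route skips the intermediate one-column data frame and matrix, which is where the extra copies are likely to come from on a large grid.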
With a little tidying, spixdf2kasc() will run on 1GB for this 370MB SpatialGridDataFrame, taking just another 370MB by copying the data frame just once. If anyone would like a copy, please contact me off-line. The kasc object is in fact just the SGDF data frame with the rows in reversed order, but since enfa() in adehabitat uses a kasc object, you probably need to go this way. Probably you'll be using gc() a good deal without a little more memory, though. Getting the output out to an SGDF object ought to be possible too, ask about that later if need be.

Roger
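For anyone wanting to check what an object and one copy of it cost in memory on their own machine, a rough sketch with object.size() from base R follows. The grid is deliberately scaled down (1000 x 145 cells rather than 1000 x 1450), and the object names are only illustrative:

```r
# Scaled-down grid, chosen only to keep the example small
n_cells  <- 1000L * 145L
n_layers <- 25L

# One double-precision matrix: 8 bytes per value plus a small header
indata <- matrix(0, nrow = n_cells, ncol = n_layers)
expected_bytes <- n_cells * n_layers * 8

# Converting to a data frame makes one more copy of the values
indata_df <- as.data.frame(indata)

print(object.size(indata), units = "MB")
print(object.size(indata_df), units = "MB")
```

While both objects exist, roughly twice the data size is held in memory, which is why the code earlier in this thread calls rm() and gc() straight after each conversion.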
The new Braun & Murdoch introduction to statistical programming with R is a very useful reference in cases like these. In particular: assign large objects once and fill them up, and if they are already OK, don't over-check them.

In addition, Dylan and Edzer made good points about the potential spuriousness of apparent resolution: suitability is in patches, isn't it, and the outcome won't be more or less significant with greater n? If you aren't using proximity, you could just train on a sample from the 25-layer full data set, and predict back from the fitted model, couldn't you? The conversion function to kasc does accept SpatialPixelsDataFrame objects, but unfortunately promotes them to full grids, so the sample would need to be a rectangular subset, I'm afraid.

Maybe try the adelist for more help on their side; the adehabitat maintainer is helpful when possible.

Hope this helps,

Roger
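The "train on a sample, predict back" idea can be sketched in base R as below. This is only an illustration under stated assumptions: the data are simulated, the column names (layer1 to layer3, presence) are made up, and a logistic glm() stands in for the habitat model; it is not enfa() and does not use a kasc object.

```r
set.seed(42)

# Simulated "full grid": one row per cell, one column per layer (hypothetical)
n_cells <- 20000L
full <- data.frame(layer1 = rnorm(n_cells),
                   layer2 = runif(n_cells),
                   layer3 = rnorm(n_cells))
full$presence <- rbinom(n_cells, 1,
                        plogis(-1 + 1.5 * full$layer1 - 2 * full$layer2))

# Fit on a random sample of cells only
train_rows <- sample(n_cells, 2000L)
fit <- glm(presence ~ layer1 + layer2 + layer3,
           data = full[train_rows, ], family = binomial)

# Predict back onto every cell of the full data set
suitability <- predict(fit, newdata = full, type = "response")
print(summary(suitability))
```

Only the sampled rows go through model fitting; the full table is touched once, at prediction time. A random sample of cells like this would not satisfy the rectangular-subset constraint mentioned for the kasc conversion.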
-- Edzer
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no