'xyValues' from 'raster' package slow
Karl and others: There is a new version of xyValues (in raster version 0.9.8-31) that --in a single test-- is 30 times faster then the previous version (for files read with rgdal; I have not yet updated the function for the 'native formats'). Robert
On Tue, Dec 1, 2009 at 10:29 AM, Robert J. Hijmans <r.hijmans at gmail.com> wrote:
Dear Karl,
I agree that xyValues is not very fast; I can probably speed it up a
bit (there can be a lot of '==' in there, I do not know where all the
gc()s comes from), but for now here are some suggestions:
# given
x = seq(0, 20, length = 100)
y = seq(55, 70, length = 100)
xy = expand.grid(x, y)
# reading all values into memory first will speed things up
# but this may not be feasible (memory wise):
r = raster("file.ers", values=TRUE)
z = xyValues(r, xy)
An alternative, assuming that what you want is all values for a
rectangular area (I cannot determine that from the example).
r = raster("file.ers")
sr <- rowFromY(r, min(x))
er <- rowFromY(r, max(x))
nr <- er-sr+1
sc <- colFromX(r, min(y))
ec <- colFromX(r, max(y))
r <- readBlock(r, sr, nr, sc, ec)
z <- values(r)
## Or this (same assumption; but much less verbose):
r = raster("file.ers")
rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
z <- values(rr)
# If you do not want all the values for the rectangle, it could still be
# quicker to do the above, ?followed by xyValues,
# under the assumption that rr is small enough to keep all values
# in memory while r was not.
r = raster("file.ers")
rr <- crop(r, extent(min(x), max(x), min(y), max(y)))
z = xyValues(rr, xy)
Hope this helps,
Robert
On Tue, Dec 1, 2009 at 4:11 AM, Karl Ove Hufthammer <karl at huftis.org> wrote:
Dear list members
I'm using the 'raster' package to load a rather large (~2 GB) file of
bathymetry data. To extract elevation/depth data for various positions,
I use the 'xyValues' function.
However, it turns that this operations is extremely slow for extracting
a even moderately large number of values. Here is in essence my code:
r = raster("file.ers")
x = seq(0, 20, length = 100)
y = seq(55, 70, length = 100)
xy = expand.grid(x, y)
z = xyValues(r, xy)
Am I doing something wrong here? Using 'readBin' or 'readBinFragments'
on the same file, to extract the same values, takes almost no time. But
then I have to calculate the indices, positions &c. myself, which is
tedious and easy get wrong (having to take into account half-pixel
shifts and other details).
Using Rprof indicates that a lot of 'self.time' is taken up by 'gc' (and
'=='), which does sound a bit strange.
--
Karl Ove Hufthammer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo