hi all I need to perform a zonal statistics by overlapping a DEM raster (ESRI grid binary) with some polygons (ESRI polygons); the problem is that I first of all need to read the raster, which is quite big: about 6 Gbyte; I know about the function readGDAL() by the fantastic package "rgdal" which is usually working very well with smaller size files but this time I got completely stuck (the pc freezes invariably) do you have any advice on how to deal with such a problem (big file)? thank you regards
how to read in R a big raster of about 6 Gb
7 messages · Massimo Bressan, Michael Sumner, Dominik Schneider
You can use the raster package, which will leave the values on disk if they are too big for memory. Use the function raster() to read the DEM, then use readOGR() to read the polygon shape file. You can then use extract() to get statistics on your DEM. I'm guessing it'll take a while to extract from disk, so if you are planning to extract multiple times, you should consider copying your DEM and setting the values equal to the cell numbers with values(DEM) <- 1:ncell(DEM). Extract this once, keep track of which cell numbers are in which polygon and then use the cell numbers to index the DEM when you need values by polygon. Hope this helps, Dominik On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan <mbressan at arpa.veneto.it> wrote:
hi all I need to perform a zonal statistics by overlapping a DEM raster (ESRI grid binary) with some polygons (ESRI polygons); the problem is that I first of all need to read the raster, which is quite big: about 6 Gbyte; I know about the function readGDAL() by the fantastic package "rgdal" which is usually working very well with smaller size files but this time I got completely stuck (the pc freezes invariably) do you have any advice on how to deal with such a problem (big file)? thank you regards
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
hi, thanks for your prompt reply in fact by using raster() funciontion "reading" DEM is quite fast and also reading the polygons by readOGR() went quite smooth... the problem then come with the extract() function that is taking definitely too long... I'm not quite sure I'm completely grasping your hint about values(DEM) <- 1:ncell(DEM) for speeding up the process please also consider that the SpatialPolygonDataFrame that I readOGR() has 23 fields; I need finally to associate the DEM values with the levels represented by one of these fields: do you think is a good idea eliminating the not necessary fields? max Il giorno Thu, 15/10/2015 alle 09.26 -0600, Dominik Schneider ha scritto:
You can use the raster package, which will leave the values on disk
if they are too big for memory.
Use the function raster() to read the DEM, then use readOGR() to read
the polygon shape file. You can then use extract() to get statistics
on your DEM. I'm guessing it'll take a while to extract from disk, so
if you are planning to extract multiple times, you should consider
copying your DEM and setting the values equal to the cell numbers with
values(DEM) <- 1:ncell(DEM). Extract this once, keep track of which
cell numbers are in which polygon and then use the cell numbers to
index the DEM when you need values by polygon.
Hope this helps,
Dominik
On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan
<mbressan at arpa.veneto.it> wrote:
hi all
I need to perform a zonal statistics by overlapping a DEM
raster (ESRI
grid binary) with some polygons (ESRI polygons);
the problem is that I first of all need to read the raster,
which is
quite big: about 6 Gbyte;
I know about the function readGDAL() by the fantastic package
"rgdal"
which is usually working very well with smaller size files but
this time
I got completely stuck (the pc freezes invariably)
do you have any advice on how to deal with such a problem (big
file)?
thank you
regards
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
I need finally to associate the DEM values with the levels represented by one of these fields: do you think is a good idea eliminating the not necessary fields?
if you mean that you don't need information for all the polygons then yes, you should subset the spatialpolygonsdF. If you're just concerned about multiple attribute fields then I don't think it'll make a difference. I'm not quite sure I'm completely grasping your hint about
values(DEM) <- 1:ncell(DEM) for speeding up the process
DEMcopy=DEM values(DEMcopy) <- 1:ncell(DEMcopy) list_cells=extract(DEMcopy,sppolydF) list_cells will be a list with the cell numbers associated with each polygon in a separate list element. The initial extract() will not take less time, but it should be useful in the future, e.g. you could do poly1=list_cells[[1]] elev1=DEM[poly1] Another option I've had luck with is rasterizing polygons. Using the GDAL commandline utility gdal_rasterize you can convert the shape file of polygons to a classified raster. In R, you can then use zonal() with some aggregation function to get stats of DEM using your classified raster. On Thu, Oct 15, 2015 at 10:31 AM, Massimo Bressan <mbressan at arpa.veneto.it> wrote:
hi, thanks for your prompt reply in fact by using raster() funciontion "reading" DEM is quite fast and also reading the polygons by readOGR() went quite smooth... the problem then come with the extract() function that is taking definitely too long... I'm not quite sure I'm completely grasping your hint about values(DEM) <- 1:ncell(DEM) for speeding up the process please also consider that the SpatialPolygonDataFrame that I readOGR() has 23 fields; I need finally to associate the DEM values with the levels represented by one of these fields: do you think is a good idea eliminating the not necessary fields? max Il giorno Thu, 15/10/2015 alle 09.26 -0600, Dominik Schneider ha scritto:
You can use the raster package, which will leave the values on disk
if they are too big for memory.
Use the function raster() to read the DEM, then use readOGR() to read
the polygon shape file. You can then use extract() to get statistics
on your DEM. I'm guessing it'll take a while to extract from disk, so
if you are planning to extract multiple times, you should consider
copying your DEM and setting the values equal to the cell numbers with
values(DEM) <- 1:ncell(DEM). Extract this once, keep track of which
cell numbers are in which polygon and then use the cell numbers to
index the DEM when you need values by polygon.
Hope this helps,
Dominik
On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan
<mbressan at arpa.veneto.it> wrote:
hi all
I need to perform a zonal statistics by overlapping a DEM
raster (ESRI
grid binary) with some polygons (ESRI polygons);
the problem is that I first of all need to read the raster,
which is
quite big: about 6 Gbyte;
I know about the function readGDAL() by the fantastic package
"rgdal"
which is usually working very well with smaller size files but
this time
I got completely stuck (the pc freezes invariably)
do you have any advice on how to deal with such a problem (big
file)?
thank you
regards
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Fwiw, the worst slow downs I see with extract are when the raster is natively tiled. (Been meaning to write the fix and/or report to Robert). Can you check if your file is tiled? Use gdalinfo command line utility. Extract, at least for points and I assume also for polygons, scans the file line by line but this needs to be done tile by tile to avoid repetitive and wasteful I/O. Cheers, Mike On Friday, October 16, 2015, Dominik Schneider <
Dominik.Schneider at colorado.edu> wrote:
I need finally to associate the DEM values with the levels represented by one of these fields: do you think is a good idea eliminating the not necessary fields?
if you mean that you don't need information for all the polygons then yes, you should subset the spatialpolygonsdF. If you're just concerned about multiple attribute fields then I don't think it'll make a difference. I'm not quite sure I'm completely grasping your hint about
values(DEM) <- 1:ncell(DEM) for speeding up the process
DEMcopy=DEM values(DEMcopy) <- 1:ncell(DEMcopy) list_cells=extract(DEMcopy,sppolydF) list_cells will be a list with the cell numbers associated with each polygon in a separate list element. The initial extract() will not take less time, but it should be useful in the future, e.g. you could do poly1=list_cells[[1]] elev1=DEM[poly1] Another option I've had luck with is rasterizing polygons. Using the GDAL commandline utility gdal_rasterize you can convert the shape file of polygons to a classified raster. In R, you can then use zonal() with
some
aggregation function to get stats of DEM using your classified raster. On Thu, Oct 15, 2015 at 10:31 AM, Massimo Bressan <mbressan at arpa.veneto.it wrote:
hi, thanks for your prompt reply in fact by using raster() funciontion "reading" DEM is quite fast and also reading the polygons by readOGR() went quite smooth... the problem then come with the extract() function that is taking definitely too long... I'm not quite sure I'm completely grasping your hint about values(DEM) <- 1:ncell(DEM) for speeding up the process please also consider that the SpatialPolygonDataFrame that I readOGR()
has
23 fields; I need finally to associate the DEM values with the levels represented by one of these fields: do you think is a good idea eliminating the not necessary fields? max Il giorno Thu, 15/10/2015 alle 09.26 -0600, Dominik Schneider ha scritto:
You can use the raster package, which will leave the values on disk
if they are too big for memory.
Use the function raster() to read the DEM, then use readOGR() to read
the polygon shape file. You can then use extract() to get statistics
on your DEM. I'm guessing it'll take a while to extract from disk, so
if you are planning to extract multiple times, you should consider
copying your DEM and setting the values equal to the cell numbers with
values(DEM) <- 1:ncell(DEM). Extract this once, keep track of which
cell numbers are in which polygon and then use the cell numbers to
index the DEM when you need values by polygon.
Hope this helps,
Dominik
On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan
<mbressan at arpa.veneto.it> wrote:
hi all
I need to perform a zonal statistics by overlapping a DEM
raster (ESRI
grid binary) with some polygons (ESRI polygons);
the problem is that I first of all need to read the raster,
which is
quite big: about 6 Gbyte;
I know about the function readGDAL() by the fantastic package
"rgdal"
which is usually working very well with smaller size files but
this time
I got completely stuck (the pc freezes invariably)
do you have any advice on how to deal with such a problem (big
file)?
thank you
regards
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[[alternative HTML version deleted]]
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Michael Sumner Software and Database Engineer Australian Antarctic Division Hobart, Australia e-mail: mdsumner at gmail.com [[alternative HTML version deleted]]
I do not understand exactly why but I had to pass through a preliminary step by transforming the binary grid into an ascii grid format then read it in R by means of readAsciiGrid(), package "maptools", and finally convert it into a RasterLayer with the function raster() of the package "rgdal" after all that the extract() function was returning sensible results, otherwise not at all (and I do not know why) thank you all for your support best regards max Il giorno Thu, 15/10/2015 alle 13.23 -0600, Dominik Schneider ha scritto:
I need finally to associate the DEM values with the levels
represented by one of these fields:
do you think is a good idea eliminating the not necessary
fields?
if you mean that you don't need information for all the polygons then
yes, you should subset the spatialpolygonsdF. If you're just concerned
about multiple attribute fields then I don't think it'll make a
difference.
I'm not quite sure I'm completely grasping your hint about
values(DEM) <- 1:ncell(DEM)
for speeding up the process
DEMcopy=DEM
values(DEMcopy) <- 1:ncell(DEMcopy)
list_cells=extract(DEMcopy,sppolydF)
list_cells will be a list with the cell numbers associated with each
polygon in a separate list element. The initial extract() will not
take less time, but it should be useful in the future, e.g. you could
do
poly1=list_cells[[1]]
elev1=DEM[poly1]
Another option I've had luck with is rasterizing polygons. Using the
GDAL commandline utility gdal_rasterize you can convert the shape file
of polygons to a classified raster. In R, you can then use zonal()
with some aggregation function to get stats of DEM using your
classified raster.
On Thu, Oct 15, 2015 at 10:31 AM, Massimo Bressan
<mbressan at arpa.veneto.it> wrote:
hi, thanks for your prompt reply
in fact by using raster() funciontion "reading" DEM is quite
fast and
also reading the polygons by readOGR() went quite smooth...
the problem then come with the extract() function that is
taking definitely too
long...
I'm not quite sure I'm completely grasping your hint about
values(DEM) <- 1:ncell(DEM)
for speeding up the process
please also consider that the SpatialPolygonDataFrame that I
readOGR() has 23 fields;
I need finally to associate the DEM values with the levels
represented by one of these fields:
do you think is a good idea eliminating the not necessary
fields?
max
Il giorno Thu, 15/10/2015 alle 09.26 -0600, Dominik Schneider
ha
scritto:
> You can use the raster package, which will leave the values
on disk
> if they are too big for memory.
> Use the function raster() to read the DEM, then use
readOGR() to read
> the polygon shape file. You can then use extract() to get
statistics
> on your DEM. I'm guessing it'll take a while to extract from
disk, so
> if you are planning to extract multiple times, you should
consider
> copying your DEM and setting the values equal to the cell
numbers with
> values(DEM) <- 1:ncell(DEM). Extract this once, keep track
of which
> cell numbers are in which polygon and then use the cell
numbers to
> index the DEM when you need values by polygon.
> Hope this helps,
> Dominik
>
>
>
> On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan
> <mbressan at arpa.veneto.it> wrote:
> hi all
>
>
> I need to perform a zonal statistics by overlapping
a DEM
> raster (ESRI
> grid binary) with some polygons (ESRI polygons);
>
> the problem is that I first of all need to read the
raster,
> which is
> quite big: about 6 Gbyte;
>
> I know about the function readGDAL() by the
fantastic package
> "rgdal"
> which is usually working very well with smaller size
files but
> this time
> I got completely stuck (the pc freezes invariably)
>
> do you have any advice on how to deal with such a
problem (big
> file)?
>
> thank you
>
> regards
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
Michael, is the tiling issue also a problem if its tiled by layer? So for a netcdf it is my understanding that extracting a layer is faster if the tiles are nrow x ncol x 1 instead of something like nrow/2 x ncol/2 x 1 Domink
On Oct 16, 2015 01:42, "Michael Sumner" <mdsumner at gmail.com> wrote:
Fwiw, the worst slow downs I see with extract are when the raster is natively tiled. (Been meaning to write the fix and/or report to Robert). Can you check if your file is tiled? Use gdalinfo command line utility. Extract, at least for points and I assume also for polygons, scans the file line by line but this needs to be done tile by tile to avoid repetitive and wasteful I/O. Cheers, Mike On Friday, October 16, 2015, Dominik Schneider < Dominik.Schneider at colorado.edu> wrote:
I need finally to associate the DEM values with the levels represented
by
one of these fields: do you think is a good idea eliminating the not necessary fields?
if you mean that you don't need information for all the polygons then
yes,
you should subset the spatialpolygonsdF. If you're just concerned about multiple attribute fields then I don't think it'll make a difference. I'm not quite sure I'm completely grasping your hint about
values(DEM) <- 1:ncell(DEM) for speeding up the process
DEMcopy=DEM values(DEMcopy) <- 1:ncell(DEMcopy) list_cells=extract(DEMcopy,sppolydF) list_cells will be a list with the cell numbers associated with each polygon in a separate list element. The initial extract() will not take less time, but it should be useful in the future, e.g. you could do poly1=list_cells[[1]] elev1=DEM[poly1] Another option I've had luck with is rasterizing polygons. Using the GDAL commandline utility gdal_rasterize you can convert the shape file of polygons to a classified raster. In R, you can then use zonal() with
some
aggregation function to get stats of DEM using your classified raster. On Thu, Oct 15, 2015 at 10:31 AM, Massimo Bressan <
mbressan at arpa.veneto.it>
wrote:
hi, thanks for your prompt reply in fact by using raster() funciontion "reading" DEM is quite fast and also reading the polygons by readOGR() went quite smooth... the problem then come with the extract() function that is taking definitely too long... I'm not quite sure I'm completely grasping your hint about values(DEM) <- 1:ncell(DEM) for speeding up the process please also consider that the SpatialPolygonDataFrame that I readOGR()
has
23 fields; I need finally to associate the DEM values with the levels represented
by
one of these fields: do you think is a good idea eliminating the not necessary fields? max Il giorno Thu, 15/10/2015 alle 09.26 -0600, Dominik Schneider ha scritto:
You can use the raster package, which will leave the values on disk
if they are too big for memory.
Use the function raster() to read the DEM, then use readOGR() to read
the polygon shape file. You can then use extract() to get statistics
on your DEM. I'm guessing it'll take a while to extract from disk, so
if you are planning to extract multiple times, you should consider
copying your DEM and setting the values equal to the cell numbers with
values(DEM) <- 1:ncell(DEM). Extract this once, keep track of which
cell numbers are in which polygon and then use the cell numbers to
index the DEM when you need values by polygon.
Hope this helps,
Dominik
On Thu, Oct 15, 2015 at 9:07 AM, Massimo Bressan
<mbressan at arpa.veneto.it> wrote:
hi all
I need to perform a zonal statistics by overlapping a DEM
raster (ESRI
grid binary) with some polygons (ESRI polygons);
the problem is that I first of all need to read the raster,
which is
quite big: about 6 Gbyte;
I know about the function readGDAL() by the fantastic package
"rgdal"
which is usually working very well with smaller size files but
this time
I got completely stuck (the pc freezes invariably)
do you have any advice on how to deal with such a problem (big
file)?
thank you
regards
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[[alternative HTML version deleted]]
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Michael Sumner Software and Database Engineer Australian Antarctic Division Hobart, Australia e-mail: mdsumner at gmail.com