Best practice for reading large shapefiles?
Would loading the shapefile into PostgreSQL first and then using readOGR to read from Postgres be a recommended approach? That is, would the same bottleneck still occur? Thank you. -- Vinh
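To make the question concrete, this is roughly what I have in mind, sketched in R. The database name `gis` and table name `plss` are placeholders; it assumes a PostGIS-enabled PostgreSQL instance and a GDAL build that includes the PostgreSQL driver:

```r
## Load the shapefile into PostGIS (a shell step, run here via
## system() for completeness); -I builds a spatial index on the
## geometry column.
system("shp2pgsql -I plss_national.shp plss | psql -d gis")

library(rgdal)

## Read the table back through OGR's PostgreSQL driver, using the
## "PG:" DSN syntax from the GDAL documentation.
plss <- readOGR(dsn = "PG:dbname=gis", layer = "plss")
```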
On Tue, Apr 26, 2016 at 11:18 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
Hi, I have a very large shapefile that I would like to read into R (dbf = 5.6 GB, shp = 2.3 GB). For reference, I downloaded the 30 shapefiles of the [Public Land Survey System](http://www.geocommunicator.gov/GeoComm/lsis_home/home/) and combined them into a single national file via GDAL's ogr2ogr as described [here](http://www.northrivergeographic.com/ogr2ogr-merge-shapefiles). I originally attempted to combine the files in R as described [here](https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html), but ran out of memory about 80% of the way through before I discovered ogr2ogr.

I'm now reading the combined file into R via readOGR, and after more than an hour R appears to hang: the task manager shows the R session consuming less than 10% CPU and 245 MB of memory. I'm not sure whether any productive activity is going on, so I'm just waiting it out.

[This](http://r-sig-geo.2731867.n2.nabble.com/Long-time-to-load-shapefiles-td7584869.html) thread notes that readOGR can be slow for large shapefiles and suggests saving the resulting SpatialDataFrame in an R format. My problem is getting the entire shapefile read in the first place, before I can save it as an R object. Does anyone have suggestions for reading this large shapefile into R? Thank you for your help. -- Vinh
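For reference, this is essentially what I'm running (the layer name is a placeholder), along with the caching step I intend once the read completes. ogrInfo is a cheap sanity check before committing to a long read:

```r
library(rgdal)

## Sanity check: ogrInfo reports the feature count and field summary
## without reading all the geometries, confirming the merged file is
## intact before starting a multi-hour read.
ogrInfo(dsn = ".", layer = "plss_national")

## The slow step: read the full shapefile into a Spatial*DataFrame.
plss <- readOGR(dsn = ".", layer = "plss_national")

## Once it completes, cache the object in R's native serialization
## so later sessions can skip readOGR entirely.
saveRDS(plss, file = "plss_national.rds")

## Subsequent sessions reload in a fraction of the time:
plss <- readRDS("plss_national.rds")
```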