This is not an R solution, and I am not even sure it will speed up your
process, but in the past I have used starspan for this kind of work. It
worked fairly well for me on large datasets, though it was a one-off process
on which I didn't mind spending a couple of hours. I also did a naive
parallelization by breaking the polygon file into multiple parts and then
assembling the results back together; but if you really need the accurate
proportion of a cell's area that falls across two polygons, this strategy won't work.
http://starspan.projects.atlas.ca.gov/doku/doku.php
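The split-and-reassemble idea can be sketched in R with the parallel
package. Here extract_chunk is a placeholder for whatever per-chunk
extraction you run (e.g. a polygonValues() call on a subset of the
polygons), so take this as a rough sketch rather than a tested recipe:

```r
library(parallel)

# Naive parallelization: split the polygon indices into one chunk per
# core, run the extraction on each chunk in a separate worker process,
# then bind the per-chunk results back together.
parallel_extract <- function(n_polys, extract_chunk, n_cores = 8) {
  chunks <- split(seq_len(n_polys),
                  cut(seq_len(n_polys), n_cores, labels = FALSE))
  pieces <- mclapply(chunks, extract_chunk, mc.cores = n_cores)
  do.call(rbind, pieces)
}

# Toy run: each "extraction" just returns its indices as a data.frame.
res <- parallel_extract(100, function(idx) data.frame(id = idx), n_cores = 4)
nrow(res)  # 100
```

Note that mclapply() works by forking, so it runs on Unix-alikes but not
on Windows; there, parLapply() with a PSOCK cluster is the usual
substitute. And as said above, this only helps if the chunks really are
independent.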
Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
nikhil.list at gmail.com
On Jun 30, 2010, at 10:12 AM, Agustin Lobo wrote:
eugrd025EFDC <- readOGR(dsn="eugrd025EFDC",layer="eugrd025EFDC")
v <- polygonValues(p=eugrd025EFDC, Br, weights=TRUE)
where
str(eugrd025EFDC, max.level=2)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 18000 obs. of 5 variables:
  ..@ polygons   :List of 18000
  .. .. [list output truncated]
  ..@ plotOrder  : int [1:18000] 17901 17900 17902 17903 17899 17898
17904 17897 17905 17906 ...
  ..@ bbox       : num [1:2, 1:2] 2484331 1314148 6575852 4328780
  .. ..- attr(*, "dimnames")=List of 2
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slots
Cells:  13967442
NAs  :  0
Min.       0.00
1st Qu.    0.00
Median     0.00
Mean      48.82
3rd Qu.    0.00
Max.    4999.00
so quite large objects.
The problem is that polygonValues() has been running (without
completing the task) for more than 2 h on an Intel Core i7 machine
with 16 GB RAM (a Dell Precision M6500), so a pretty powerful machine.
Is there any way I could speed up this process?
Also, is there anything I could do to take better advantage of the
8 processing threads? Currently I see only 1 CPU working on R
processes while the rest remain pretty inactive.
Thanks
Agus