point.in.polygon() on massive datasets

Dear all,
I have a dataset of about 50 million lat/lon coordinates each of which falls into one of 550 polygons.
I need to assign their memberships and have used point.in.polygon() for that purpose.
However, simply looping over the 50 million points takes a very long time; processing just 1 million points took about 3-4 days on a fast Linux server with lots of memory.
Am I overlooking obvious ways of making this massive computation more efficient? Would R-trees help?
Should I try to compile the C code for point.in.polygon() (available from gstat) and run it outside R as a standalone executable?
I am already using apply() to mitigate the inefficiency of the for loop in R.
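To make the R-tree question concrete, here is a minimal sketch (in Python rather than R, and not the gstat code) of the simplest form of that idea: precompute each polygon's bounding box once, and run the expensive ray-casting point-in-polygon test only on the few polygons whose box contains the point. With 550 polygons, most points should be rejected by the cheap box check. All function names here are illustrative, not from any package.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices."""
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        # Toggle on each edge that the horizontal ray from (x, y) crosses.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def bbox(poly):
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)

def assign(points, polygons):
    """Return, for each point, the index of the containing polygon (-1 if none)."""
    boxes = [bbox(p) for p in polygons]  # computed once, not per point
    out = []
    for x, y in points:
        hit = -1
        for k, (x0, y0, x1, y1) in enumerate(boxes):
            # Cheap bounding-box rejection before the full test.
            if x0 <= x <= x1 and y0 <= y <= y1 and point_in_polygon(x, y, polygons[k]):
                hit = k
                break
        out.append(hit)
    return out
```

A real R-tree would replace the linear scan over `boxes` with a hierarchical index, but even this flat prefilter avoids 550 full polygon tests per point.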

Any help would be greatly appreciated,

Thanks,

Markus