Finding Euclidean-proximate points in two datasets
On Thu, 20 May 2010, Alexander Shenkin wrote:
Hello All, I posted this over on R-help, but was then directed here. I've been poring over the various spatial packages, but haven't come across the right thing yet. Given a set of points X in 2-d space, I'm trying to find the subset of points in Y proximate to each point in X. Furthermore, the proximity threshold differs for each point in X (X$threshold). I've constructed this myself already, but it's horrifically slow with 40k+ points in one set and 700 in the other.
Could you get any further with nn2() in the RANN package? I realise that it isn't handling distances directly, but for some k=, it might help. Roger
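One way the nn2() suggestion might be tried, as a rough sketch only: run a fixed-radius search with the largest threshold, then apply each X point's own threshold to the distances that come back. It assumes the RANN package is installed, reuses the toy X and Y from the original post, and sets k = nrow(Y) purely for illustration (on the real data a much smaller k would be needed, at the risk of truncating the neighbour list). The name proximate_nn is made up here.

```r
# Toy data from the original post
X <- data.frame(x = c(1, 2, 3), y = c(2, 3, 1), threshold = c(1, 2, 4))
Y <- data.frame(x = c(5, 2, 3, 4, 2, 5, 2, 3), y = c(5, 2, 2, 4, 1, 2, 3, 1))

# Assumption: RANN is installed; the sketch is skipped otherwise
if (requireNamespace("RANN", quietly = TRUE)) {
  nn <- RANN::nn2(data = as.matrix(Y[, c("x", "y")]),
                  query = as.matrix(X[, c("x", "y")]),
                  k = nrow(Y),                        # illustrative; smaller in practice
                  searchtype = "radius",
                  radius = max(X$threshold) * 1.001)  # slight slack; exact filter below

  # Unfilled neighbour slots have index 0; keep real neighbours within
  # each X point's own threshold
  proximate_nn <- lapply(seq_len(nrow(X)), function(i) {
    keep <- nn$nn.idx[i, ] > 0 & nn$nn.dists[i, ] <= X$threshold[i]
    sort(nn$nn.idx[i, keep])
  })
  print(proximate_nn)
}
```

Each element of proximate_nn holds the row numbers of Y within the threshold of the corresponding row of X.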
A very inefficient example of what I'm looking for:
X = data.frame(x=c(1,2,3), y=c(2,3,1), threshold=c(1,2,4))
Y = data.frame(x=c(5,2,3,4,2,5,2,3), y=c(5,2,2,4,1,2,3,1))
proximate=list()
i=1
for (pt in 1:length(X$x)) {
  # TRUE where a point in Y lies within the threshold of this point in X
  proximate[[i]] <- sqrt((X[pt,]$x - Y$x)^2 + (X[pt,]$y - Y$y)^2) <= X[pt,]$threshold
  i=i+1
}
proximate
Perhaps crossdist() in spatstat is what I should use, and then code a comparison with X$threshold after the cross-distances are computed. However, I was wondering if there was another tool I should be considering. David Winsemius suggested that I first compare on each coordinate, culling points that lie outside the threshold on either axis before computing distances, which will help. Any and all thoughts are very welcome.
Thanks in advance,
Allie
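For what it's worth, the two ideas in the message above (compute all cross-distances and then compare with X$threshold; cull on each axis first) can be sketched in base R without any spatial package. This is only an illustration on the toy data: the names d2, within, proximate and proximate_culled are made up here, and the full cross-distance matrix may be too large to hold comfortably for 700 x 40k points, in which case the culled variant avoids building it.

```r
# Toy data from the original post
X <- data.frame(x = c(1, 2, 3), y = c(2, 3, 1), threshold = c(1, 2, 4))
Y <- data.frame(x = c(5, 2, 3, 4, 2, 5, 2, 3), y = c(5, 2, 2, 4, 1, 2, 3, 1))

# Variant 1: all squared cross-distances at once (rows = X, columns = Y)
d2 <- outer(X$x, Y$x, "-")^2 + outer(X$y, Y$y, "-")^2

# Squared thresholds are recycled down the rows, so row i is compared
# with X$threshold[i]; comparing squares avoids the sqrt()
within <- d2 <= X$threshold^2
proximate <- lapply(seq_len(nrow(X)), function(i) which(within[i, ]))

# Variant 2: per-axis cull first, then distances for the survivors only
proximate_culled <- lapply(seq_len(nrow(X)), function(i) {
  t <- X$threshold[i]
  # Candidates inside the bounding square around point i of X
  cand <- which(abs(Y$x - X$x[i]) <= t & abs(Y$y - X$y[i]) <= t)
  # Exact test on the candidates
  cand[(Y$x[cand] - X$x[i])^2 + (Y$y[cand] - X$y[i])^2 <= t^2]
})

print(proximate)
```

Both variants return, for each row of X, the row numbers of Y within that row's threshold.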
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no