Hi all,
Say I have a matrix A with dimension m x 2 and matrix B with
dimension n x 2. I would like to find the row in A that is closest to
each row in B. Here's an example (using a loop):
set.seed(1)
A <- matrix(runif(12), 6, 2) # 6 x 2
B <- matrix(runif(6), 3, 2) # 3 x 2
m <- vector("numeric", nrow(B))
for(j in 1:nrow(B)) {
  d <- (A[, 1] - B[j, 1])^2 + (A[, 2] - B[j, 2])^2
  m[j] <- which.min(d)
}
All I need is m[]. I would like to accomplish this without using the
loop if possible, since for my real data n > 140K and m > 1K. I hope
this makes sense.
Thanks,
Sundar
distance between two matrices
5 messages · Brian Ripley, Roger Bivand, Sean Davis +1 more
Sounds like knn classification. See function knn1 in package class.
knn(A, B, 1:nrow(A))
gives the same answers as your loop code, and is just a carefully tuned C equivalent. There are faster ways to do this by preprocessing set A, discussed e.g. in my PRNN book, but your numbers took only 11s on my PC.
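For readers who want to stay in base R, the loop from the question can also be turned around so that it runs over the rows of A (about 1K in the real data) instead of the rows of B (about 140K), with each pass vectorised over all of B. This is only a minimal sketch, not something proposed in the thread: the helper name nearest_row is hypothetical, and it assumes two-column numeric matrices as in the question.

```r
# Hypothetical helper (not from the thread): for each row of B, return the
# index of the closest row of A, using squared Euclidean distance.
# Assumes A and B are numeric matrices with two columns, as in the question.
nearest_row <- function(A, B) {
  best_d <- rep(Inf, nrow(B))    # smallest squared distance seen so far
  best_i <- integer(nrow(B))     # row of A that achieved it
  for (i in seq_len(nrow(A))) {
    # squared distance from row i of A to every row of B at once
    d <- (B[, 1] - A[i, 1])^2 + (B[, 2] - A[i, 2])^2
    closer <- d < best_d         # strict '<' keeps the first minimum, like which.min
    best_d[closer] <- d[closer]
    best_i[closer] <- i
  }
  best_i
}

# Usage with the example data from the question
set.seed(1)
A <- matrix(runif(12), 6, 2) # 6 x 2
B <- matrix(runif(6), 3, 2)  # 3 x 2
m_fast <- nearest_row(A, B)
```

The arithmetic per pair of rows is the same as in the original loop, so the indices should agree with m; only the direction of the iteration changes.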
______________________________________________ R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
I think you need a quadtree of the larger set of points, then do lookups for buckets of the smaller one. There is a good deal of information on http://www.cs.umd.edu/~brabec/quadtree/
This isn't an answer within R: the functionality in the gstat contributed package doesn't seem to be at the user level, but it does point to the same site at UMD.
Roger
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway, voice: +47-55959355, fax: +47-55959393; Roger.Bivand at nhh.no
Sundar, I'm not sure how much faster (or slower) this might be (perhaps Professor Ripley will help on this one), but one can "trick" a function from the class package called knn1 into doing this for you, I think. From your example:
> set.seed(1)
> A <- matrix(runif(12), 6, 2) # 6 x 2
> B <- matrix(runif(6), 3, 2) # 3 x 2
> m <- vector("numeric", nrow(B))
> for(j in 1:nrow(B)) {
+   d <- (A[, 1] - B[j, 1])^2 + (A[, 2] - B[j, 2])^2
+   m[j] <- which.min(d)
+ }
> m
[1] 3 2 3
Now using knn1:
> knn1(A,B,seq(1,nrow(A),1))
[1] 3 2 3
Levels: 1 2 3 4 5 6
Sean
----- Original Message -----
From: "Sundar Dorai-Raj" <sundar.dorai-raj at pdf.com>
To: "R-help" <r-help at stat.math.ethz.ch>
Sent: Tuesday, January 27, 2004 3:00 PM
Subject: [R] distance between two matrices
Thanks to Prof. Ripley, Jens, Roger, and Sean. knn1 is exactly what I'm looking for. Thanks again, Sundar
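One detail worth noting for anyone reusing this: knn1 returns a factor (see the "Levels:" line in Sean's output), whereas the loop in the question fills m with numeric indices. The following is a small sketch of one way to recover the indices, assuming the class package is installed and that the row numbers of A are passed as the class labels, as in the replies above.

```r
library(class)  # provides knn1()

set.seed(1)
A <- matrix(runif(12), 6, 2) # 6 x 2
B <- matrix(runif(6), 3, 2)  # 3 x 2

# The class labels are the row numbers of A, so the predicted "class"
# for each row of B is the index of its nearest row in A.
f <- knn1(A, B, seq_len(nrow(A)))

# Convert the factor labels (not the internal level codes) to integers.
m <- as.integer(as.character(f))
```

Going through as.character avoids relying on the level codes happening to coincide with the labels.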