Matching a vector with a matrix row

7 messages · Luis Felipe Parra, Joshua Wiley, Niels Richard Hansen +2 more

#
Hi Felipe,

Since a matrix is just a vector with dimensions, you could easily use
something like this (which, at least on my system, is slightly faster):

results <- which(Matrix %in% LHS)

I'm not sure this is the fastest technique, though.  It will return a
vector of the positions in "Matrix" that match "LHS".  You can easily
convert to row numbers if you want since all columns have the same
number of rows.
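
The conversion to row numbers mentioned above could look like the sketch below. The example data are an assumption made for illustration (they are not from the original post); because R stores matrices column by column, the row of a linear position follows from the position modulo the number of rows.

```r
# Example data, assumed for illustration only
Matrix <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")

pos <- which(Matrix %in% LHS)            # linear (column-major) positions
rows <- (pos - 1) %% nrow(Matrix) + 1    # row number of each position

# Equivalent result using base R's arrayInd()
rows2 <- arrayInd(pos, dim(Matrix))[, 1]
```

Both `rows` and `rows2` hold one row number per matching entry, so a row appears once for every entry in it that matches.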

HTH,

Josh

On Thu, Apr 21, 2011 at 8:56 PM, Luis Felipe Parra
<felipe.parra at quantil.com.co> wrote:

  
    
#
Joshua and Luis

Neither of you is exactly solving the problem as stated; see
below. Luis, could you clarify whether you want rows that are _equal_
to a vector or rows with entries _contained_ in a vector?

If

m <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")

then LHS equals the first row only, while

apply(m, 1, function(x) all(x %in% LHS))
[1]  TRUE  TRUE FALSE

finds the rows with entries contained in LHS and

which(m %in% LHS)
[1] 1 2 4 5 6

finds all entries in m that equal an entry in LHS. While
you can turn the latter into the former, this will have some
computational costs too. The R-code

apply(m, 1, function(x) all(x == LHS))
[1]  TRUE FALSE FALSE

finds the rows that are equal to LHS.
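
As a sketch of how the element-wise result can be turned into the per-row result, and of a row-equality test that avoids apply(), one could try the following. It reuses m and LHS from above; the names hit, contained and equal are introduced here for illustration only, and the last step assumes length(LHS) equals ncol(m).

```r
m <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")

# m %in% LHS drops the dimensions, so restore them before summing by row
hit <- matrix(m %in% LHS, nrow = nrow(m))
contained <- rowSums(hit) == ncol(m)    # same rows as all(x %in% LHS)

# t(m) has one column per row of m, so LHS is recycled down each column
# and compared position by position
equal <- colSums(t(m) == LHS) == ncol(m)
```

Whether this is actually faster than apply() will depend on the size of the matrix; it has not been timed here.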

- Niels
On 22/04/11 00.18, Joshua Wiley wrote:

  
    
#
Here is one solution:

rowmatch <- function(A, B) {
    # For each row of B, the index of the first identical row of A
    # (NA if there is none)
    # Collapse each row into a single string such as "1:2:3"
    f <- function(...) paste(..., sep = ":")
    if (!is.matrix(B)) B <- matrix(B, 1, length(B))
    a <- do.call("f", as.data.frame(A))
    b <- do.call("f", as.data.frame(B))
    match(b, a)
}

A <- matrix(1:1000, 100, 10, byrow=TRUE)
B <- matrix(21:40, 2, 10, byrow=TRUE)
rowmatch(A, B)

b <- 51:60
rowmatch(A, b)

Ravi.
1 day later
#
On Sat, Apr 23, 2011 at 08:56:33AM +0800, Luis Felipe Parra wrote:

This sounds like you want an equivalent of

  all(LHS %in% x)

However, in your original post, you used

  all(x %in% LHS)

Which is correct?

If the equality of x and LHS as sets (ignoring order and repetitions) should be tested, then try

   setequal(x, LHS)

If the rows may contain repeated elements and the number of
repetitions should also match, then try

  identical(sort(x), sort(LHS))

with a precomputed sort(LHS) for efficiency.
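
A minimal sketch of both tests applied row by row, with sort(LHS) computed once outside the loop. The data m and LHS are borrowed from the example earlier in the thread; the names sortedLHS, sameSet and sameMultiset are assumptions made for this illustration.

```r
m <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")

# Set equality: order and repetitions are ignored
sameSet <- apply(m, 1, function(x) setequal(x, LHS))

# Equality including the number of repetitions: sort LHS only once
sortedLHS <- sort(LHS)
sameMultiset <- apply(m, 1, function(x) identical(sort(x), sortedLHS))
```

For this small example the two tests agree; they differ only when a row or LHS contains repeated values.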

If the number of distinct character values in the whole
matrix is not too large, then the comparison may be made
more efficient by converting the matrix to a matrix of
integer codes instead of the original character values.
See ?factor for the meaning of "integer codes".
After this conversion, the comparison can be done on
integers instead of character values, which is faster.
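
One way the conversion to integer codes might be done is sketched below, again on the small m and LHS from earlier in the thread. The names lev, mi, li and containedInt are hypothetical, and the speed gain is not measured here.

```r
m <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")

# Common set of levels, so that m and LHS share the same codes
lev <- sort(unique(c(m, LHS)))

# factor() drops the dimensions, so rebuild the matrix from the codes
mi <- matrix(as.integer(factor(m, levels = lev)), nrow = nrow(m))
li <- as.integer(factor(LHS, levels = lev))

# Same test as all(x %in% LHS), but on integers
containedInt <- apply(mi, 1, function(x) all(x %in% li))
```

The conversion itself costs time, so it is likely to pay off only when the same matrix is compared against many vectors.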

Hope this helps.

Petr Savicky.
#
I gave a solution previously with integer elements.  It also works well for real numbers.

rowMatch <- function(A, B) {
    # Rows in A that match the rows in B
    # The row indexes correspond to A
    f <- function(...) paste(..., sep = ":")
    if (!is.matrix(B)) B <- matrix(B, 1, length(B))
    a <- do.call("f", as.data.frame(A))
    b <- do.call("f", as.data.frame(B))
    match(b, a)
}

A <- matrix(rnorm(100000), 5000, 20)
sel <- sample(1:nrow(A), size=100, replace=TRUE)
B <- A[sel,]

system.time(rows <- rowMatch(A, B))
all.equal(sel, rows)

sel <- sample(1:nrow(A), size=1)
b <- c(A[sel,])
system.time(row <- rowMatch(A, b))
all.equal(sel, row)

I am curious to see if there are better/faster ways to do this.
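
One candidate for the single-vector case, offered only as an untimed sketch, compares the numbers directly instead of pasting them into strings. The seed and the name hit are assumptions made for this illustration, and it assumes length(b) equals ncol(A).

```r
set.seed(1)  # assumed seed, only to make the sketch reproducible
A <- matrix(rnorm(100000), 5000, 20)
sel <- sample(1:nrow(A), size = 1)
b <- A[sel, ]

# t(A) has one column per row of A; b is recycled down each column,
# so a column with no mismatch marks a row of A equal to b
hit <- which(colSums(t(A) != b) == 0)
```

Unlike match(), this returns every matching row rather than only the first, and it would need to be looped or extended to handle a matrix B.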

Ravi.