Matching a vector with a matrix row
7 messages · Luis Felipe Parra, Joshua Wiley, Niels Richard Hansen +2 more
Hi Felipe,

Since a matrix is just a vector with dimensions, you could easily use something like this (which, at least on my system, is slightly faster):

results <- which(Matrix %in% LHS)

I'm not sure this is the fastest technique, though. It returns a vector of the positions in "Matrix" that match "LHS". You can easily convert these to row numbers if you want, since all columns have the same number of rows.

HTH,

Josh

On Thu, Apr 21, 2011 at 8:56 PM, Luis Felipe Parra <felipe.parra at quantil.com.co> wrote:
> Hello, I am trying to compare a vector with a matrix's rows. The vector has the same length as the number of columns of the matrix, and I would like to find the row numbers where the matrix's row is the same as the given vector. What I am doing at the moment is using apply as follows:
>
> apply(Matrix,1,function(x)all(x%in%LHS))
>
> but this isn't too fast actually. I would like to know if anybody knows an efficient (fast) way of doing this? The matrix contains strings (not numbers).
>
> Thank you
> Felipe Parra
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
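The conversion from positions to row numbers mentioned above can be sketched as follows. This is a minimal illustration on toy data, not code from the thread: the matrix m, the vector LHS, and the names idx, rows, rows2 and full_rows are assumptions made for the example. Because R stores matrices column by column, position i falls in row (i - 1) %% nrow(m) + 1.

```r
# Toy data, chosen only for illustration
m   <- matrix(c("A", "B", "C", "B", "A", "A"), nrow = 3, ncol = 2)
LHS <- c("A", "B")

# Linear (column-major) positions of the entries of m that occur in LHS
idx <- which(m %in% LHS)

# Row number of each position, by modular arithmetic ...
rows <- (idx - 1L) %% nrow(m) + 1L
# ... or, equivalently, with base R's arrayInd()
rows2 <- arrayInd(idx, dim(m))[, 1]

# Rows in which every entry was matched: the row number appears ncol(m) times
full_rows <- which(tabulate(rows, nbins = nrow(m)) == ncol(m))
```

Here full_rows lists the rows whose entries are all found in LHS, which is the same set of rows that the apply() call in the original question flags as TRUE.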
Joshua and Luis
Neither of you is exactly solving the problem as stated, see
below. Luis, could you clarify if you want rows that are _equal_
to a vector or rows with entries _contained_ in a vector?
If
m <- matrix(c("A", "B", "C", "B", "A", "A"), 3, 2)
LHS <- c("A", "B")
then LHS equals the first row only, while
apply(m, 1, function(x) all(x %in% LHS))
[1] TRUE TRUE FALSE
finds the rows with entries contained in LHS and
which(m %in% LHS)
[1] 1 2 4 5 6
finds all entries in m that equal an entry in LHS. While
you can turn the latter into the former, this will have some
computational costs too. The R-code
apply(m, 1, function(x) all(x == LHS))
[1] TRUE FALSE FALSE
finds the rows that are equal to LHS.
- Niels
Niels Richard Hansen
Associate Professor
Department of Mathematical Sciences
University of Copenhagen
Universitetsparken 5
2100 Copenhagen
Denmark
Web: www.math.ku.dk/~richard
Email: Niels.R.Hansen at math.ku.dk, nielsrichardhansen at gmail.com
Skype: nielsrichardhansen.dk
Phone: +1 510 502 8161
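For the exact-equality test, the per-row apply() call can be replaced by a single element-wise comparison. A minimal sketch, assuming LHS has as many elements as m has columns and that neither contains NA; the names target and equal_rows are illustrative only:

```r
m   <- matrix(c("A", "B", "C", "B", "A", "A"), nrow = 3, ncol = 2)
LHS <- c("A", "B")

# Lay LHS out as a matrix of the same shape as m, one copy per row,
# so that each column of m is compared with the matching element of LHS.
target <- matrix(LHS, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)

# A row equals LHS when all ncol(m) element-wise comparisons are TRUE.
equal_rows <- which(rowSums(m == target) == ncol(m))
```

Whether this is actually faster than apply() depends on the size of the matrix, so it is worth timing both on the real data.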
Here is one solution:
rowmatch <- function(A,B) {
    # Rows in A that match the rows in B
    f <- function(...) paste(..., sep=":")
    if(!is.matrix(B)) B <- matrix(B, 1, length(B))
    a <- do.call("f", as.data.frame(A))
    b <- do.call("f", as.data.frame(B))
    match(b, a)
}
A <- matrix(1:1000, 100, 10, byrow=TRUE)
B <- matrix(21:40, 2, 10, byrow=TRUE)
rowmatch(A, B )
b <- 51:60
rowmatch(A, b)
Ravi.
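The same key-building idea can be applied to character data like that in the original question. The sketch below is an illustration, not part of Ravi's post: row_keys is a hypothetical helper, and the carriage-return separator is an assumption, chosen because it is unlikely to occur in the data. A separator that does occur in the data can make two different rows produce the same key, as the last line shows.

```r
# Hypothetical helper: one string key per row of X.
# 'sep' is assumed not to occur anywhere in the data.
row_keys <- function(X, sep = "\r") {
  if (!is.matrix(X)) X <- matrix(X, nrow = 1)
  do.call(paste, c(as.data.frame(X, stringsAsFactors = FALSE), sep = sep))
}

m   <- matrix(c("A", "B", "C", "B", "A", "A"), nrow = 3, ncol = 2)
LHS <- c("A", "B")

first_hit <- match(row_keys(LHS), row_keys(m))     # first row of m equal to LHS
all_hits  <- which(row_keys(m) %in% row_keys(LHS)) # every row of m equal to LHS

# Why the separator matters: with sep = ":" these two different rows collide.
collide <- paste("a:b", "c", sep = ":") == paste("a", "b:c", sep = ":")
```

Note that match() returns only the first matching row; the %in% form is the one to use when LHS may occur in several rows.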
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Luis Felipe Parra [felipe.parra at quantil.com.co]
Sent: Friday, April 22, 2011 8:56 PM
To: Niels Richard Hansen
Cc: r-help
Subject: Re: [R] Matching a vector with a matrix row
Hello Niels, I am trying to find the rows in Matrix which contain all of the
elements in LHS.
Thank you
Felipe Parra
1 day later
On Sat, Apr 23, 2011 at 08:56:33AM +0800, Luis Felipe Parra wrote:
> Hello Niels, I am trying to find the rows in Matrix which contain all of the elements in LHS.

This sounds like you want an equivalent of

all(LHS %in% x)

However, in your original post, you used

all(x %in% LHS)

Which is correct? If the equality of x and LHS as sets should be tested, then try

setequal(x, LHS)

If the rows may contain repeated elements and the number of repetitions should also match, then try

identical(sort(x), sort(LHS))

with a precomputed sort(LHS) for efficiency.

If the number of different character values in the whole matrix is not too large, then the efficiency of the comparison may be improved by converting the matrix to a matrix of integer codes instead of the original character values. See ?factor for the meaning of "integer codes". After this conversion, the comparison can be done on integers instead of character values, which is faster.

Hope this helps.

Petr Savicky.
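The four tests above answer different questions, which a small example makes visible. The sketch below is illustrative only: the matrix m, the vector LHS and all result names are assumptions, chosen so that each test gives a different answer. The last part shows one possible way to do the integer-code conversion with factor().

```r
# Toy data: four rows, three columns
m <- matrix(c("A", "B", "B",
              "A", "A", "B",
              "A", "B", "C",
              "A", "A", "A"), nrow = 4, byrow = TRUE)
LHS <- c("A", "B", "B")
sorted_LHS <- sort(LHS)   # computed once, outside the loop

# One logical value per row for each interpretation of "matches LHS"
contains_all  <- apply(m, 1, function(x) all(LHS %in% x))   # row contains every element of LHS
contained_in  <- apply(m, 1, function(x) all(x %in% LHS))   # every row entry occurs in LHS
same_set      <- apply(m, 1, function(x) setequal(x, LHS))  # same distinct values
same_multiset <- apply(m, 1, function(x) identical(sort(x), sorted_LHS))  # same values and counts

# Integer codes: map every distinct string to an integer, then compare integers
lev     <- unique(c(m, LHS))
m_int   <- matrix(as.integer(factor(m, levels = lev)), nrow = nrow(m))
LHS_int <- sort(as.integer(factor(LHS, levels = lev)))
same_multiset_int <- apply(m_int, 1, function(x) identical(sort(x), LHS_int))
```

On this data only the first row passes the multiset test, and the integer-coded version gives the same answer as the character version.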
I gave a solution previously with integer elements. It also works well for real numbers.
rowMatch <- function(A,B) {
    # Rows in A that match the rows in B
    # The row indexes correspond to A
    f <- function(...) paste(..., sep=":")
    if(!is.matrix(B)) B <- matrix(B, 1, length(B))
    a <- do.call("f", as.data.frame(A))
    b <- do.call("f", as.data.frame(B))
    match(b, a)
}
A <- matrix(rnorm(100000), 5000, 20)
sel <- sample(1:nrow(A), size=100, replace=TRUE)
B <- A[sel,]
system.time(rows <- rowMatch(A, B ))
all.equal(sel, rows)
sel <- sample(1:nrow(A), size=1)
b <- c(A[sel,])
system.time(row <- rowMatch(A, b))
all.equal(sel, row)
I am curious to see if there are better/faster ways to do this.
Ravi.
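One caveat when string keys are built from real numbers: paste() converts doubles with as.character(), which keeps at most 15 significant digits, so two values that differ only beyond that precision produce the same key. In the timing example above the rows of B are copied directly from A, so this does not arise there. The sketch below uses made-up values (A, b, key_match and exact_match are illustrative names) to show the difference between a key-based match and an exact numeric comparison.

```r
x <- 0.1 + 0.2
y <- 0.3
numerically_equal <- x == y         # FALSE: the two doubles differ in their last bits
same_key <- paste(x) == paste(y)    # TRUE: both are formatted as "0.3"

# A tiny matrix whose first row is almost, but not exactly, equal to b
A <- rbind(c(0.1 + 0.2, 1),
           c(0.5,       2))
b <- c(0.3, 1)

# Key-based match: reports row 1
key_match <- match(paste(b, collapse = ":"),
                   apply(A, 1, paste, collapse = ":"))

# Exact element-wise comparison: reports no row
exact_match <- which(rowSums(A == matrix(b, nrow(A), ncol(A), byrow = TRUE)) == ncol(A))
```

Which behaviour is wanted depends on the application; for values that come from floating-point arithmetic, neither exact equality nor string keys may be the right notion of "same row".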
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Petr Savicky [savicky at praha1.ff.cuni.cz]
Sent: Sunday, April 24, 2011 5:13 AM
To: r-help at r-project.org
Subject: Re: [R] Matching a vector with a matrix row