Dear all, this may be a simple problem, but I have found no solution for it. I have a matrix Y with 23 000 rows and 220 columns. The entries are "A", "B" or "C". I want to extract all rows (as a matrix) of the matrix Y where all entries of a row are (for example) "A". Is there any solution? I tried the stringr package but it didn't work out. All the best and thanks! Peter
Deleting rows with special character
7 messages · Peter Kupfer, Sarah Goslee, John Kane +2 more
Hi Peter,
On Fri, Nov 16, 2012 at 9:04 AM, Peter Kupfer <peter.kupfer at me.com> wrote:
Dear all, maybe a simple problem but I found no solution for my problem. I have a matrix Y with 23 000 rows and 220 columns. The entries are "A", "B" or "C".
A reproducible example with sample data is helpful.
I want to extract all rows (as a matrix) of the matrix Y where all entries of a row are (for example) "A".
Really? Why not just make a new matrix with the right number of "A" values?
Is there any solution? I tried the stringr package but it doesn't work out.
Of course there is. Here's one option. But I'm not sure you've really stated your actual problem. This extracts the rows where all values are "A", and might at least get you started toward your real problem.

testdata <- matrix(c(
"A", "B", "C",
"B", "B", "B",
"C", "A", "A",
"A", "A", "A"), ncol=3, byrow=TRUE)
testdata.A <- testdata[apply(testdata, 1, function(x) all(x == "A")), , drop=FALSE]

--
Sarah Goslee
http://www.functionaldiversity.org
Hey Sarah, first: thanks for the fast reply! I checked the apply function and found my error. You are right, I forgot to send sample data; I realized it only after sending the mail. Sorry about this! Once again: thanks for the fast reply and your help. Best, Peter
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

John Kane
Kingston ON Canada
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
thanks John Kane Kingston ON Canada
Hi,
I'm not sure what your dataset looks like. If it is like this:

set.seed(18)
mat1 <- matrix(sample(LETTERS[1:3], 54, replace=TRUE), ncol=3)
mat1[apply(mat1, 1, function(x) all(x=="A")), ]
#[1] "A" "A" "A"
which(apply(mat1, 1, function(x) all(x=="A")))
#[1] 16

A.K.
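A side note on the two subsetting styles in this thread: when only one row matches, indexing without drop = FALSE returns a plain character vector rather than a one-row matrix, which matters if the result has to stay a matrix as requested. The sketch below illustrates this on a small made-up matrix; the names m, all.A, as.vec and as.mat are hypothetical and not taken from any of the posts.

```r
# Hypothetical 3 x 3 matrix with exactly one all-"A" row (illustration only).
m <- matrix(c("A", "A", "A",
              "B", "A", "C",
              "C", "C", "A"), ncol = 3, byrow = TRUE)

# Logical vector with one entry per row: TRUE where every entry is "A".
all.A <- apply(m, 1, function(x) all(x == "A"))

as.vec <- m[all.A, ]                # a single match collapses to a character vector
as.mat <- m[all.A, , drop = FALSE]  # drop = FALSE keeps a 1 x 3 matrix

is.matrix(as.vec)  # FALSE
is.matrix(as.mat)  # TRUE
dim(as.mat)        # 1 3
```

With two or more matching rows both forms return a matrix, so the difference only shows up in the single-match (or, for dim(), zero-match) case.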
On Nov 16, 2012, at 8:26 AM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
Of course there is. Here's one option. But I'm not sure you've really stated your actual problem. This extracts the rows where all values are "A", and might at least get you started toward your real problem.

testdata <- matrix(c(
"A", "B", "C",
"B", "B", "B",
"C", "A", "A",
"A", "A", "A"), ncol=3, byrow=TRUE)
testdata.A <- testdata[apply(testdata, 1, function(x) all(x == "A")), , drop=FALSE]
Using something like rowSums() might be faster in this case, based upon brief testing.
Since a logical comparison returns TRUE/FALSE, which are treated as 1/0 in arithmetic, you can subset the matrix by testing whether the rowSums() values equal the number of columns in the matrix, which indicates that every value in the row matches your desired value.
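To make the intermediate steps visible, here is a minimal sketch on the small 4 x 3 matrix from Sarah's reply. The names is.A, n.A and keep are introduced here purely for illustration and do not appear in the original posts.

```r
# Small matrix borrowed from Sarah's example earlier in the thread.
testdata <- matrix(c("A", "B", "C",
                     "B", "B", "B",
                     "C", "A", "A",
                     "A", "A", "A"), ncol = 3, byrow = TRUE)

# Element-wise comparison gives a logical matrix of the same shape.
is.A <- testdata == "A"

# rowSums() counts TRUE as 1 and FALSE as 0: the number of "A"s per row.
n.A <- rowSums(is.A)            # 1 0 2 3

# A row is all "A" exactly when its count equals the number of columns.
keep <- n.A == ncol(testdata)   # FALSE FALSE FALSE TRUE

# drop = FALSE keeps the result a matrix even when a single row matches.
testdata[keep, , drop = FALSE]
```

Only the fourth row survives, since it is the only one whose count of "A" entries equals ncol(testdata).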
# Create a 23000 x 220 matrix with random values.
set.seed(1)
testdata <- matrix(sample(c("A", "B", "C"), 23000*220, replace = TRUE), ncol = 220)
# Set 100 random rows to all "A"s
set.seed(2)
testdata[sample(23000, 100), ] <- rep("A", 220)
system.time(Sub1 <- testdata[apply(testdata, 1, function(x) all(x == "A")), , drop = FALSE])
   user  system elapsed
  0.454   0.047   0.503
system.time(Sub2 <- testdata[rowSums(testdata == "A") == ncol(testdata), , drop = FALSE])
   user  system elapsed
  0.089   0.001   0.090
str(Sub1)
chr [1:100, 1:220] "A" "A" "A" "A" "A" "A" "A" "A" ...
str(Sub2)
chr [1:100, 1:220] "A" "A" "A" "A" "A" "A" "A" "A" ...
identical(Sub1, Sub2)
[1] TRUE

See ?rowSums, which uses a .Internal, so is fast code.

Regards,
Marc Schwartz