how to GREP out a string like this......THANKS.
4 messages · Hon Kit (Stephen) Wong, David Winsemius, William Dunlap +1 more

Dear All,

I hope you can help me with this simple problem. I have many thousands of lines like this:

NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156

I want to extract the string between the first pair of // separators, in this case Egfl6. How do I apply the grep function?

Thanks.

Stephen HK Wong
Research Associate, Cleary Lab
Lab Phone: 650-723-5340
MC 5457
Lokey Stem Cell Research Building
265 Campus Drive, Rm. G2035
Stanford, California 94305-5324
On May 20, 2013, at 4:45 PM, Hon Kit (Stephen) Wong wrote:
> Dear All, I hope you can help me with this simple problem. I have many thousands of lines like this:
> NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156
> I want to extract the string between the first pair of // separators, in this case Egfl6.

strsplit("NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156", split=" // ")[[1]][2]
[1] "Egfl6"

You can use:

lapply( lines, function(l) strsplit(l, " // ")[[1]][2] )
> How do I apply the grep function?

Well, grep only tells you which elements match, and you want a replacement or extraction function. sub or gsub would be possibilities, but regular-expression quantifiers are greedy by default, so it's a bit more difficult to constrain their targeting to only the text between the first and second "//".
David Winsemius Alameda, CA, USA
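For readers who do want the sub() route mentioned above, here is a minimal sketch rather than a definitive solution. It assumes every line contains at least two " // " separators, and it uses lazy quantifiers with perl = TRUE so that matching stops at the first separator instead of the last. The name second_field is only illustrative, and the second sample line is the variant used later in this thread.

```r
# Sample input: the line from the question plus a variant with a space in the field.
lines <- c(
  "NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156",
  "NM_019397 // Egfl7 domain // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54158"
)

# ^.*?    : lazily skip everything before the first " // "
# (.*?)   : lazily capture everything up to the next " // "
# .*$     : consume the rest of the line so the whole line is replaced
second_field <- sub("^.*? // (.*?) // .*$", "\\1", lines, perl = TRUE)

print(second_field)
```

Note that sub() returns a line unchanged when the pattern does not match, so lines with fewer than two separators would come back whole rather than as NA.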
You suggested
> lapply( lines, function(l) strsplit(l, " // ")[[1]][2] )

strsplit is vectorized, so the following is equivalent but simpler and quicker:

   lapply(strsplit(lines, " // "), function(x)x[2])

The OP probably wants a character vector, not a list, so use sapply or vapply (safer than sapply and a bit quicker). Any of the following would do:

   vapply(strsplit(lines, " // "), `[`, 2, FUN.VALUE="")
   vapply(strsplit(lines, " // "), function(x)x[2], FUN.VALUE="")
   sapply(strsplit(lines, " // "), `[`, 2) # wrong answer if length(lines)==0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
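The zero-length caveat above can be seen in a small sketch. It assumes an empty character vector as input, standing in for something like an empty file; the names res_sapply and res_vapply are only illustrative.

```r
# Hypothetical empty input, e.g. the result of reading an empty file.
lines <- character(0)
fields <- strsplit(lines, " // ")   # an empty list

# sapply has nothing to simplify, so it hands back the empty list as is.
res_sapply <- sapply(fields, `[`, 2)

# vapply is told up front that each result is a single string,
# so it returns an empty character vector.
res_vapply <- vapply(fields, `[`, FUN.VALUE = character(1), 2)

print(class(res_sapply))  # "list"
print(class(res_vapply))  # "character"
```

Code that later assumes a character vector would therefore keep working with the vapply result, while the sapply result changes type on empty input.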
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi,
Maybe this helps.
lines<- readLines(textConnection("NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156
NM_019397 // Egfl7 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54158"))
library(stringr)
word(lines,2,sep=" // ")
#[1] "Egfl6" "Egfl7"
lines1<- readLines(textConnection("NM_019397 // Egfl6 // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54156
NM_019397 // Egfl7 domain // EGF-like-domain, multiple 6 // X F5|X 71.5 cM // 54158"))
word(lines1,2,sep=" // ")
#[1] "Egfl6"        "Egfl7 domain"
A.K.