Grep command

7 messages · Steven Yen, Jim Lemon, Omar André Gonzáles Díaz +4 more

#
Dear all
In the grep command below, is there a way to identify only "age" and
not "age2"? In other words, I would like to grep "age" and "age2"
separately, one at a time. Thanks.

x<-c("abc","def","rst","xyz","age","age2")
x

[1] "abc"  "def"  "rst"  "xyz"  "age"  "age2"

grep("age2",x)

[1] 6

grep("age",x) # I need to grab "age" only, not "age2"

[1] 5 6
#
Hi Steven,
If this is just a one-off, you could do this:

grepl("age",x) & nchar(x)<4

returning a logical vector containing TRUE for "age" but not "age2"

Jim
On Wed, May 4, 2016 at 3:45 PM, Steven Yen <syen04 at gmail.com> wrote:
#
Hi Steven,

grep uses regex... so you can use this:

grep("age$",x): this matches "a", then "g", then "e", and then requires the
string to end.  The "$" means the match must finish at the end of the string.
[1] 5
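
To illustrate the anchor idea on the vector from the original post, here is a minimal sketch. The result names (idx_end, idx_exact, idx_equal, hits) are only illustrative, and the "^" anchor and the plain comparison are alternatives that were not part of the suggestion above.

```r
x <- c("abc", "def", "rst", "xyz", "age", "age2")

# "$" anchors the match at the end of the string, so "age2" is excluded.
idx_end <- grep("age$", x)

# Adding "^" anchors the start as well, so only the exact string "age" matches.
idx_exact <- grep("^age$", x)

# For an exact match, a plain comparison works without any regex.
idx_equal <- which(x == "age")

# value = TRUE returns the matching strings instead of their positions.
hits <- grep("^age$", x, value = TRUE)
```

On this vector all three approaches select the fifth element only.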

2016-05-04 1:02 GMT-05:00 Jim Lemon <drjimlemon at gmail.com>:

  
  
#
Yes, but the answer is likely to depend on the actual patterns of strings in your real data, so the sooner you go find a book or tutorial on regular expressions the better.  This is decidedly not R specific and there are already lots of resources out there.

Given the example you provide,  the pattern "age$" should work. However, that is probably not sufficiently selective for a practical data set so start learning to fish (design regex patterns) yourself.
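
As a rough sketch of why "age$" may not be selective enough, consider a made-up set of variable names (vars is hypothetical, not from the original post). Anchoring both ends is one possible tightening, assuming an exact match is what is wanted.

```r
# Hypothetical variable names, chosen only to show over-matching.
vars <- c("age", "age2", "page", "mileage", "Age")

# "age$" matches anything that ends in "age".
loose <- grep("age$", vars, value = TRUE)

# "^age$" requires the whole string to be "age".
strict <- grep("^age$", vars, value = TRUE)

# ignore.case = TRUE additionally accepts "Age".
strict_nocase <- grep("^age$", vars, value = TRUE, ignore.case = TRUE)
```

Here loose contains "age", "page" and "mileage", while strict contains only "age".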
#
You asked this question yesterday and received responses to it. Is there a reason it is reposted?



#
@ Steven;

As is almost always the case I agree with Jeff. I found that reading Rhelp and attempting to answer regex-questions was the best method to learn them. In particular I found the postings by Gabor Grothendieck very helpful in getting some degree of competence in this area. I see that his grep-related postings still exceed my grep postings and I assure you that his will be more sophisticated than my efforts. I recommend the MarkMail Rhelp mirror interface as very useful in "mining" Rhelp for knowledge:

Gabor Grothendieck answers with either 'grep' or 'regex' in their body:

http://markmail.org/search/?q=list%3Aorg.r-project.r-help+list%3Agrep+list%3Aregex+from%3A%22Gabor+Grothendieck
#
No matter how expert you are at writing regular expressions,
it is important to list which sorts of strings you want matched
and which you do not want matched.  Saying you want to match
"age" but not "age2" leads to lots of possibilities.  Saying how
you want to categorize each string in a vector of strings like
the following would narrow things down.
   c("age", "ages ago", "age 60", "An aged man", "page", "Age", "age1",
      "age2",  "dark age", "the aGE")
Tell us the categorization you are thinking of and someone will be able to
translate that into a regular expression (or say that regular expressions
cannot do the job).
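
One way to explore the possibilities is to run a few candidate patterns over the test vector and compare which strings each one accepts. The sketch below is only an illustration: the four patterns and their names are assumptions about what a poster might mean, not a recommendation.

```r
s <- c("age", "ages ago", "age 60", "An aged man", "page", "Age", "age1",
       "age2", "dark age", "the aGE")

# Candidate interpretations of "match age" (names are illustrative).
patterns <- c(
  exact      = "^age$",       # the whole string is "age"
  ends_with  = "age$",        # the string ends in "age"
  whole_word = "\\bage\\b",   # "age" as a separate word anywhere
  age_digits = "^age[0-9]+$"  # "age" followed only by digits
)

# One logical column per pattern, one row per test string.
matches <- sapply(patterns, function(p) grepl(p, s))
rownames(matches) <- s
matches
```

Reading the resulting table row by row shows where the interpretations disagree, for example "page" and "dark age" are accepted by ends_with but not by exact.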


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, May 4, 2016 at 9:59 AM, David Winsemius <dwinsemius at comcast.net>
wrote: