how to get to interesting part of pattern match

4 messages · Vadim Ogranovich, Jean Eid, Gabor Grothendieck +1 more

#
Hi,

I am looking for a way to extract an "interesting" part of the match to
a regular expression. For example the pattern "[./](*.)" matches a
substring that begins with either "." or "/" followed by anything. I am
interested in this "anything" w/o the "." or "/" prefix. If say I match
the pattern against "abc/foo" I want to get "foo", not "/foo". In Perl
one can simply wrap the "interesting" part in () and get it out of the
match. Is it possible to do a similar thing in R?

There seems to be a way to refer to the match, see below, but I couldn't
figure out how to make gsub return it.
> gsub("[./](*.)", "\\1", "abc/foo")
[1] "abcfoo"


Thanks,
Vadim
#
sub(".*/", "", "abc/foo")
[1] "foo"


Jean
On Thu, 18 Nov 2004, Vadim Ogranovich wrote:

            
#
Vadim Ogranovich <vograno <at> evafunds.com> writes:


: I am looking for a way to extract an "interesting" part of the match to
: a regular expression. For example the pattern "[./](*.)" matches a
: substring that begins with either "." or "/" followed by anything. I am
: interested in this "anything" w/o the "." or "/" prefix. If say I match
: the pattern against "abc/foo" I want to get "foo", not "/foo". In Perl
: one can simply wrap the "interesting" part in () and get it out of the
: match. Is it possible to do a similar thing in R?
: 
: There seems to be a way to refer to the match, see below, but I couldn't
: figure out how to make gsub return it.
: > gsub("[./](*.)", "\\1", "abc/foo")
: [1] "abcfoo"

Assuming what was meant is the following (dot and star are 
transposed and gsub is sub):

	sub("[./](.*)", "\\1", "abc/foo")

then the regular expression matches "/foo" and the backreference \\1
contains "foo", so sub replaces "/foo" with "foo" and leaves the
unmatched prefix "abc" in place, which is why it returns "abcfoo".

To get just foo ensure that your regular expression matches 
the entire string so that the entire string is replaced with
the backreference:

	sub("[^./]*[./](.*)", "\\1", "abc/foo")
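
A minimal sketch of the difference between the two calls above, run side by side. The variable names x, partial and full are only illustrative and are not from the original post:

```r
x <- "abc/foo"

# Partial match: only "/foo" is matched and replaced by the captured
# "foo", so the unmatched prefix "abc" survives in the result.
partial <- sub("[./](.*)", "\\1", x)

# Full match: "[^./]*" consumes the prefix, so the entire string is
# replaced by the contents of the capture group.
full <- sub("[^./]*[./](.*)", "\\1", x)

print(partial)  # [1] "abcfoo"
print(full)     # [1] "foo"
```

Note that "[^./]*" stops at the first "." or "/", so for an input with several separators, such as "a.b/c", this pattern keeps everything after the first one ("b/c").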
#
Vadim Ogranovich wrote:

            
what about:

gsub(".*[./](.*)", "\\1", "abc/foo")

[1] "foo"

or try:

strsplit("abc/foo","/")[[1]][2]

[1] "foo"
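
For something closer to Perl's habit of pulling the parenthesised group straight out of the match, one possible sketch uses regexec() together with regmatches(). This assumes a version of R recent enough to provide both functions in base; the variable names are only illustrative:

```r
x <- "abc/foo"

# regexec() records the position of the whole match and of each
# parenthesised group; regmatches() turns those into substrings.
m <- regmatches(x, regexec("[./](.*)", x))

whole <- m[[1]][1]  # the full match, including the separator
group <- m[[1]][2]  # the first capture group only

print(whole)  # [1] "/foo"
print(group)  # [1] "foo"
```

Here the pattern does not need to cover the entire string, because the group is read directly instead of being substituted back in.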

Peter Wolf