Message-ID: <971536df0902242106x333ff51dk31c131d6b37d2feb@mail.gmail.com>
Date: 2009-02-25T05:06:30Z
From: Gabor Grothendieck
Subject: regexp capturing group in R
In-Reply-To: <20090224172310.eabh80cpwwk8o44c@mail.demartines.com>

Try this:

library(gsubfn)

strapply("blah blah start=20080101 end=20090224",
         "start=(\\d{8}) end=(\\d{8})", c, perl = TRUE)[[1]]

or perhaps just:

strapply("blah blah start=20080101 end=20090224", "\\d{8}", perl = TRUE)[[1]]
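
If you would rather not load a package, here is a minimal base-R sketch. It assumes a version of R recent enough to provide regexec() and regmatches(), and it uses [0-9] in place of \\d so that it does not depend on perl = TRUE. The variable names parts and dates are only illustrative.

```r
txt <- "blah blah start=20080101 end=20090224"

# regexec() records the position of the whole match and of each
# parenthesized group (regexpr() only records the whole match).
m <- regexec("start=([0-9]{8}).*end=([0-9]{8})", txt)

# regmatches() returns a list with one character vector per input string:
# the first element is the whole match, the rest are the captured groups.
parts <- regmatches(txt, m)[[1]]

# Drop the whole match to keep only the two captured dates.
dates <- parts[-1]
dates
```

Here dates should hold the two captured strings, start first and end second.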


On Tue, Feb 24, 2009 at 7:23 PM,  <pierre at demartines.com> wrote:
> Hello,
>
> Newbie question: how do you capture groups in a regexp in R?
>
> Let's say I have txt="blah blah start=20080101 end=20090224".
> I'd like to get the two dates start and end.
>
> In Perl, one would say:
>
> my ($start,$end) = ($txt =~ /start=(\d{8}).*end=(\d{8})/);
>
> I've tried:
>
> txt <- "blah blah start=20080101 end=20090224"
> m <- regexpr("start=(\\d{8}).*end=(\\d{8})", txt, perl=TRUE);
> dates = substring(txt, m, m+attr(m,"match.length")-1);
>
> but I get the whole matching substring...
>
> Any idea?
>
> ~Pierre
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>