regexp capturing group in R

txt <- "blah blah start=20080101 end=20090224"
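# Replace the whole string with just the two captured groups, separated by a space
nums <- sub(".*start=(\\d+).*end=(\\d+).*", "\\1 \\2", txt, perl=TRUE)
# Same substitution, then split the result into two separate strings
nums <- strsplit(sub(".*start=(\\d+).*end=(\\d+).*", "\\1 \\2", txt, perl=TRUE), ' ')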
nums
[[1]]
[1] "20080101" "20090224"
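
The sub() approach above rewrites the string and then splits it. As an alternative, here is a minimal sketch using base R's regexec() and regmatches(), which return the full match followed by each parenthesized group. The variable names (parts, start, end) are only illustrative, and [0-9] is used in place of \\d so the pattern does not depend on perl=TRUE.

```r
txt <- "blah blah start=20080101 end=20090224"

# regexec() records the position of the full match and of each group
m <- regexec("start=([0-9]{8}).*end=([0-9]{8})", txt)

# regmatches() returns a list with one character vector per input string
parts <- regmatches(txt, m)[[1]]

# parts[1] is the whole match; the captured groups follow in order
start <- parts[2]
end   <- parts[3]
```

If the pattern does not match, parts is an empty character vector, so it is worth checking length(parts) before indexing.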
Hello,

Newbie question: how do you capture groups in a regexp in R?

Let's say I have txt="blah blah start=20080101 end=20090224".
I'd like to get the two dates start and end.

In Perl, one would say:

my ($start,$end) = ($txt =~ /start=(\d{8}).*end=(\d{8})/);

I've tried:

txt <- "blah blah start=20080101 end=20090224"
m <- regexpr("start=(\\d{8}).*end=(\\d{8})", txt, perl=TRUE)
dates <- substring(txt, m, m+attr(m,"match.length")-1)

but I get the whole matching substring...

Any idea?

~Pierre
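
For reference, the attempt above returns the whole match because the value of regexpr() and its "match.length" attribute describe the full match only. With perl=TRUE, the result also carries "capture.start" and "capture.length" attributes, one column per group. A minimal sketch along the lines of the attempt, assuming a single input string (cs and cl are illustrative names):

```r
txt <- "blah blah start=20080101 end=20090224"
m <- regexpr("start=(\\d{8}).*end=(\\d{8})", txt, perl = TRUE)

# One-row matrices with one column per capturing group
cs <- as.vector(attr(m, "capture.start"))
cl <- as.vector(attr(m, "capture.length"))

# Cut each group out of the original string
dates <- substring(txt, cs, cs + cl - 1)
```

Here dates should hold the two captured strings rather than the full matching substring.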

Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
