Regexp: extract first occurrence of date in string
Use regexpr to get the offset of the match within the string and its length, and then use substr to extract it.
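A minimal sketch of that regexpr/substr approach, using only base R. The variable names m and first_date are just for illustration, and the date pattern is the one from the strapply examples in this thread; the no-match handling is an added assumption about what is wanted.

```r
txt <- "first date is 05.12.2009. Second date is 06.12.2009."

# regexpr returns the start position of the first match (-1 if none);
# the length of the match is stored in the "match.length" attribute
m <- regexpr("\\d{1,2}\\.\\d{1,2}\\.\\d{4}", txt)

if (m == -1) {
  first_date <- NA_character_          # no date found
} else {
  first_date <- substr(txt, m, m + attr(m, "match.length") - 1)
}
first_date

# In more recent versions of R, regmatches() wraps the same bookkeeping
first_date2 <- regmatches(txt, m)
```

Because regexpr only reports the first match, nothing extra is needed to skip the second date.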
On Sat, Jan 2, 2010 at 10:43 AM, johannes rara <johannesraja at gmail.com> wrote:
Thanks, is the same possible using basic gsub/sub/grep etc. functions? -J
2010/1/2 Gabor Grothendieck <ggrothendieck at gmail.com>:
Try this, which uses a slightly simpler regexp:
library(gsubfn)
strapply(txt, "(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*")[[1]]
[1] "05.12.2009"
Or we could convert it to Date class at the same time, where we have assumed month.day.year:
strapply(txt, "(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*", ~ as.Date(x, "%m.%d.%Y"))[[1]]
[1] "2009-05-12"
Or this even simpler regexp, extracting all the dates and then picking off the first:
strapply(txt, "\\d{1,2}\\.\\d{1,2}\\.\\d{4}")[[1]][1]
[1] "05.12.2009"
On Sat, Jan 2, 2010 at 10:08 AM, johannes rara <johannesraja at gmail.com> wrote:
I would like to extract the first date from a string:
txt <- "first date is 05.12.2009. Second date is 06.12.2009."
txt
[1] "first date is 05.12.2009. Second date is 06.12.2009."
I tried:
sub("^.*?\\s(\\d{1,2}\\.\\d{1,2}\\.\\d{4})", "\\1", txt, perl=TRUE)
[1] "05.12.2009. Second date is 06.12.2009."
How to modify this? -J
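One possible way to modify the sub() attempt above, as a sketch rather than the only answer: the pattern stops matching right after the first date, so everything following it is left in place. Appending .*$ makes the match consume the rest of the string, and the replacement then keeps only the captured date. The name first_date is illustrative.

```r
txt <- "first date is 05.12.2009. Second date is 06.12.2009."

# ^.*?  lazily skips up to the first whitespace followed by a date,
# (...) captures that date,
# .*$   swallows the remainder so it is dropped by the replacement
first_date <- sub("^.*?\\s(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*$", "\\1", txt, perl = TRUE)
first_date

# Caveat: if the string contains no date, sub() returns it unchanged
```

The lazy quantifier needs perl = TRUE here. Note that this pattern requires whitespace before the date, so a string that begins with a date would not match.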
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.