regexpr - ignore all special characters and punctuation in a string
On Apr 20, 2015, at 8:59 AM, Dimitri Liakhovitski <dimitri.liakhovitski at gmail.com> wrote:

Hello!

Please point me in the right direction. I need to match 2 strings, but focusing ONLY on characters, ignoring all special characters and punctuation signs, including (), "", etc.

For example, I want the following to return TRUE:

"What a nice day today! - Story of happiness: Part 2." ==
"What a nice day today: Story of happiness (Part 2)"

--
Thank you!
Dimitri Liakhovitski

Look at ?agrep:

Vec1 <- "What a nice day today! - Story of happiness: Part 2."
Vec2 <- "What a nice day today: Story of happiness (Part 2)"

# Match the words, not the punctuation.
# Not fully tested
agrep("What a nice day today Story of happiness Part 2", c(Vec1, Vec2))
[1] 1 2
agrep("What a nice day today Story of happiness Part 2", c(Vec1, Vec2),
      value = TRUE)
[1] "What a nice day today! - Story of happiness: Part 2."
[2] "What a nice day today: Story of happiness (Part 2)"

Also, possibly:

http://cran.r-project.org/web/packages/stringdist

Regards,

Marc Schwartz
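
If an exact TRUE/FALSE comparison is wanted rather than an approximate match, one possible alternative (not the approach suggested above) is to strip the punctuation from both strings first and then compare what is left. The sketch below uses only base R; the helper name normalize_text is hypothetical, and it assumes that "special characters" means whatever the POSIX class [[:punct:]] covers in the current locale. Not fully tested beyond the two example strings.

```r
# Hypothetical helper (not from the thread): replace punctuation with spaces,
# collapse repeated whitespace, and trim, so only the words are compared.
normalize_text <- function(x) {
  x <- gsub("[[:punct:]]+", " ", x)  # runs of punctuation become one space
  x <- gsub("[[:space:]]+", " ", x)  # collapse repeated whitespace
  trimws(x)                          # drop leading/trailing spaces
}

Vec1 <- "What a nice day today! - Story of happiness: Part 2."
Vec2 <- "What a nice day today: Story of happiness (Part 2)"

normalize_text(Vec1) == normalize_text(Vec2)
# [1] TRUE
```

Unlike agrep, this comparison is strict: a single differing letter or a difference in case gives FALSE. Wrapping both sides in tolower() would relax the case requirement if that is desired.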