Extracting everything between two symbols in a string

2 messages · Gianluca Rossi, Rui Barradas

#
Hello,

I have a vector containing some names. I want to extract the title from 
every row, basically everything between the ", " (including the white 
space) and the "."

     > head(combi$Name)
     [1] "Braund, Mr. Owen Harris"
     [2] "Cumings, Mrs. John Bradley (Florence Briggs Thayer)"
     [3] "Heikkinen, Miss. Laina"
     [4] "Futrelle, Mrs. Jacques Heath (Lily May Peel)"
     [5] "Allen, Mr. William Henry"
     [6] "Moran, Mr. James"

I suppose grep with the argument `value = TRUE` might be useful, but I 
am having difficulty finding the right regular expression for what I 
need.

     combi$Title <- grep("", combi$Name, value = TRUE)

Many thanks,

Gianluca
#
Hello,

Try the following.

     x <- "Braund, Mr. Owen Harris"
     sub("^.*, (M[[:alpha:]]*)\\..*$", "\\1", x)
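
The pattern above only matches titles that begin with "M". As a minimal sketch, assuming the title is always the text between the first ", " and the next ".", the same idea can be applied to the whole vector without that restriction. The vector `name` below is a hypothetical stand-in for combi$Name, filled with the six names from the question.

```r
# Hypothetical stand-in for combi$Name, using the names shown in the question
name <- c("Braund, Mr. Owen Harris",
          "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
          "Heikkinen, Miss. Laina",
          "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
          "Allen, Mr. William Henry",
          "Moran, Mr. James")

# ^[^,]*,   the surname: everything up to the first comma, then ", "
# ([^.]*)   the title: everything up to (not including) the next "."
# \\..*$    the "." itself and the rest of the string
title <- sub("^[^,]*, ([^.]*)\\..*$", "\\1", name)
title
# [1] "Mr"   "Mrs"  "Miss" "Mrs"  "Mr"   "Mr"
```

Note that sub() returns an element unchanged when the pattern does not match, so names without a ", " followed by a "." would come back whole and may need checking separately.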


Hope this helps,

Rui Barradas

On 16-02-2014 12:50, Gianluca Rossi wrote: