
Numbers in a string

In S+, strsplit() has a keep=TRUE/FALSE argument that
specifies whether to return the substrings that match
the pattern or the substrings between matches of the
pattern (the default).  E.g.,
"AB15E9SDF654VKBN?dvb.65")
[[1]]:
[1] "11"       "5.31e+34" "1.45"    

[[2]]:
[1] "15"  "9"   "654" "65"

[[1]]:
[1] "abcde. " " abc "   ", ("     ")"      

[[2]]:
[1] "AB"        "E"         "SDF"       "VKBN?dvb."

In R and S+, gregexpr() can tell you the start points
and lengths of each match, but it is a pain to
pass this information to substring() to get the
matches themselves.  Should [g]regexpr() have a
value= argument like grep() has?
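
For reference, here is a minimal sketch of that gregexpr()/substring()
bookkeeping in base R.  The pattern "[0-9]+" and the helper name
extract_matches() are assumptions made for illustration; only the
input string comes from the example above.

```r
# Input string taken from the example above.
x <- "AB15E9SDF654VKBN?dvb.65"

# gregexpr() returns, per input string, the match start positions
# with a "match.length" attribute.  The pattern is an assumption.
m <- gregexpr("[0-9]+", x)

# Hypothetical helper: turn one gregexpr() result into the matched text.
extract_matches <- function(text, match) {
  starts <- as.integer(match)
  if (starts[1] == -1L) return(character(0))  # -1 means "no match"
  lens <- attr(match, "match.length")
  substring(text, starts, starts + lens - 1L)
}

matches <- mapply(extract_matches, x, m,
                  SIMPLIFY = FALSE, USE.NAMES = FALSE)
print(matches)
```

In versions of R that provide regmatches(), regmatches(x, m) does the
same extraction in one call.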

In R, the gsubfn package can do this sort of thing.
I don't know if it is worth adding more to base R's
strsplit().
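
A hedged sketch of the gsubfn route, assuming the package is installed
and again assuming the pattern "[0-9]+": strapply() applies a function
to each match and returns the results per input string, so passing c
simply collects the matches.

```r
# Input string from the example above; the pattern is an assumption.
x <- "AB15E9SDF654VKBN?dvb.65"

if (requireNamespace("gsubfn", quietly = TRUE)) {
  # Collect every run of digits in each element of x.
  res <- gsubfn::strapply(x, "[0-9]+", c)
  print(res)
} else {
  # The package is optional, so skip quietly when it is absent.
  res <- NULL
  message("gsubfn is not installed; skipping the example")
}
```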

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com