named character question
On Aug 12, 2012, at 8:33 PM, Erin Hodgess wrote:
Dear R People: Here is a goofy question: I want to extract the zip code from an address and here is my work so far:
> add1
                  results.formatted_address
"200 W Rosamond St, Houston, TX 77076, USA"
> add1[1][32:36]
<NA> <NA> <NA> <NA> <NA>
  NA   NA   NA   NA   NA
> str(add1)
 Named chr "200 W Rosamond St, Houston, TX 77076, USA"
 - attr(*, "names")= chr "results.formatted_address"
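The NAs appear because `[` indexes the elements of a vector, not the characters inside a string: add1 has length 1, so elements 32 through 36 do not exist. Character positions are handled by substr(). A minimal sketch, assuming the zip code always sits at positions 32 to 36 (true for this one address, unlikely to hold in general); the add1 object is rebuilt here from the printed output above:

```r
# Rebuild the named character vector shown in the question
add1 <- c(results.formatted_address = "200 W Rosamond St, Houston, TX 77076, USA")

length(add1)   # 1: a single string, so add1[1][32:36] asks for missing elements
nchar(add1)    # 41: the number of characters inside that string

# substr() works on character positions; unname() drops the element name
zip_fixed <- unname(substr(add1, 32, 36))
zip_fixed      # "77076"
```

Fixed positions break as soon as the street or city name changes length, which is why a pattern-based approach is more robust.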
> ttt <- "200 W Rosamond St, Houston, TX 77076, USA"
> sub("^.+,.+,\\s[[:alpha:]]*\\s([[:digit:]]{5}).+", "\\1", ttt)
[1] "77076"
You will need to determine whether all your addresses have two commas before
the two-letter state designation. You may not need as specific a
pattern as this. An alternative pattern:
> sub("^.+\\s[[:alpha:]]{2}\\s([[:digit:]]{5}).+", "\\1", ttt)
[1] "77076"
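One caveat with sub(): when the pattern does not match, it returns the input string unchanged, so a malformed address would silently come back as its own "zip code". A minimal sketch of guarding against that over a vector of addresses follows. The addrs vector is a hypothetical sample (only the first entry comes from this thread), and returning NA for non-matches is one possible choice, not the only one.

```r
# Hypothetical sample; only the first address is from the thread
addrs <- c("200 W Rosamond St, Houston, TX 77076, USA",
           "1600 Pennsylvania Ave NW, Washington, DC 20500, USA",
           "no zip code here")

# Same pattern as the alternative above: two letters, a space, five digits
pat <- "^.+\\s[[:alpha:]]{2}\\s([[:digit:]]{5}).+"

# grepl() reports which addresses the pattern actually matches
matched <- grepl(pat, addrs)

# Extract the captured digits where matched; otherwise return NA
zips <- ifelse(matched, sub(pat, "\\1", addrs), NA_character_)
zips
```

Inspecting addrs[!matched] afterwards shows which addresses need a different pattern.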
David Winsemius, MD Alameda, CA, USA