Grabbing Specific Words from Content (basic text mining)
On Mon, Jan 14, 2013 at 4:30 AM, Sachinthaka Abeywardana
<sachin.abeywardana at gmail.com> wrote:
> Hi all,
>
> Suppose I have a data frame with mixed content (name, age and address).
>
> a <- "Name: John Smith Age: 35 Address: 32, street, sub, something"
> b <- data.frame(a)
>
> 1. I want to extract the name, age and address separately from this
>    data frame (which may contain more people).
> 2. Also, just in case I have to deal with it, how would the syntax
>    change if I had "Name" as opposed to "Name:" (without the colon)?

Try this:
library(gsubfn)
a <- "Name: John Smith Age: 35 Address: 32, street, sub, something"
b <- data.frame(a)
strapplyc(as.character(b$a), "Name: (.*) Age: (.*) Address: (.*)")

[[1]]
[1] "John Smith"                 "35"
[3] "32, street, sub, something"

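If the data frame holds several people and you want separate columns rather than a list, something along these lines might work in base R. This is only a sketch: the second sample row, the column names (name, age, address) and the helper variables pat, m and parsed are made up for illustration, and it assumes every row matches the pattern. The ":?" makes each colon optional, so rows with "Name:" and rows with "Name" should both be handled.

```r
# Hypothetical sample data: one record with colons, one without.
b <- data.frame(
  a = c("Name: John Smith Age: 35 Address: 32, street, sub, something",
        "Name Jane Doe Age 41 Address 7, road, town"),
  stringsAsFactors = FALSE
)

# ":?" makes each colon optional, so both spellings match.
pat <- "Name:? (.*) Age:? (.*) Address:? (.*)"

# regexec() locates the full match and each capture group;
# regmatches() extracts the corresponding substrings per row.
m <- regmatches(b$a, regexec(pat, b$a))

# Drop the full match (first element) and bind the groups into columns.
# Assumption: every row matched; unmatched rows would need handling first.
parsed <- as.data.frame(do.call(rbind, lapply(m, `[`, -1)),
                        stringsAsFactors = FALSE)
names(parsed) <- c("name", "age", "address")
parsed$age <- as.integer(parsed$age)
```

Here parsed has one row per person, with age converted to integer; the other two columns stay character.
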
Without the colons, drop them from the pattern as well:

a. <- "Name John Smith Age 35 Address 32, street, sub, something"
b. <- data.frame(a.)
strapplyc(as.character(b.$a.), "Name (.*) Age (.*) Address (.*)")

[[1]]
[1] "John Smith"                 "35"
[3] "32, street, sub, something"

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com