R Package for Text Manipulation
3 messages · Omar André Gonzáles Díaz, Gabor Grothendieck, David Winsemius
On Sat, Aug 9, 2014 at 8:15 AM, Omar André Gonzáles Díaz
<oma.gonzales at gmail.com> wrote:
Hi all,
I want to know where I can find a package to simulate the functions
"Search and Replace" and "Find Words that contain - replace them with...",
that we can use in Excel.
I've looked in other places and they say: "reshape2" by Hadley Wickham.
However, I've investigated it and it's not exactly what I'm looking for (its main
functions are "cast" and "melt", sure you know them).
Could you help me please? I want to download data from Google Analytics and
clean it. What is the best approach?
1. The gsubfn function in the gsubfn package can do that. These commands
extract the words and then apply the function represented in formula
notation in the second argument to them:

library(gsubfn) # home page at http://gsubfn.googlecode.com

s <- "The quick brown fox" # test data

# replace the word quick with QUICK
gsubfn("\\S+", ~ if (x == "quick") "QUICK" else x, s)
## [1] "The QUICK brown fox"

# replace words containing o with ?
gsubfn("\\S+", ~ if (grepl("o", x)) "?" else x, s)
## [1] "The quick ? ?"

2. It can also be done without packages:

# replace quick with QUICK
gsub("\\bquick\\b", "QUICK", s)
## [1] "The QUICK brown fox"

# or the following, which first splits s into a vector of words and
# operates on that, pasting it back into a single string at the end
words <- strsplit(s, "\\s+")[[1]]
paste(replace(words, words == "quick", "QUICK"), collapse = " ")
## [1] "The QUICK brown fox"

# replace words containing o with ?. Use `words` from above.
paste(replace(words, grepl("o", words), "?"), collapse = " ")
## [1] "The quick ? ?"
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
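The examples above work on a single string, but the same base functions are vectorized, so they can be applied to a whole column of a data frame in the way Excel's "Search and Replace" works on a range of cells. The following is only a sketch: the data frame `ga` and its columns `source` and `visits` are made-up stand-ins for a Google Analytics export, not anything from the original question.

```r
# Hypothetical data standing in for a Google Analytics export
ga <- data.frame(
  source = c("google / organic", "google / cpc", "(direct) / (none)"),
  visits = c(120, 45, 80),
  stringsAsFactors = FALSE
)

# Literal "Search and Replace": fixed = TRUE treats the pattern as plain
# text, so characters such as "(" and ")" need no escaping
ga$source <- gsub("(direct)", "direct", ga$source, fixed = TRUE)

# "Find cells that contain ... and replace them with ...":
# grepl() flags the matching elements, which are then overwritten whole
ga$source[grepl("google", ga$source, fixed = TRUE)] <- "google"

ga$source
## [1] "google"          "google"          "direct / (none)"
```

Using `fixed = TRUE` is a design choice for literal text; drop it when the search term is meant to be a regular expression.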
On Aug 9, 2014, at 5:15 AM, Omar André Gonzáles Díaz wrote:
Hi all, I want to know where I can find a package to simulate the functions "Search and Replace" and "Find Words that contain - replace them with...", that we can use in Excel. I've looked in other places and they say: "reshape2" by Hadley Wickham. However, I've investigated it and it's not exactly what I'm looking for (its main functions are "cast" and "melt", sure you know them). Could you help me please? I want to download data from Google Analytics and clean it. What is the best approach?
That request is on the vague side. You are advised in the Posting Guide to include code that begins an analysis and then to request assistance with specific difficulties. (You are also asked to do this in a plain text message, since HTML tends to scramble messages.) The base package offers the `grep`, `sub`, and `gsub` functions, which bring the power of regular expressions to the R user. They are much more flexible than anything that Excel offers. Please look at:

?grep
?regex
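As a rough illustration of how the base functions named above differ, here is a small sketch. The vector `x` is invented test data, not taken from the original question.

```r
x <- c("apple pie", "banana split", "cherry tart")  # made-up test data

grep("an", x)                 # positions of the elements that match
## [1] 2
grep("an", x, value = TRUE)   # the matching elements themselves
## [1] "banana split"
grepl("an", x)                # one TRUE/FALSE per element
## [1] FALSE  TRUE FALSE

sub("a", "A", x)              # replaces only the first match per element
## [1] "Apple pie"    "bAnana split" "cherry tArt"
gsub("a", "A", x)             # replaces every match per element
## [1] "Apple pie"    "bAnAnA split" "cherry tArt"

# Backreferences \\1 and \\2 reuse the captured groups: swap the two words
gsub("^(\\w+) (\\w+)$", "\\2 \\1", x)
## [1] "pie apple"    "split banana" "tart cherry"
```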
And do:
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius Alameda, CA, USA