Message-ID: <CAP01uRnEoxCAg23JC+Ont8C8erbU5AhcK29huW4m36KSbFyjRw@mail.gmail.com>
Date: 2014-08-09T13:01:34Z
From: Gabor Grothendieck
Subject: R Package for Text Manipulation
In-Reply-To: <CAM-xyZhpPrR_GByUBmQD1vvDOwcGQ7FOyj6VRL_Mma4OfpbVow@mail.gmail.com>

On Sat, Aug 9, 2014 at 8:15 AM, Omar André Gonzáles Díaz
<oma.gonzales at gmail.com> wrote:
> Hi all,
>
> I want to know where I can find a package that simulates the functions
> "Search and Replace" and "Find Words that contain - replace them with...",
> that we can use in Excel.
>
> I've looked in other places and they suggest "reshape2" by Hadley Wickham.
> However, I've investigated it and it's not exactly what I'm looking for (its
> main functions are "cast" and "melt", as you surely know).
>
> Could you help me, please? I want to download data from Google Analytics and
> clean it. What is the best approach?
>

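1. The gsubfn function in the gsubfn package can do that.  In the
commands below the pattern "\\S+" matches each word (a run of
non-whitespace characters), and gsubfn replaces each match with the
result of applying to it the function given in formula notation as the
second argument (x is the matched word):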

library(gsubfn) # home page at http://gsubfn.googlecode.com
s <- "The quick brown fox" # test data

# replace the word quick with QUICK

gsubfn("\\S+", ~ if (x == "quick") "QUICK" else x, s)
## [1] "The QUICK brown fox"

# replace words containing o with ?

gsubfn("\\S+", ~ if (grepl("o", x)) "?" else x, s)
## [1] "The quick ? ?"

2. It can also be done without packages:

# replace quick with QUICK

gsub("\\bquick\\b", "QUICK", s)
## [1] "The QUICK brown fox"

# or the following, which first splits s into a vector of words,
# operates on that vector and pastes it back into a single string at the end

words <- strsplit(s, "\\s+")[[1]]
paste(replace(words, words == "quick", "QUICK"), collapse = " ")
## [1] "The QUICK brown fox"

# replace words containing o with ?.  Use `words` from above.

paste(replace(words, grepl("o", words), "?"), collapse = " ")
## [1] "The quick ? ?"
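
3. Since the goal is to clean downloaded data, it may help that gsub
is vectorized, so the same calls work on a whole column at once.  The
following is only a minimal sketch: the data frame `ga`, its column
`page` and the strings being replaced are made up for illustration and
are not taken from any real Google Analytics export.

```r
# hypothetical data standing in for a downloaded report
ga <- data.frame(page = c("/home?utm=x", "/Blog/post-1", "/blog/post-2"),
                 stringsAsFactors = FALSE)

# literal search and replace: fixed = TRUE turns off regular expressions,
# so the "?" is treated as an ordinary character
ga$page <- gsub("?utm=x", "", ga$page, fixed = TRUE)

# case-insensitive search and replace
ga$page <- gsub("/blog/", "/articles/", ga$page, ignore.case = TRUE)
ga$page
## [1] "/home"            "/articles/post-1" "/articles/post-2"

# replace every word containing o with ? in each element of a vector
v <- c("The quick brown fox", "no dogs here")
out <- gsub("\\S*o\\S*", "?", v)
out
## [1] "The quick ? ?" "? ? here"
```

Note that fixed = TRUE and ignore.case = TRUE are not meant to be
combined; for a literal, case-insensitive replace one would have to
escape the special characters in the pattern instead.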

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com