Need content_transformer() called by tm_map() to change non-letters to spaces
The regex "[^a-zA-Z]" reads as "not a letter", so passing it as the pattern to your existing toSpace replaces every non-letter with a space.
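A minimal sketch of how that could look, assuming the tm package is installed. The helper name nonLettersToSpace and the two-document corpus are made up for illustration; toSpace is the same transformer as in the quoted message below.

```r
# The substitution itself is plain base R: every character that is
# not in a-z or A-Z becomes a single space.
nonLettersToSpace <- function(x) gsub("[^a-zA-Z]", " ", x)  # hypothetical helper
nonLettersToSpace("a1b")

# The same pattern passed through tm. The corpus here is invented
# purely for demonstration; substitute your own 'docs'.
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  docs <- VCorpus(VectorSource(c("e-mail: a@b.com", "3 cats/2 dogs")))
  toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
  docs <- tm_map(docs, toSpace, "[^a-zA-Z]")
  print(content(docs[[1]]))
}
```

Note that each non-letter becomes its own space, so runs of digits or punctuation leave runs of spaces; tm's stripWhitespace can collapse them afterwards if that matters.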
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On April 23, 2015 1:10:41 PM PDT, Mike <mikehall at y7mail.com> wrote:
Hello, in the following code, any characters matching "/|@| \\|" will be changed to a space.
library(tm)
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/|@| \\|")
What code would transform all non-letters to a space? (What goes where the x's are?) It is very difficult to put all non-letters in a string, so I'm doing the opposite of the above.
toSpace_2 <- content_transformer(function xxxxxxxxxxxxxxxxxxxxxxx))
docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")
This needs to be done by a content_transformer() function to maintain the integrity of docs. Thanks