Mike
2015-Apr-23 20:10 UTC
[R] Need content_transformer() called by tm_map() to change non-letters to spaces
Hello,

In the following code, any characters matching "/|@| \\|" will be changed to a space.

> library(tm)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@| \\|")

What code would transform all non-letters to a space? (What goes where the xxxxx's are?) It is very difficult to put all non-letters in a string, so I'm doing the opposite of the above.

> toSpace_2 <- content_transformer(function xxxxxxxxxxxxxxxxxxxxxxx))
> docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")

This needs to be done by a content_transformer() function to maintain the integrity of docs.

Thanks
Jeff Newmiller
2015-Apr-24 04:09 UTC
[R] Need content_transformer() called by tm_map() to change non-letters to spaces
The regex "[^a-zA-Z]" reads as "not a letter" (ASCII letters only), so you can pass it as the pattern to your existing toSpace transformer.
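
A minimal sketch of how that pattern might be used, assuming only ASCII letters should be kept. The sample strings and the helper name non_letters_to_space are made up for illustration and are not from the original post; the tm part runs only if the tm package is installed.

```r
# Hypothetical helper: replace every character that is not an
# ASCII letter with a single space.
non_letters_to_space <- function(x) gsub("[^a-zA-Z]", " ", x)

# Made-up sample input for illustration.
sample_text <- c("e-mail: me@example.com", "R2D2 / C3PO | 42")
cleaned <- non_letters_to_space(sample_text)
print(cleaned)

# With tm installed, the same pattern can be passed through the
# content_transformer() wrapper from the question.
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  docs <- VCorpus(VectorSource(sample_text))
  toSpace_2 <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
  docs <- tm_map(docs, toSpace_2, "[^a-zA-Z]")
  print(as.character(docs[[1]]))
}
```

Note that digits, punctuation, and accented letters all become spaces with this pattern. If non-ASCII letters should survive, the POSIX class "[^[:alpha:]]" may be a better fit, depending on the locale.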