2013 Nov 06
Multiple String word replacements: Performance Issue
...ed as patterns look like this:
"\\bWORD1\\b|\\bWORD2\\b|\\bWORD3\\b|\\bWORD4\\b..."
Thus, those 'replacement vectors' are character vectors of length 1, each holding a single pattern string of up to 800 words.
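For illustration, here is a minimal sketch of how one such pattern string could be built from a plain word vector and applied in a single `gsub()` call. The names `words`, `text` and `cleaned`, the three sample words, the empty replacement string and the use of `perl = TRUE` are all assumptions made for this example and are not taken from the original code.

```r
# Hypothetical word list; the real vectors hold up to 800 words each.
words <- c("foo", "bar", "baz")

# Wrap every word in word boundaries and join them with "|",
# giving one character vector of length 1.
pattern <- paste0("\\b", words, "\\b", collapse = "|")

# Sample input (made up for this sketch).
text <- c("foo and food", "a bar, a BAZ")

# Remove every whole-word match, ignoring case.
# perl = TRUE is an assumption here; it selects PCRE semantics for \b.
cleaned <- gsub(pattern, "", text, ignore.case = TRUE, perl = TRUE)
```

Because of the `\\b` anchors, "food" is left untouched while the standalone "foo" is removed.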
**Main code:**
library("parallel")
library("stringr")
preprocessText<-function(x){
# Replace the 'html-and'
arguments<-list(pattern="\\&amp\\;",replacement="and",x=x, ignore.case=TRUE)
y<-do.call(gsub, arguments)
# Remove some special characters
arguments<-list(pattern="[...