Set up a regular expression that keeps only the characters you want. This
example keeps alphanumerics, spaces, commas and periods:
> x <- readLines(textConnection('I discovered that the following works:
+ any(is.na(strsplit(readLines(FILE), "")))
+
+ I am wondering whether anyone has a better approach to this problem.
+
+ Dennis bullet ??????????????????
+
+ Dennis Fisher MD
+ P < (The "P Less Than" Company)
+ Phone: 1-866-PLessThan (1-866-753-7784)
+ Fax: 1-866-PLessThan (1-866-753-7784)
+ www.PLessThan.com'))
> closeAllConnections()
> # replace characters not matching alphanum, space, period, comma
> gsub("[^[:alnum:][:space:],.]", "", x) # regular expression to change
[1] "I discovered that the following works"
[2] " anyis.nastrsplitreadLinesFILE, "
[3] ""
[4] "I am wondering whether anyone has a better approach to this problem."
[5] ""
[6] "Dennis bullet "
[7] ""
[8] "Dennis Fisher MD"
[9] "P The P Less Than Company"
[10] "Phone 1866PLessThan 18667537784"
[11] "Fax 1866PLessThan 18667537784"
[12] "www.PLessThan.com"
>
>
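
If the goal is to detect the offending characters and warn, rather than
strip them, the same character class can be used with grep(). This is only a
minimal sketch, assuming the same allowed set as above (alphanumerics,
whitespace, commas, periods). find_disallowed() is a hypothetical helper name,
not a base R function, and the sample vector is made up for illustration:

```r
# Hypothetical helper (not from the original thread): report which lines
# contain characters outside the allowed set, and which characters they are.
find_disallowed <- function(lines, allowed = "[:alnum:][:space:],.") {
  # Build a negated character class from the allowed set
  pattern <- paste0("[^", allowed, "]")
  # Indices of lines containing at least one disallowed character
  hits <- grep(pattern, lines)
  # Extract every disallowed character from the offending lines
  found <- regmatches(lines[hits], gregexpr(pattern, lines[hits]))
  data.frame(
    line  = hits,
    chars = vapply(found, function(ch) paste(unique(ch), collapse = ""),
                   character(1)),
    stringsAsFactors = FALSE
  )
}

# Made-up sample input: one clean line, one with a bullet, one with "<"
x <- c("plain text, nothing odd.", "Dennis \u2022 bullet", "P < 0.05")
bad <- find_disallowed(x)
if (nrow(bad) > 0) {
  warning("disallowed characters on line(s): ",
          paste(bad$line, collapse = ", "))
}
```

In a real script you would pass readLines(FILE) instead of the sample vector
and decide at the warning whether to stop before calling the other program.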
On Thu, Feb 11, 2010 at 8:46 PM, Dennis Fisher <fisher at plessthan.com> wrote:
> Colleagues
>
> R 2.10.1 on a Mac
>
> I read in text files using readLines, then I process those files, then I use
> R to execute another program. Occasionally those files contain characters
> other than letters / numbers / routine punctuation marks. For example, a
> bullet (option-8 on a Mac) triggers the problem.
>
> Although R can read and process those characters, the other program cannot
> so I would like to identify these characters and exit gracefully with a
> warning.
>
> I discovered that the following works:
>        any(is.na(strsplit(readLines(FILE), "")))
>
> I am wondering whether anyone has a better approach to this problem.
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?