Hi,

I want to remove all punctuation characters from a string. I tried to use a regular expression, but it doesn't work.
Here is a sample of what I want:

str <- 'ABD - remove de punct, and dot characters.'
str <- gsub('[:punct:]','',str)
str
"ABD remove de punct and dot characters"

Is there any function that does this kind of thing?

Thanks to all.

Filipe Almeida
On Tue, 2006-05-09 at 16:50 +0100, Filipe Almeida wrote:
> Hi,
>
> I want to remove all punctuation characters from a string. I tried to use
> a regular expression, but it doesn't work.
> Here is a sample of what I want:
>
> str <- 'ABD - remove de punct, and dot characters.'
> str <- gsub('[:punct:]','',str)
> str
> "ABD remove de punct and dot characters"
>
> Is there any function that does this kind of thing?
>
> Thanks to all.
>
> Filipe Almeida

You almost have it. You just need to double the brackets:

> str
[1] "ABD - remove de punct, and dot characters."

> gsub("[[:punct:]]", "", str)
[1] "ABD  remove de punct and dot characters"

Note the following in ?regex:

  For example, [[:alnum:]] means [0-9A-Za-z], except the latter depends
  upon the locale and the character encoding, whereas the former is
  independent of locale and character set. (Note that the brackets in
  these class names are part of the symbolic names, and must be included
  in addition to the brackets delimiting the bracket list.) Most
  metacharacters lose their special meaning inside lists. To include a
  literal ], place it first in the list. Similarly, to include a
  literal ^, place it anywhere but first. Finally, to include a
  literal -, place it first or last. (Only these and \ remain special
  inside character classes.)

HTH,

Marc Schwartz
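To make the difference concrete, here is a minimal sketch comparing the two patterns on the sample string from the question. It assumes R's default regex engine (not perl = TRUE), and the variable names `s`, `wrong`, `right` and `some` are introduced here for illustration only; `s` is used instead of `str` to avoid masking the built-in str() function.

```r
# Sample string from the question
s <- "ABD - remove de punct, and dot characters."

# Single brackets: this is an ordinary bracket list holding the literal
# characters ':', 'p', 'u', 'n', 'c' and 't', so those characters are
# deleted and the punctuation is left alone.
wrong <- gsub("[:punct:]", "", s)

# Double brackets: the class name [:punct:] placed inside a bracket list.
right <- gsub("[[:punct:]]", "", s)

# A bracket list with literal characters only: '.' is not special inside
# a list, and '-' is literal because it is placed last.
some <- gsub("[.-]", "", s)

print(wrong)
print(right)
print(some)
```

Note that deleting the hyphen leaves the two spaces that surrounded it, so `right` has a double space between "ABD" and "remove".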
Try

gsub('[[:punct:]]', '', str)

-Christos

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Filipe Almeida
Sent: Tuesday, May 09, 2006 11:51 AM
To: r-help at stat.math.ethz.ch
Subject: [R] remove Punctuation characters

[...]

str <- 'ABD - remove de punct, and dot characters.'
str <- gsub('[:punct:]','',str)

[...]

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
You need double [[ and ]]:

gsub('[[:punct:]]','',str)

On 5/9/06, Filipe Almeida <milheiros at gmail.com> wrote:
> [...]
> str <- 'ABD - remove de punct, and dot characters.'
> str <- gsub('[:punct:]','',str)
> [...]
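One detail the replies leave open: the desired output in the question has single spaces throughout, but removing the hyphen leaves two spaces behind. The following is a sketch of one possible follow-up step, not something proposed in the thread; the names `s`, `no_punct` and `clean` are illustrative, and trimws() assumes R 3.2.0 or later.

```r
s <- "ABD - remove de punct, and dot characters."

# Step 1: delete every punctuation character
no_punct <- gsub("[[:punct:]]", "", s)

# Step 2: collapse each run of whitespace into a single space
clean <- gsub("[[:space:]]+", " ", no_punct)

# Step 3: drop any leading or trailing whitespace
clean <- trimws(clean)

print(clean)
```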