The following code works, to gsub single quotes to double quotes:

    line <- gsub("'", '"', line)

(that's a single quote within doubles, then a double within singles, if
your viewer's font is not good).

But The R Language Manual tells me that

    Quotes and other special characters within strings
    are specified using escape sequences:
    \' single quote
    \" double quote

so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any
other number of backslashes (have tried all up to n=6 just for good
measure).

BTW is it documented anywhere that you need four backslashes in an RE to
match one in the target, when it is being passed as an argument to gsub
or grep? How would I know how many levels of doubling up to use for any
other functions? (I got to 4 consecutive \ by trial and error in this
case, but have a dim memory of having read about it somewhere.)

TIA

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
On Tue, 12 Aug 2003, Simon Fear wrote:

> The following code works, to gsub single quotes to double quotes:
>
>     line <- gsub("'", '"', line)
>
> (that's a single quote within doubles then a double within singles if
> your viewer's font is not good).
>
> But The R Language Manual tells me that
>
>     Quotes and other special characters within strings
>     are specified using escape sequences:
>     \' single quote
>     \" double quote
>
> so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any
> other number of backslashes (have tried all up to n=6 just for good
> measure).
>
> BTW is it documented anywhere that you need four backslashes in an RE to
> match one in the target, when it is being passed as an argument to gsub
> or grep?

It's not true, so I hope it is not documented anywhere. You may need 6,
as in the following from methods():

    res <- sort(grep(gsub("([.[])", "\\\\\\1", name), an, value = TRUE))

since that is \\ \1 without the space. Each backslash in the target only
needs to be doubled.

In your example

    gsub("\'", "\"", line)   or even   gsub("'", "\"", line)

is all you need: only R strings need the escape.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
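To see why no extra backslashes are needed in the original example, and where the six in the methods() line come from, it can help to count characters after R has parsed each string literal. What follows is only a minimal sketch: the values of `line` and `name` are made up for illustration and are not the ones methods() actually uses.

```r
# "\'" and "'" are the same one-character string once R has parsed them.
stopifnot(identical("\'", "'"), nchar("\'") == 1)

# So the single-backslash call is equivalent to the one with no backslashes.
line <- "it's 'quoted'"                  # hypothetical input
out  <- gsub("\'", "\"", line)
stopifnot(identical(out, gsub("'", '"', line)))
writeLines(out)                          # it"s "quoted"

# Six backslashes in the source are three in the string: \\ followed by \1.
repl <- "\\\\\\1"
nchar(repl)                              # 4 characters: three backslashes and "1"
writeLines(repl)                         # \\\1

# In the replacement, \\ yields a literal backslash and \1 the captured group,
# so every "." or "[" in name gets a backslash put in front of it.
name    <- "print.default"               # hypothetical value
escaped <- gsub("([.[])", "\\\\\\1", name)
writeLines(escaped)                      # print\.default
```

writeLines() is used rather than print() because print() re-escapes backslashes and quotes, which makes the count look doubled again.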
"Simon Fear" <Simon.Fear at synequanon.com> writes:> The following code works, to gsub single quotes to double quotes: > > line <- gsub("'", '"', line) > > (that's a single quote within doubles then a double within singles if > your > viewer's font is not good). > > But The R Language Manual tells me that > > Quotes and other special characters within strings > are specified using escape sequences: > \' single quote > \" double quote > > so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any > other number of backslashes (have tried all up to n=6 just for good > measure).There's a backslash missing in the replacement. This works: line <- "ab\\\'cd" gsub("\\\\'", "\\\\\"", line) and will replace \' with \"> BTW is it documented anywhere that you need four backslashes in an RE to > match one in the target, when it is being passed as an argument to gsub > or > grep? How would I know how many levels of doubling up to use for any > other > functions? (I got to 4 consecutive \ by trial and error in this case, > but > have a dim memory of having read about it somewhere.)There are two levels because backslashes are escape characters both to R strings and regular expressions. So in the above, "line" is ab\'cd and the match pattern is \\' which matches \' and the replacement is \\" which becomes \" More interesting is> gsub("\\'", "a", line)[1] "ab\\'cda"> gsub("\\'", "a", line, perl=T)[1] "ab\\acd" so \' matches a single quote with PCRE but not with ordinary RE. (Yes, there's a reason...) -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Thank you. The single-backslash version, the first thing I tried (I
thought), works just fine when I copy and paste, ergo I must have got
confused by some stupid typo of mine. Sorry to waste everyone's time over
this.

(Still, I am probably not the only confused user when it comes to RE
handling - I hope the examples posted will be of as much use to others as
they are to me.)

Simon

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: 12 August 2003 17:13
> To: Simon Fear
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] grep and gsub on backslash and quotes
>
> In your example
>
>     gsub("\'", "\"", line)   or even   gsub("'", "\"", line)
>
> is all you need: only R strings need the escape.