Dear All,

I have these data:

    exampledata <- c("This is one item", "This is Another One", "And so is This")

I would like to find each occurrence of a blank space followed by a capital letter and replace it with a blank space, a left curly brace, the respective capital letter, and then a right curly brace.

I thought the following would do:

    gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata, ignore.case=FALSE)

Unfortunately, the actual output was:

    "This {i}s {o}ne {i}tem"  "This {i}s {A}nother {O}ne"  "And {s}o {i}s {T}his"

But what I actually wanted was:

    "This is one item"  "This is {A}nother {O}ne"  "And so is {T}his"

Can anyone tell me what I should change? It should be fairly easy for people with more experience than me in using regular expressions, I guess.

Thanks,
Roland

P.S. The background is my bibliography file for BibTeX. If the title field has some content like "An analysis of Denmark", it would actually turn out to be "An analysis of denmark" in my dvi document. Of course, R is not the appropriate tool for this. But apart from the little problem outlined above, I had a function doing what I wanted in less than 10 minutes.

+++++
This mail has been sent through the MPI for Demographic Rese...{{dropped}}
This appears to be a bug. Please try

    gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata, perl=TRUE)

On Thu, 22 Jul 2004, Rau, Roland wrote:

> I thought the following will do:
> gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata,
> ignore.case=FALSE)
>
> Unfortunately, the actual output was:
> "This {i}s {o}ne {i}tem" "This {i}s {A}nother {O}ne" "And {s}o {i}s
> {T}his"

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
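
For readers following along, the suggestion above can be tried as a self-contained snippet. This is only a sketch built from the data in the original post; the variable name `braced` is introduced here for illustration and does not come from the thread.

```r
# Data from the original post
exampledata <- c("This is one item", "This is Another One", "And so is This")

# perl = TRUE hands the pattern to the PCRE engine, where the range A-Z
# is compared by character code rather than by the locale's sort order.
braced <- gsub(pattern = " ([A-Z])", replacement = " {\\1}",
               x = exampledata, perl = TRUE)

print(braced)
# "This is one item"  "This is {A}nother {O}ne"  "And so is {T}his"
```

Only the capital letters that follow a blank are wrapped; the leading "T" and "A" of each string are left alone because no blank precedes them.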
Another solution (one that is also correct in locales other than C, since I see you are not in an English locale):

    gsub(pattern = " ([[:upper:]])", replacement = " {\\1}", x=exampledata)

I think this _is_ the problem, as in your locale (and in en_GB) the sort order is probably something like

    aAbB...zZ

Or just try the C locale.

On Thu, 22 Jul 2004, Prof Brian Ripley wrote:

> This appears to be a bug. Please try
>
> gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata, perl=TRUE)

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
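
The locale explanation above can be explored with a short sketch. If the collation order really is aAbB...zZ, then a range running from A to Z would also take in the lowercase letters b through z, which matches the unwanted `{i}`, `{o}` and `{s}` in the original output. The variable names `letters_mixed` and `braced` are illustrative assumptions, and what `sort()` prints depends on the machine's locale, so no output is shown for it.

```r
# Data from the original post
exampledata <- c("This is one item", "This is Another One", "And so is This")

# Show which collation locale is active and how it orders mixed-case
# letters. The result is locale-dependent (e.g. interleaved aAbB... or
# all capitals first), so it is only printed, not assumed.
print(Sys.getlocale("LC_COLLATE"))
letters_mixed <- c("Z", "a", "B", "b", "A", "z")
print(sort(letters_mixed))

# The POSIX class [[:upper:]] names "uppercase letters" directly instead
# of relying on a range between two endpoints.
braced <- gsub(pattern = " ([[:upper:]])", replacement = " {\\1}",
               x = exampledata)
print(braced)
# "This is one item"  "This is {A}nother {O}ne"  "And so is {T}his"
```

Because the class does not depend on where A and Z fall in the sort order, it sidesteps the question of how a given regex engine interprets ranges.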
Dear all,

thanks to the help of Prof. Brian Ripley, my little script is now working. I now use:

    gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata, perl=TRUE)

instead of:

    gsub(pattern = " ([A-Z])", replacement = " {\\1}", x=exampledata, ignore.case=FALSE)

With regard to Claus Ekstroem's question whether this is a problem of the version of R I am running: at work I am using R 1.8.1 on Win32; at home I tried it with R 1.9.0 on Linux (Mandrake 10), and the results were the same. However, the version of R does seem to be relevant: in an older version (R 1.4.1) the "solution" did not work, failing with "unused argument(s) (perl ...)".

Thanks again,
Roland

> -----Original Message-----
> From: Prof Brian Ripley [SMTP:ripley at stats.ox.ac.uk]
> Sent: Thursday, July 22, 2004 7:58 PM
> To: Rau, Roland
> Cc: 'r-help at stat.math.ethz.ch'
> Subject: Re: [R] Replace only Capital Letters
>
> Another solution (that is correct in other locales than C, since I see
> you are not in an English locale).
>
> gsub(pattern = " ([[:upper:]])", replacement = " {\\1}", x=exampledata)
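
For the BibTeX use case mentioned in the original P.S., the fix could be wrapped in a small function. This is a minimal sketch, not Roland's actual function (which was not posted): the name `protect_capitals` and its argument `titles` are hypothetical, and it uses the `[[:upper:]]` variant so that it does not rely on the `perl` argument.

```r
# Hypothetical helper (name and signature are not from the thread):
# wrap every capital letter that follows a blank in curly braces so
# that BibTeX keeps its case in title fields.
protect_capitals <- function(titles) {
  gsub(pattern = " ([[:upper:]])", replacement = " {\\1}", x = titles)
}

title_out <- protect_capitals("An analysis of Denmark")
print(title_out)
# "An analysis of {D}enmark"
```

The first letter of the title is not wrapped, since the pattern requires a preceding blank; that is the behaviour asked for in the original post.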