Dear All,
I have these data:
exampledata <- c("This is one item", "This is Another One", "And so is This")
I would like to find each occurrence of a blank space followed by a capital
letter and replace it with a blank space, a left curly brace, the respective
capital letter, and then a right curly brace.
I thought the following would do:
gsub(pattern = " ([A-Z])", replacement = " {\\1}",
x=exampledata,
ignore.case=FALSE)
Unfortunately, the actual output was:
"This {i}s {o}ne {i}tem" "This {i}s {A}nother {O}ne" "And {s}o {i}s {T}his"
But what I wanted was actually:
"This is one item" "This is {A}nother {O}ne" "And so is {T}his"
Can anyone tell me what I should change? It should be fairly easy for people
with more experience with regular expressions than me, I guess.
Thanks,
Roland
P.S. The background is my bibliography file for BibTeX. If the title field
has some content like "An analysis of Denmark", it would actually turn out
to be "An analysis of denmark" in my dvi document. Of course, R is not the
appropriate tool for this. But apart from the little problem outlined above,
I had a function doing what I wanted in less than 10 minutes.
+++++
This mail has been sent through the MPI for Demographic Rese...{{dropped}}
This appears to be a bug. Please try
gsub(pattern = " ([A-Z])", replacement = " {\\1}",
x=exampledata, perl=TRUE)
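A minimal sketch of the suggested call, assuming the line breaks inside the strings of the original post are mail-wrapping artifacts and each item is a single line. With perl=TRUE the pattern is handled by PCRE, where a range such as [A-Z] is compared by character code rather than by the locale's sort order, so only capital letters should match:

```r
exampledata <- c("This is one item", "This is Another One", "And so is This")

# perl = TRUE makes gsub() use Perl-compatible regular expressions,
# so the range A-Z does not depend on the locale's collation order
braced <- gsub(pattern = " ([A-Z])", replacement = " {\\1}",
               x = exampledata, perl = TRUE)

print(braced)
# "This is one item" "This is {A}nother {O}ne" "And so is {T}his"
```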
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Another solution, one that is also correct in locales other than C (I see
you are not in an English locale):
gsub(pattern = " ([[:upper:]])", replacement = " {\\1}",
x=exampledata)
I think this _is_ the problem, as in your locale (and in en_GB) the sort
order is probably something like
aAbB...zZ
Or just try the C locale.
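To illustrate why such a sort order would matter, here is a small sketch. The interleaved order below is only the guess made above ("aAbB...zZ"), not the actual collation table of any particular locale, and the variable names are made up for illustration. If a range like A-Z is read as "everything that sorts between A and Z", it picks up the lowercase letters b to z as well, which would be consistent with the unwanted {i}, {o} and {s} in the reported output:

```r
# Hypothetical collation order, interleaving cases: a A b B ... z Z
collation <- as.vector(rbind(letters, LETTERS))

# Everything that sorts from "A" up to "Z" in that order
in_range <- collation[match("A", collation):match("Z", collation)]

# Lowercase letters that fall inside the range: "b" ... "z", but not "a"
lower_in_range <- intersect(letters, in_range)

# The character class [[:upper:]] does not rely on a range at all
exampledata <- c("This is one item", "This is Another One", "And so is This")
fixed <- gsub(pattern = " ([[:upper:]])", replacement = " {\\1}",
              x = exampledata)
print(fixed)
# "This is one item" "This is {A}nother {O}ne" "And so is {T}his"
```

The other route mentioned above, the C locale, could be tried with Sys.setlocale("LC_COLLATE", "C"); whether that changes how ranges are matched may depend on the platform and the regular expression library in use.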
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Dear all,
thanks to the help of Prof. Brian Ripley, my little script is now working.
I now use:
gsub(pattern = " ([A-Z])", replacement = " {\\1}",
x=exampledata, perl=TRUE)
instead of:
gsub(pattern = " ([A-Z])", replacement = " {\\1}",
x=exampledata,
ignore.case=FALSE)
With regard to the question from Claus Ekstroem as to whether this is a
problem of the version of R I am running:
At work I am using R 1.8.1 on Win32; at home I tried it with R 1.9.0 on
Linux (Mandrake 10), and the results were the same.
However, the version of R does seem to be relevant. In an older version
(R 1.4.1) the "solution" did not work, because of the error "unused
argument(s) (perl ...)".
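A hedged sketch of how a script could guard against the missing argument. It assumes gsub() is an ordinary R function whose argument list can be inspected with formals(), as it is in current versions of R. The fallback to [[:upper:]] is the alternative suggested earlier in this thread and has not been checked against R 1.4.1:

```r
exampledata <- c("This is one item", "This is Another One", "And so is This")

# TRUE when the installed gsub() has a 'perl' argument
has_perl <- "perl" %in% names(formals(gsub))

braced <- if (has_perl) {
  gsub(pattern = " ([A-Z])", replacement = " {\\1}",
       x = exampledata, perl = TRUE)
} else {
  # Fallback: character class instead of a locale-sensitive range
  gsub(pattern = " ([[:upper:]])", replacement = " {\\1}",
       x = exampledata)
}

print(braced)
```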
Thanks again,
Roland
> -----Original Message-----
> From: Prof Brian Ripley [SMTP:ripley at stats.ox.ac.uk]
> Sent: Thursday, July 22, 2004 7:58 PM
> To: Rau, Roland
> Cc: 'r-help at stat.math.ethz.ch'
> Subject: Re: [R] Replace only Capital Letters