Dear r-help members,

I have a number in the form of a string, say:

a<-"-01020.909200"

I'd like to extract "1020." as well as ".9092"

Front<-grep(pattern="[1-9]+[0-9]*\\.", value=TRUE, x=a, fixed=FALSE)
End<-grep(pattern="\\.[0-9]*[1-9]+", value=TRUE, x=a, fixed=FALSE)

However, both strings give "-01020.909200", exactly a.
Could you please point me to what is wrong?

Thanks and best regards
H. van Lishaut
grep() returns the elements that contain a match, not the matched substrings. You want regexpr() and regmatches().

-- Bert

On Tue, Aug 21, 2012 at 12:24 PM, Dr. Holger van Lishaut <H.v.Lishaut at gmx.de> wrote:
> However, both strings give "-01020.909200", exactly a.
> Could you please point me to what is wrong?

--
Bert Gunter
Genentech Nonclinical Biostatistics
'grep' does not change strings. Use 'gsub' or
'regmatches':
# gsub
Front <- gsub("^.*?([1-9][0-9]*\\.).*?$", "\\1", a)
End <- gsub("^.*?(\\.[0-9]*[1-9]).*?$", "\\1", a)
# regexpr and regmatches (R >= 2.14.0)
Front <- regmatches(a, regexpr("[1-9][0-9]*\\.", a))
End <- regmatches(a, regexpr("\\.[0-9]*[1-9]", a))
Front
## [1] "1020."
End
## [1] ".9092"
--
Noia Raindrops
noia.raindrops at gmail.com
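
The two approaches above behave differently when an element contains no match. A minimal sketch, assuming a made-up two-element vector `b` (not from the thread) and adding `perl = TRUE` so the lazy quantifiers follow PCRE semantics: `gsub()` hands a non-matching element back unchanged, whereas `regmatches()` drops it, so its result can be shorter than the input.

```r
# Hypothetical input: the second element has no decimal point.
b <- c("-01020.909200", "42")

# No match in "42", so gsub() returns that element untouched.
# perl = TRUE is an assumption added here, not part of the original answer.
front_gsub <- gsub("^.*?([1-9][0-9]*\\.).*?$", "\\1", b, perl = TRUE)

# regmatches() keeps only the elements where regexpr() found a match.
front_rm <- regmatches(b, regexpr("[1-9][0-9]*\\.", b))

print(front_gsub)  # "1020." "42"
print(front_rm)    # "1020."
```

If the result has to stay aligned with the input, the length difference is worth checking before assigning back into a data frame column.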
You're misreading the docs: from ?grep,
value: if 'FALSE', a vector containing the ('integer') indices of
the matches determined by 'grep' is returned, and if 'TRUE',
a vector containing the matching elements themselves is
returned.
Since there's a match somewhere in a[1], all of a[1] is returned (it
is a matching element), not just the matching bit: grep(pattern, x, value = TRUE)
is something like x[grepl(pattern, x)] to my mind.
I think you want ?regexpr, or possibly just substitute out the
non-matching part with gsub.
Cheers,
Michael
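
The distinction can be checked directly. A minimal sketch, assuming a small made-up vector `x` (only its first element comes from the thread): `grep(value = TRUE)` selects whole elements, the same ones that `x[grepl(...)]` picks, while `regmatches()` combined with `regexpr()` returns only the matched substrings.

```r
# Hypothetical input vector; only the first element is from the thread.
x <- c("-01020.909200", "abc", "7.5")
pat <- "[1-9][0-9]*\\."

# grep(value = TRUE) returns every element that contains a match, unchanged.
whole <- grep(pat, x, value = TRUE)

# Logical subsetting with grepl() selects the same elements.
same <- x[grepl(pat, x)]

# regexpr() locates the first match in each element;
# regmatches() extracts just that substring and drops non-matching elements.
parts <- regmatches(x, regexpr(pat, x))

print(whole)  # "-01020.909200" "7.5"
print(parts)  # "1020." "7."
```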
Hi,
Try this:
gsub("^-\\d(\\d{4}.).*","\\1",a)
#[1] "1020."
gsub("^.*(.\\d{5}).","\\1",a)
#[1] ".90920"
A.K.
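
The patterns above hard-code the number of digits, leave the dot unescaped, and the second one keeps a trailing zero (".90920" rather than the requested ".9092"). A sketch of a variant that does not depend on digit counts, assuming the input has an optional minus sign and exactly one decimal point; the patterns are illustrative, not the poster's:

```r
a <- "-01020.909200"

# Skip an optional sign and leading zeros; the group captures the
# integer digits together with the decimal point.
front <- sub("^-?0*([0-9]+\\.).*$", "\\1", a)

# The group starts at the escaped decimal point and ends at the last
# non-zero digit; trailing zeros fall outside the group.
end <- sub("^[^.]*(\\.[0-9]*[1-9])0*$", "\\1", a)

print(front)  # "1020."
print(end)    # ".9092"
```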
Dr. Holger van Lishaut
2012-Aug-22 19:46 UTC
[R] Regular Expressions in grep - Solution and function to determine significant figures of a number
Dear all,
regmatches works.
And, since this has been asked here before:
SignifStellen <- function(x) {
  strx <- as.character(x)
  nchar(regmatches(strx,
                   regexpr("[1-9][0-9]*\\.[0-9]*[1-9]", strx))) - 1
}
returns the significant figures of a number. Perhaps this can help someone.
Thanks & best regards
H. van Lishaut
Bert Gunter
2012-Aug-22 19:53 UTC
[R] Regular Expressions in grep - Solution and function to determine significant figures of a number
On Wed, Aug 22, 2012 at 12:46 PM, Dr. Holger van Lishaut <H.v.Lishaut at gmx.de> wrote:
> Dear all,
>
> regmatches works.
>
> And, since this has been asked here before:
>
> SignifStellen<-function(x){
> strx=as.character(x)
> nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1
> }
>
> returns the significant figures of a number. Perhaps this can help someone.

except that ?signif already does this, no?

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
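
For comparison, base R's `signif()` rounds a value to a requested number of significant digits; it does not report how many significant digits a value has. A short sketch using the thread's number as a numeric:

```r
x <- -1020.9092

# signif() rounds to the given number of significant digits.
r3 <- signif(x, 3)
r6 <- signif(x, 6)

print(r3)  # -1020
print(r6)  # -1020.91
```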
Dr. Holger van Lishaut
2012-Aug-23 18:43 UTC
[R] Regular Expressions in grep - Solution and function to determine significant figures of a number
On 22.08.2012 at 21:46, Dr. Holger van Lishaut <H.v.Lishaut at gmx.de> wrote:
> SignifStellen<-function(x){
> strx=as.character(x)
> nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1
> }
>
> returns the significant figures of a number. Perhaps this can help
> someone.

Sorry, to work, it must read:

SignifStellen<-function(x){
  strx=as.character(x)
  intFront <- nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.", strx)))
  intEnd <- nchar(regmatches(strx, regexpr("\\.[0-9]*[1-9]", strx)))
  intFront+intEnd-2
}

Best regards
H. van Lishaut
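
A quick check of the corrected function, as a sketch. The function is reproduced so the block runs on its own; the test inputs are assumptions chosen for illustration. Passing a string rather than a numeric controls exactly what the pattern sees, since `as.character()` on a numeric drops trailing zeros.

```r
# Corrected function from the message above, reproduced verbatim in behavior.
SignifStellen <- function(x) {
  strx <- as.character(x)
  intFront <- nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.", strx)))
  intEnd   <- nchar(regmatches(strx, regexpr("\\.[0-9]*[1-9]", strx)))
  # The decimal point is counted in both pieces, hence the -2.
  intFront + intEnd - 2
}

n1 <- SignifStellen("-01020.909200")  # 8
n2 <- SignifStellen(12.5)             # 3

# Without a decimal point regexpr() finds no match, so regmatches()
# returns nothing and the result has length zero instead of a count.
n3 <- SignifStellen("42")

print(n1)
print(n2)
print(length(n3))  # 0
```

Callers that need one value per input may want to test for a zero-length result before using it.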