thr3ads.net - R help - [R] strsplit and regex [Oct 2008]


Redding, Matthew

2008-Oct-15 21:54 UTC

[R] strsplit and regex

Hi All, 

Is there a means to extract the "10" from "23:10:34" in one pass using
strsplit (or something else)?
tst <- "23:10:34"

For example my attempt
strsplit(as.character(tst),"^[0-9]*:")
gives
[[1]]
[1] ""   ""   "34"

Obviously it is matching the first two instances of [0-9]. Note that
there may be only one digit before the first ":".

How do I anchor the match to the beginning or, better still, just
extract the number I want in one pass?

I can see that I can add "begin" to the beginning of the string,
match that, and do something similar at the end, getting rid of empty
strings etc., but I think it would take about 3 passes, and the files are
large. Besides, that code would be unlovely.

Kind regards,


Matt Redding
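
An aside on why the attempt returns two empty strings, offered as a hedged sketch rather than as part of the original post: the R documentation for strsplit describes the split as repeatedly matching the pattern against what remains of the string, so a "^" anchor can match again once the first piece has been removed. The sub() calls below only imitate that loop by hand for illustration; the names rest1 and rest2 are made up for the example.

```r
tst <- "23:10:34"

# The attempt from the post: the anchored pattern ends up matching twice.
res <- strsplit(tst, "^[0-9]*:")[[1]]
print(res)

# Imitating the documented loop by hand (illustration only):
# strip one anchored match, then apply the same pattern to the remainder.
rest1 <- sub("^[0-9]*:", "", tst)    # what is left after the first match
rest2 <- sub("^[0-9]*:", "", rest1)  # "^" matches again at the new start
print(c(rest1, rest2))
```

Each stripped match leaves nothing to its left, which is where the two empty strings come from.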

Redding, Matthew

2008-Oct-15 22:13 UTC

Hi All, 

Just to make that question a bit harder: how do I apply that string
extraction to a vector of these time strings?

Thanks,

Matt Redding

Erik Iverson

2008-Oct-15 22:23 UTC

Matthew -

Redding, Matthew wrote:
> Hi All,
>
> Is there a means to extract the "10" from "23:10:34" in one pass using
> strsplit (or something else)?
> tst <- "23:10:34"
>
> For example my attempt
> strsplit(as.character(tst),"^[0-9]*:")
> gives
> [[1]]
> [1] ""   ""   "34"

Why not simply,

strsplit(tst, ":")

at which point you can subscript to what you want?

To apply this to a vector of length n:

tst2 <- c("23:10:34", "12:08:04", "1:02:03")

strsplit(tst2, ":")

And to extract the second item of each element of the resulting list,

sapply(strsplit(tst2, ":"), "[", 2)

Does this help?

Erik
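
A small sketch of the subscripting step described above, assuming every string has at least two ":"-separated fields. vapply() is used here in place of sapply() purely as a stylistic choice (it checks that each result is a single string), and fixed = TRUE is an optional addition that treats ":" as a literal string; neither is from the reply itself. The names one and minutes are made up for the example.

```r
tst  <- "23:10:34"
tst2 <- c("23:10:34", "12:08:04", "1:02:03")

# Single string: take the first list element, then its second field.
one <- strsplit(tst, ":", fixed = TRUE)[[1]][2]

# Vector: the same subscript applied to each element of the list.
# character(1) tells vapply that every result must be one string.
minutes <- vapply(strsplit(tst2, ":", fixed = TRUE), `[`, character(1), 2)

print(one)
print(minutes)
```

The result is a character vector; wrap it in as.integer() if numbers are wanted.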

Henrique Dallazuanna

2008-Oct-15 22:35 UTC

Try this:

format(strptime(tst, "%H:%M:%S"), "%M")
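
The same call is vectorised, so it can be applied to a whole vector of time strings. The sketch below assumes the strings are valid clock times; strptime returns NA for input it cannot parse. The tz = "UTC" argument is an addition not in the reply, used here only to keep local daylight-saving rules out of the picture, and the variable names are made up for the example.

```r
tst2 <- c("23:10:34", "12:08:04", "1:02:03")

# Parse as times; tz = "UTC" is an assumption added for this sketch.
parsed <- strptime(tst2, "%H:%M:%S", tz = "UTC")

# Minutes as zero-padded character strings, as in the reply above.
minutes_chr <- format(parsed, "%M")

# strptime returns a POSIXlt object, which stores the minute field directly.
minutes_num <- as.integer(parsed$min)

print(minutes_chr)
print(minutes_num)
```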




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Gabor Grothendieck

2008-Oct-15 22:37 UTC

Here are several solutions:

#1
# replace first three and last three characters with nothing
# (this assumes a two-digit hour; "1:02:03" would not give "02")
x <- c("23:10:34", "01:02:03")
gsub("^...|...$", "", x)

#2
# backref = -1 says extract the portion in parens
# it would have returned a list so we use simplify = c
library(gsubfn)
strapply(x, ":(..):", backref = -1, simplify = c)

#3
# convert to POSIXct and use POSIXct's format
library(chron)
format(as.POSIXct(as.chron(times(x))), "%M")

#4
# chron times are fractions of a day
# this gives numeric result whereas others give character
library(chron)
floor(24 * 60 * as.numeric(times(x))) %% 60
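
For strings where the hour may be a single digit, as the original question allows, a base-R alternative is sub() with a capture group. This is a sketch that is not from the thread: the pattern and the names pat and mins are assumptions for illustration, and a string that does not match the pattern is returned unchanged by sub().

```r
x <- c("23:10:34", "01:02:03", "1:02:03")

# One or two leading digits, a colon, the captured two-digit minutes,
# another colon, then whatever follows.
pat  <- "^[0-9]{1,2}:([0-9]{2}):.*$"
mins <- sub(pat, "\\1", x)
print(mins)

# Solution #1 above, for comparison, on the single-digit-hour string:
# it removes three characters from each end regardless of where ":" falls.
fixed_width <- gsub("^...|...$", "", "1:02:03")
print(fixed_width)
```

Because the whole string is consumed by the pattern and replaced with the captured group, this does the extraction in a single call over the vector.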


