thr3ads.net - R help - [R] String manipulation with regexpr, got to be a better way [Sep 2011]

Chris Conner

2011-Sep-29 14:23 UTC

[R] String manipulation with regexpr, got to be a better way

Help-Rs,
 
I'm doing some string manipulation on a file in which I take a string date
in mm/dd/yyyy format and return just the year (yyyy).
 
I've used regexpr (hat tip to Gabor G for a very nice earlier post on this
function) in steps; I've un-nested the code and provided it, with an example
of what I did, below.  My question is: is there a more efficient way to do this?
Specifically, is there a way to use regexpr or some other string function to
return not the first instance, but the 2nd (or for that matter 3rd, 4th or 5th
instance) of a certain string?
 
# first, find the first occurrence of "/" and store its position
firstslash <- unlist(regexpr("/", dates, fixed = TRUE))

# then use firstslash to cut the string into an intermediate variable,
# e.g., from 1/1/2008 to 1/2008
step1 <- substr(dates, (firstslash + 1), nchar(dates))

# then repeat steps 1 and 2... there's got to be a better way
step2 <- unlist(regexpr("/", step1, fixed = TRUE))

# then use step2 to cut the string into the final product,
# e.g., from 1/2008 to 2008
final <- substring(step1, step2 + 1, nchar(step1))
 
Thx!
C

Jean V Adams

2011-Sep-29 15:18 UTC

# a couple example dates
dates <- c("09/10/2003", "10/22/2005")

# split the dates
dates.split <- strsplit(dates, "/")

# extract the years
sapply(dates.split, "[", 3)

Jean
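
The more general part of the question (finding the 2nd, 3rd, or n-th occurrence of a string rather than the first) is not answered directly by the split above. One possible approach is sketched here using base R's gregexpr, which returns the positions of all matches. The helper name nth_pos and the third example date are assumptions made for illustration; they do not appear in the thread.

```r
# Hypothetical helper (not from the thread): position of the n-th
# occurrence of a fixed substring in each element of x, or NA when
# there are fewer than n occurrences.
nth_pos <- function(x, pattern, n) {
  hits <- gregexpr(pattern, x, fixed = TRUE)
  vapply(hits, function(h) {
    # gregexpr reports -1 when nothing matched, and NA for NA input
    if (is.na(h[1]) || h[1] == -1L || length(h) < n) NA_integer_ else h[n]
  }, integer(1))
}

dates <- c("09/10/2003", "10/22/2005", "1/1/2008")

# position of the second "/" in each date
second_slash <- nth_pos(dates, "/", 2)

# everything after the second "/" is the year
years <- substr(dates, second_slash + 1, nchar(dates))
print(years)
```

For plain mm/dd/yyyy dates the strsplit approach is probably simpler; a helper like this is mainly useful when the position itself is needed.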

Eik Vettorazzi

2011-Sep-30 08:45 UTC

Hi Chris,
why not use the routines for dates?
dates <- c("09/10/2003", "10/22/2005")
format(strptime(dates,format="%m/%d/%Y"),"%Y")

Or just take the last 4 characters of each date:
gsub(".*([0-9]{4})$","\\1",dates)

cheers
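
The two suggestions above behave differently on input that is not a well-formed date, which may matter when cleaning a real file. The sketch below compares them; the third element of dates is a made-up bad value added for illustration, and the variable names are assumptions rather than anything from the thread.

```r
dates <- c("09/10/2003", "10/22/2005", "not a date")

# Date-parsing route: input that does not fit the format becomes NA.
year_parsed <- format(strptime(dates, format = "%m/%d/%Y"), "%Y")

# Regex route: gsub returns a non-matching string unchanged, so check
# for a match first and fill in NA explicitly where there is none.
has_year <- grepl("[0-9]{4}$", dates)
year_regex <- ifelse(has_year,
                     gsub(".*([0-9]{4})$", "\\1", dates),
                     NA_character_)

print(year_parsed)
print(year_regex)
```

Without the grepl guard, the regex route would pass the malformed string through as if it were a year.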


-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

