Chris Conner
2011-Sep-29 14:23 UTC
[R] String manipulation with regexpr, got to be a better way
Help-Rs,
I'm doing some string manipulation on a file: I take a date stored as a string
in mm/dd/yyyy format and return just the year (yyyy).
I've used regexpr (hat tip to Gabor G for a very nice earlier post on this
function) in steps (I've un-nested the code and provided it, with an example
of what I did, below). My question is: is there a more efficient way to do this?
Specifically, is there a way to use regexpr or some other string function to
return not the first instance, but the 2nd (or for that matter 3rd, 4th or 5th
instance) of a certain string?
# first find the first occurrence of "/" and create a variable for this
firstslash <- unlist(regexpr("/", dates, fixed = TRUE))
# then use firstslash to cut the string field into an intermediate variable,
# e.g., from 1/1/2008 to 1/2008
step1 <- substr(dates, firstslash + 1, nchar(dates))
# then repeat steps 1 and 2...there's got to be a better way
step2 <- unlist(regexpr("/", step1, fixed = TRUE))
# then use step2 to cut the string into the final product,
# e.g., from 1/2008 to 2008
final <- substring(step1, step2 + 1, nchar(step1))
Thx!
C
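On the specific question of finding the 2nd (or n-th) occurrence rather than the first: one possible approach, not taken from the replies in this thread, is gregexpr, which returns the positions of all matches in each string. The sketch below wraps that in a helper; the name nth_pos and the sample dates are made up for illustration.

```r
# Hypothetical helper (not from the thread): position of the n-th occurrence
# of a fixed string in each element of x, or NA if there are fewer than n.
nth_pos <- function(x, pattern, n) {
  hits <- gregexpr(pattern, x, fixed = TRUE)  # all match positions per element
  vapply(hits, function(h) {
    # gregexpr reports "no match" as -1 and missing input as NA
    if (is.na(h[1]) || h[1] == -1 || length(h) < n) NA_integer_ else as.integer(h[n])
  }, integer(1))
}

dates <- c("1/1/2008", "09/10/2003")         # illustrative input
second_slash <- nth_pos(dates, "/", 2)       # position of the 2nd "/"
years <- substring(dates, second_slash + 1)  # everything after it
```

Because the helper is vectorised over x, it replaces the repeated regexpr/substr steps with a single call, and the same function gives the 3rd or 4th occurrence by changing n.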
Jean V Adams
2011-Sep-29 15:18 UTC
[R] String manipulation with regexpr, got to be a better way
# a couple example dates
dates <- c("09/10/2003", "10/22/2005")
# split the dates
dates.split <- strsplit(dates, "/")
# extract the years
sapply(dates.split, "[", 3)

Jean
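The strsplit approach also covers the "n-th instance" part of the question, since the n-th field is simply the n-th element of each split. A minimal sketch, assuming some inputs may have fewer fields than expected (the third sample value is invented to show that case); vapply is used here instead of sapply only so that the result is guaranteed to be a character vector.

```r
dates <- c("09/10/2003", "10/22/2005", "2007")  # last value is deliberately malformed
parts <- strsplit(dates, "/", fixed = TRUE)

# Indexing past the end of a character vector yields NA rather than an error,
# so short inputs produce NA for the missing field.
years  <- vapply(parts, function(p) p[3], character(1))
months <- vapply(parts, `[`, character(1), 1)
```

Changing the index (3, 1, ...) selects a different field without any further position arithmetic.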
Eik Vettorazzi
2011-Sep-30 08:45 UTC
[R] String manipulation with regexpr, got to be a better way
Hi Chris,
why not use the routines for dates?
dates <- c("09/10/2003", "10/22/2005")
format(strptime(dates,format="%m/%d/%Y"),"%Y")
or just take the last 4 characters of each date:
gsub(".*([0-9]{4})$","\\1",dates)
cheers
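The two suggestions above behave differently on input that is not a valid date, which may matter when choosing between them. A small sketch, assuming an invented malformed value "n/a" in the data: strptime yields NA for it, while gsub returns the string unchanged when the pattern does not match. The guarded variant at the end is one possible way to get NA from the regex route as well.

```r
dates <- c("09/10/2003", "10/22/2005", "n/a")  # "n/a" is an illustrative bad value

# Date parsing: unparseable strings become NA
by_date <- format(strptime(dates, format = "%m/%d/%Y"), "%Y")

# Regex: strings that do not end in four digits are returned as-is
by_regex <- gsub(".*([0-9]{4})$", "\\1", dates)

# Guarded regex: only extract where the pattern actually matches
ok <- grepl("[0-9]{4}$", dates)
by_regex_safe <- ifelse(ok, sub(".*([0-9]{4})$", "\\1", dates), NA_character_)
```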
--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790