Chris Conner
2011-Sep-29 14:23 UTC
[R] String manipulation with regexpr, got to be a better way
Help-Rs,

I'm doing some string manipulation in a file where I take a string date in mm/dd/yyyy format and return the year (yyyy).

I've used regexpr (hat tip to Gabor G for a very nice earlier post on this function) in steps; I've un-nested the code and provided it, with an example of what I did, below. My question is: is there a more efficient way to do this? Specifically, is there a way to use regexpr or some other string function to return not the first instance, but the 2nd (or, for that matter, the 3rd, 4th or 5th instance) of a certain string?

# first find the first occurrence of "/" and create a variable for this
firstslash <- unlist(regexpr("/", dates, fixed = TRUE))
# then use firstslash to cut the string field into an intermediate variable, e.g., from 1/1/2008 to 1/2008
step1 <- substr(dates, firstslash + 1, nchar(dates))
# then repeat steps 1 and 2... there's got to be a better way
step2 <- unlist(regexpr("/", step1, fixed = TRUE))
# then use step2 to cut the string into the final product, e.g., from 1/2008 to 2008
final <- substring(step1, step2 + 1, nchar(step1))

Thx!
C
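The general question above (locating the 2nd, 3rd, or nth occurrence rather than the first) can be sketched with gregexpr, which returns every match position instead of only the first. The helper name nth_match_pos and the sample dates are assumptions made for illustration; they do not come from the thread.

```r
# Hypothetical helper (not from the thread): position of the n-th match of a
# fixed substring in each element of x, or NA when there are fewer than n matches.
nth_match_pos <- function(x, pattern, n) {
  matches <- gregexpr(pattern, x, fixed = TRUE)
  vapply(matches, function(m) {
    # gregexpr reports -1 when the pattern does not occur at all
    if (m[1] == -1 || length(m) < n) NA_integer_ else m[n]
  }, integer(1))
}

dates <- c("1/1/2008", "09/10/2003")

# position of the second "/" in each date
second_slash <- nth_match_pos(dates, "/", 2)

# everything after the second "/" is the year
years <- substring(dates, second_slash + 1)
years
```

This replaces the repeated regexpr/substr steps with a single lookup, and the same helper works for any n.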
Jean V Adams
2011-Sep-29 15:18 UTC
[R] String manipulation with regexpr, got to be a better way
# a couple of example dates
dates <- c("09/10/2003", "10/22/2005")
# split the dates
dates.split <- strsplit(dates, "/")
# extract the years
sapply(dates.split, "[", 3)

Jean
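As a small variation on the strsplit approach in this reply, the sketch below uses vapply so that each extracted field is checked to be a single string, and shows how the split pieces could be bound into a matrix to reach any field by column. The variable names parts, years and fields are illustrative assumptions, and the matrix step assumes every date splits into the same number of pieces.

```r
dates <- c("09/10/2003", "10/22/2005")

# split each date on the literal "/" (fixed = TRUE avoids regex interpretation)
parts <- strsplit(dates, "/", fixed = TRUE)

# take the third field of each date; vapply enforces one string per date
years <- vapply(parts, function(p) p[3], character(1))
years
# [1] "2003" "2005"

# one row per date, one column per field (assumes equal field counts)
fields <- do.call(rbind, parts)
fields[, 1]  # months
fields[, 3]  # years
```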
Eik Vettorazzi
2011-Sep-30 08:45 UTC
[R] String manipulation with regexpr, got to be a better way
Hi Chris,

why not use the routines for dates?

dates <- c("09/10/2003", "10/22/2005")
format(strptime(dates, format = "%m/%d/%Y"), "%Y")

Or take just the last 4 characters from dates:

gsub(".*([0-9]{4})$", "\\1", dates)

Cheers

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
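The two suggestions in this reply behave differently on malformed input, which the following sketch tries to illustrate. It uses as.Date rather than strptime purely for brevity, and the bad_date value is a made-up example, not data from the thread.

```r
dates <- c("09/10/2003", "10/22/2005")

# date route: parse first, then format the year
parsed <- as.Date(dates, format = "%m/%d/%Y")
years_chr <- format(parsed, "%Y")
years_int <- as.integer(years_chr)

# regex route: keep only the trailing four digits
years_rx <- sub(".*([0-9]{4})$", "\\1", dates)

# a deliberately invalid date (hypothetical input)
bad_date <- "13/45/2003"
bad_parsed <- as.Date(bad_date, format = "%m/%d/%Y")  # NA: month 13 is rejected
bad_rx <- sub(".*([0-9]{4})$", "\\1", bad_date)       # "2003": no validation
```

Parsing as a date validates the month and day along the way, while the regex simply returns whatever four digits end the string, so the choice depends on whether invalid dates should surface as NA.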