Dear list,

Say I have a vector that contains two different types of string:

test <- c('aaa.bb.cc','aaa.dd')

I want to extract the first part of each string (aaa) as one name and save the rest of the string as another name.

I was thinking of something like

sub('(.*)\\.(.*)','\\1',test)

but it doesn't give me what I want.

I'd appreciate any comments. Thanks.

Jun
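For reference, a small sketch of why the attempt above misses: the first `(.*)` is greedy, so it runs up to the last period rather than the first. The lazy variant is only an illustration, not something proposed in the original post, and it passes `perl = TRUE` so that the `*?` quantifier follows PCRE semantics.

```r
test <- c('aaa.bb.cc', 'aaa.dd')

# Greedy: the first (.*) grabs as much as it can, so the literal period
# in the pattern ends up matching the LAST period of each string.
greedy_first <- sub('(.*)\\.(.*)', '\\1', test)
print(greedy_first)
# [1] "aaa.bb" "aaa"

# Lazy: (.*?) stops at the FIRST period; the second group takes the rest.
lazy_first <- sub('(.*?)\\.(.*)', '\\1', test, perl = TRUE)
lazy_rest  <- sub('(.*?)\\.(.*)', '\\2', test, perl = TRUE)
print(lazy_first)
# [1] "aaa" "aaa"
print(lazy_rest)
# [1] "bb.cc" "dd"
```

Both patterns leave a string unchanged when it contains no period at all, because `sub()` returns the input as-is when nothing matches.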
> On Oct 23, 2015, at 2:17 PM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>
> Dear list,
>
> Say I have a vector that has two different types of string
>
> test <- c('aaa.bb.cc','aaa.dd')
>
> I want to extract the first part of the string (aaa) as a name and save the
> rest of the string as another name.
>
> I was thinking something like
>
> sub('(.*)\\.(.*)','\\1',test) but doesn't give me what I want.
>
> Appreciate any comments. Thanks.
>
> Jun

How about something like this, which presumes that the part before the first period consists only of letters:

> gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"

or

> sub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"

The above takes the two components, before and after the first '.', and inserts "|" between them as a separator, which strsplit() can then split on:

> strsplit(gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test), split = "\\|")
[[1]]
[1] "aaa"   "bb.cc"

[[2]]
[1] "aaa" "dd"

See ?regex

Regards,

Marc Schwartz
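As a possible follow-up to the strsplit() step, here is a minimal sketch of pulling the list into two separate character vectors. The names `parts`, `first` and `rest` are made up for illustration. It assumes that every element contains at least one period, starts with letters, and never contains "|" itself.

```r
test <- c('aaa.bb.cc', 'aaa.dd')

# Mark the first period with "|" and split on it, as in the reply above.
# fixed = TRUE treats "|" as a literal character, so no escaping is needed.
marked <- sub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
parts  <- strsplit(marked, split = "|", fixed = TRUE)

# Pick the 1st and 2nd piece of every list element.
first <- vapply(parts, `[`, character(1), 1)
rest  <- vapply(parts, `[`, character(1), 2)

print(first)
# [1] "aaa" "aaa"
print(rest)
# [1] "bb.cc" "dd"
```

vapply() is used instead of sapply() so that the result is guaranteed to be a character vector with one entry per input string.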
> test <- c('aaa.bb.cc','aaa.dd', 'aaa', 'aaa.', '.eee', '')
> sub("([^.]*)(.*)", "\\1", test)
[1] "aaa" "aaa" "aaa" "aaa" ""    ""
> sub("([^.]*)(.*)", "\\2", test)
[1] ".bb.cc" ".dd"    ""       "."      ".eee"   ""

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Oct 23, 2015 at 12:17 PM, Jun Shen <jun.shen.ut at gmail.com> wrote:
> Dear list,
>
> Say I have a vector that has two different types of string
>
> test <- c('aaa.bb.cc','aaa.dd')
>
> I want to extract the first part of the string (aaa) as a name and save the
> rest of the string as another name.
>
> I was thinking something like
>
> sub('(.*)\\.(.*)','\\1',test) but doesn't give me what I want.
>
> Appreciate any comments. Thanks.
>
> Jun
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
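Note that in the output above the second piece keeps its leading period. If that period is not wanted, one possible variation (a sketch, not part of the original reply; `first` and `rest` are illustrative names) is to delete everything from the first period onward for the first piece, and everything up to and including the first period for the rest:

```r
test <- c('aaa.bb.cc', 'aaa.dd', 'aaa', 'aaa.', '.eee', '')

# First piece: drop the first period and everything after it.
first <- sub("\\..*$", "", test)

# Rest: drop the leading run of non-period characters plus one optional period.
rest <- sub("^[^.]*\\.?", "", test)

print(first)
# [1] "aaa" "aaa" "aaa" "aaa" ""    ""
print(rest)
# [1] "bb.cc" "dd"    ""      ""      "eee"   ""
```

With this variation 'aaa' and 'aaa.' both give an empty rest, so it can no longer be told from the result whether the input had a trailing period.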
Hi Marc/Bill,

Thanks for the replies. That's exactly what I am looking for.

Jun

On Fri, Oct 23, 2015 at 3:53 PM, William Dunlap <wdunlap at tibco.com> wrote:
> > test <- c('aaa.bb.cc','aaa.dd', 'aaa', 'aaa.', '.eee', '')
> > sub("([^.]*)(.*)", "\\1", test)
> [1] "aaa" "aaa" "aaa" "aaa" ""    ""
> > sub("([^.]*)(.*)", "\\2", test)
> [1] ".bb.cc" ".dd"    ""       "."      ".eee"   ""
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com