I am lousy at simple regex and I have not found a solution to a simple problem.

I have a vector with some character values that I want to split.

Sample data

    dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")

Desired result

    dd2 <- data.frame(xx = c("XXY", "XXY", "CAMP", "ALP"),
                      yy = c("mat harry", "jim bob", "joe blow", "max jack"))

I thought I should be able to split the characters with strsplit, but either I am misunderstanding the function or I don't know how to escape a "(" properly in an effort to at least get "XXY" "(mat harry)".

Any pointers would be appreciated.

Thanks
John Kane
Kingston ON Canada
Hello,

Try the following.

    open.par <- " \\("  # with a blank before '('
    close.par <- "\\)"
    result <- strsplit(sub(close.par, "", dd1), open.par)

Why the two '\\'? Because '(' is a meta-character, so it must be escaped. But '\' is itself an escape character in an R string, so it must also be escaped.

Then choose the right way to separate the two, maybe something like

    ix <- rep(c(TRUE, FALSE), length(result))
    unlist(result)[ix]   # the part before the parenthesis
    unlist(result)[!ix]  # the part inside the parentheses

Hope this helps,

Rui Barradas

On 07-07-2012 22:37, John Kane wrote:
> I am lousy at simple regex and I have not found a solution to a simple problem.
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
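One way to get from the strsplit() result to the requested data frame is sketched here. It assumes that every element of dd1 splits into exactly two pieces, as the sample data does. The name parts and the use of do.call(rbind, ...) are illustrative choices, not something proposed in the thread; open.par and close.par are taken from the reply above.

```r
# Sample data from the original post
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")

open.par  <- " \\("   # regex: a blank followed by a literal '('
close.par <- "\\)"    # regex: a literal ')'

# Remove the ')' first, then split each string at " ("
result <- strsplit(sub(close.par, "", dd1), open.par)

# Stack the two-element pieces into a matrix, one row per input string
# (assumption: each element of 'result' has length 2)
parts <- do.call(rbind, result)

dd2 <- data.frame(xx = parts[, 1], yy = parts[, 2], stringsAsFactors = FALSE)
print(dd2)
```

If some strings have no parenthesised part, the pieces will have unequal lengths and rbind() will recycle values, so it may be worth checking lengths(result) first.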
Thanks Rui

It works perfectly so far on the test and real data.

The annoying thing is that I had tried, or thought I'd tried, the open.par format and kept getting an error.

It looks like I had failed to add the '''', in the term. What is it doing?

John Kane
Kingston ON Canada

> -----Original Message-----
> From: ruipbarradas at sapo.pt
> Sent: Sat, 07 Jul 2012 22:55:41 +0100
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> Hello,
>
> Try the following.
> [...]
Oh, right!

The close parenthesis isn't doing anything useful in the result. It could be removed afterwards, but since we're at it...

Rui Barradas

On 07-07-2012 23:10, Mark Leeds wrote:
> Hi Rui: I think he's asking about your replacement with blanks.
>
> On Sat, Jul 7, 2012 at 6:08 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
>> Hello,
>>
>> Sorry, but I don't understand. You're asking about 4 single quotes; the double quotes in open.par are just opening and closing the pattern, a character string.
>>
>> Rui Barradas
No, sorry Rui.

In the expression

    result <- strsplit(sub(close.par, "", dd1), open.par)

there is close.par, "", open.par.

I am probably just blind, but I don't understand what it is doing.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: ruipbarradas at sapo.pt
> Sent: Sat, 07 Jul 2012 23:08:19 +0100
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> Hello,
>
> Sorry, but I don't understand. You're asking about 4 single quotes; the double quotes in open.par are just opening and closing the pattern, a character string.
>
> Rui Barradas
Thanks Jeff.

I actually had that figured out after a good hour of pounding my head against the wall, but I still could not seem to get the syntax correct. I think I misunderstand strsplit() just enough to keep making dumb mistakes.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: jdnewmil at dcn.davis.ca.us
> Sent: Sat, 07 Jul 2012 15:12:16 -0700
> To: ruipbarradas at sapo.pt, jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> Just to clarify, the regex engine wants to see a \ before the ( if it is to treat it as an ordinary character. However, the source code interpreter also treats \ as an escape character. In order to get a \ into the string, you have to escape it. So it takes two \ characters in source code to obtain one \ character in memory where the regex code can "see" it.
>
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
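Jeff's point about the two layers of escaping can be checked directly in R. The following is a small sketch; the variable names (pat, parts, parts_fixed) are made up for illustration, and the fixed = TRUE call is an alternative that was not suggested in the thread.

```r
pat <- "\\("   # typed with two backslashes in source code

# Only two characters are stored in memory: one backslash and one '('
n <- nchar(pat)
print(n)

# cat() shows the string as the regex engine receives it: \( followed by a newline
cat(pat, "\n")

# With the escape, '(' is matched as an ordinary character
parts <- strsplit("XXY (mat harry)", pat)[[1]]
print(parts)

# Alternative: fixed = TRUE switches off regex interpretation,
# so the '(' needs no escaping at all
parts_fixed <- strsplit("XXY (mat harry)", "(", fixed = TRUE)[[1]]
print(parts_fixed)
```

Both calls split at the parenthesis only, so the blank before it stays attached to the first piece.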
Ah, I think Mark may have it. See my earlier post. Why the space?

John Kane
Kingston ON Canada

> -----Original Message-----
> From: ruipbarradas at sapo.pt
> Sent: Sat, 07 Jul 2012 23:12:46 +0100
> To: markleeds2 at gmail.com
> Subject: Re: [R] Splitting a character vector.
>
> Oh, right!
> [...]
How totally obvious once you tell me! I would have spent days trying to figure it out. I think I have a total mental block on regex and their derivatives.

Thanks very much.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: ruipbarradas at sapo.pt
> Sent: Sat, 07 Jul 2012 23:21:19 +0100
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> It's an empty character string, meant to substitute nothing for close.par, to get rid of it.
>
> Rui Barradas
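The effect of the empty replacement string can be seen by running the inner sub() call on its own. This is a minimal sketch; no_close, first_only and all_of_them are illustrative names, and the comparison with gsub() is an addition that does not come from the thread.

```r
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")
close.par <- "\\)"

# Replacing the match with "" deletes it: the ')' simply disappears
no_close <- sub(close.par, "", dd1)
print(no_close)

# sub() only touches the first match in each string; gsub() replaces every match
first_only  <- sub(close.par, "", "a) b)")
all_of_them <- gsub(close.par, "", "a) b)")
print(c(first_only, all_of_them))
```

For the sample data each string has a single ')', so sub() and gsub() would give the same result here.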
I think I'm getting it a bit. Anyway, time to shut down and have a beer. Life will be much nicer tomorrow or Monday when I get back to cleaning up the data from that spreadsheet.

Many thanks and have a good weekend.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: ruipbarradas at sapo.pt
> Sent: Sat, 07 Jul 2012 23:28:26 +0100
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> The space is there for a different reason: strsplit doesn't put the split pattern in the result, so if a space is included in the pattern it will be automatically deleted. For instance, "XXY (mat harry)" split without the space would become "XXY " and "mat harry)", but we want "XXY", so include the space in the pattern.
>
> Another example, this one artificial:
>
> "123AB456" ---> "123" and "456"
>
> strsplit("123AB456", "B") ---> "123A" and "456"
>
> So include the "A" in the pattern. It's _exactly_ the same thing.
>
> Rui Barradas
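Rui's two examples can be run side by side to see that whatever the pattern matches is dropped from the result. This is a short sketch using the strings from his message; the variable names are illustrative only.

```r
x <- "XXY (mat harry"   # one element of dd1 after the ')' has been removed

# Pattern without the blank: the blank stays attached to the first piece
without_blank <- strsplit(x, "\\(")[[1]]
print(without_blank)

# Pattern with the blank: the blank is part of the separator and is dropped
with_blank <- strsplit(x, " \\(")[[1]]
print(with_blank)

# The artificial example: splitting on "B" keeps the "A", splitting on "AB" does not
split_b  <- strsplit("123AB456", "B")[[1]]
split_ab <- strsplit("123AB456", "AB")[[1]]
print(split_b)
print(split_ab)
```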
On Jul 7, 2012, at 5:37 PM, John Kane wrote:

> I am lousy at simple regex and I have not found a solution to a simple problem.
>
> I have a vector with some character values that I want to split.
> Sample data
> dd1 <- c( "XXY (mat harry)","XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")
>
> Desired result
> dd2 <- data.frame( xx = c("XXY", "XXY", "CAMP", "ALP"), yy = c("mat harry", "jim bob" , "joe blow", "max jack"))

    data.frame(xx = sub("(\\s\\(.+$)", "", dd1),
               yy = sub("(.+)(\\s\\()(.+)(\\)$)", "\\3", dd1))
        xx        yy
    1  XXY mat harry
    2  XXY   jim bob
    3 CAMP  joe blow
    4  ALP  max jack

> I thought I should be able to split the characters with strsplit but either I am misunderstanding the function or don't know how to escape a "(" properly in an effort to at least get "XXY" "(mat harry)"

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
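The second sub() call above works through back-references: each parenthesised group in the pattern is numbered from left to right, and "\\3" keeps only what the third group matched. A simplified sketch of the same idea with two groups follows. The pattern here is an illustrative variation, not David's exact expression, and it assumes every string has the form "NAME (text)".

```r
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")

# Group 1: everything before " (" ; group 2: everything between "(" and the final ")"
pattern <- "^(.+) \\((.+)\\)$"

dd2 <- data.frame(
  xx = sub(pattern, "\\1", dd1),  # keep only group 1
  yy = sub(pattern, "\\2", dd1),  # keep only group 2
  stringsAsFactors = FALSE
)
print(dd2)
```

A string that does not match the pattern is returned unchanged by sub(), so malformed rows would show up with identical xx and yy values.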
Works perfectly. Thank you very much indeed.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: dwinsemius at comcast.net
> Sent: Sat, 7 Jul 2012 21:45:58 -0400
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> On Jul 7, 2012, at 5:37 PM, John Kane wrote:
> [...]
Right, I see it now. Thanks. Who knows, in another 100 years I may understand regex.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: smartpink111 at yahoo.com
> Sent: Sat, 7 Jul 2012 16:19:54 -0700 (PDT)
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
>
> Hi John,
> If I understand the post, in your original data there is a space between
> XXY and (.
> If there was no space,
>
> dd1 <- c( "XXY(mat harry)","XXY(jim bob)", "CAMP(joe blow)", "ALP(max jack)")
>
> #Rui's original code will produce
> result<-strsplit(sub(close.par,"",dd1),open.par)
> > result
> [[1]]
> [1] "XXY(mat harry"
>
> [[2]]
> [1] "XXY(jim bob"
>
> [[3]]
> [1] "CAMP(joe blow"
>
> [[4]]
> [1] "ALP(max jack"
>
> #But, if I wanted to get the result as in the original data strsplit,
> result<-strsplit(sub(close.par," ",dd1),open.par)
> > result
> [[1]]
> [1] "XXY"        "mat harry "
>
> [[2]]
> [1] "XXY"      "jim bob "
>
> [[3]]
> [1] "CAMP"      "joe blow "
>
> [[4]]
> [1] "ALP"       "max jack "
>
> A.K.
>
> ----- Original Message -----
> From: John Kane <jrkrideau at inbox.com>
> To: Rui Barradas <ruipbarradas at sapo.pt>
> Cc: r-help at r-project.org
> Sent: Saturday, July 7, 2012 6:33 PM
> Subject: Re: [R] Splitting a character vector.
>
> I think I'm getting it a bit. Anyway, time to shut down and have a beer.
> Life will be much nicer tomorrow or Monday when I get back to cleaning up
> the data from that spreadsheet.
>
> Many thanks and have a good weekend.
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: ruipbarradas at sapo.pt
>> Sent: Sat, 07 Jul 2012 23:28:26 +0100
>> To: jrkrideau at inbox.com
>> Subject: Re: [R] Splitting a character vector.
>>
>> The space is for a different reason: strsplit doesn't put the split
>> pattern in the result, so if a space is included it will be
>> automatically deleted. For instance, in "XXY (mat harry)" without the
>> space it would become "XXY " and "mat harry)" but we want "XXY", so
>> include the space in the pattern.
>>
>> Another example, this one artificial:
>>
>> "123AB456" ---> "123" and "456"
>>
>> strsplit("123AB456", "B") ---> "123A" and "456"
>>
>> So include the "A" in the pattern. It's _exactly_ the same thing.
>>
>> Rui Barradas
>>
>> On 07-07-2012 23:21, John Kane wrote:
>>> Ah, I think Mark may have it. See my earlier post. Why the space?
>>>
>>> John Kane
>>> Kingston ON Canada
>>>
>>>> -----Original Message-----
>>>> From: ruipbarradas at sapo.pt
>>>> Sent: Sat, 07 Jul 2012 23:12:46 +0100
>>>> To: markleeds2 at gmail.com
>>>> Subject: Re: [R] Splitting a character vector.
>>>>
>>>> Oh, right!
>>>>
>>>> The close parenthesis isn't doing anything in the result; it could be
>>>> removed afterwards, but since we're at it...
>>>>
>>>> Rui Barradas
>>>>
>>>> On 07-07-2012 23:10, Mark Leeds wrote:
>>>>> Hi Rui: I think he's asking about your replacement with blanks.
>>>>>
>>>>> On Sat, Jul 7, 2012 at 6:08 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Sorry, but I don't understand. You're asking about 4 single quotes;
>>>>>> the double quotes in open.par are just opening and closing the
>>>>>> pattern, a character string.
>>>>>>
>>>>>> Rui Barradas
>>>>>>
>>>>>> On 07-07-2012 23:03, John Kane wrote:
>>>>>>
>>>>>>> Thanks Rui
>>>>>>> It works perfectly so far on the test and real data.
>>>>>>>
>>>>>>> The annoying thing is that I had tried, or thought I'd tried,
>>>>>>> the open.par format and kept getting an error.
>>>>>>>
>>>>>>> It looks like I had failed to add the '''', in the term.
>>>>>>> What is it doing?
>>>>>>>
>>>>>>> John Kane
>>>>>>> Kingston ON Canada
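A short sketch of the point Rui makes about strsplit, for anyone following along: whatever the pattern matches is consumed and never appears in the output, so anything you want removed belongs in the pattern. The variable names are illustrative only, and the fixed = TRUE call at the end is an addition not discussed in the thread.

```r
# The split pattern itself is dropped from the result
a <- strsplit("123AB456", "B")[[1]]     # "123A" "456"
b <- strsplit("123AB456", "AB")[[1]]    # "123"  "456"

# "(" is a regex metacharacter; in an R string it is escaped as "\\("
# (one backslash for the regex, doubled because "\" is special in R strings)
no_space   <- strsplit("XXY (mat harry", "\\(")[[1]]    # "XXY " "mat harry"
with_space <- strsplit("XXY (mat harry", " \\(")[[1]]   # "XXY"  "mat harry"

# With fixed = TRUE the pattern is taken literally, so no escaping is needed
literal <- strsplit("XXY (mat harry", " (", fixed = TRUE)[[1]]

print(list(a, b, no_space, with_space, literal))
```

The trailing space in "XXY " in the third call is exactly what the blank in open.par is there to remove.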
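To pull the strsplit suggestions in this thread together, here is a hedged sketch that goes all the way to the data frame John asked for. The pattern "\\s*\\(" is my own assumption, not something posted in the thread: it makes the whitespace before the parenthesis optional, so it handles both the original data and the no-space variant A.K. describes (where a pattern requiring the blank does not split at all). The mixed sample vector and the do.call(rbind, ...) step are likewise illustrative choices.

```r
# Mixed input: some entries have a space before "(", some do not
dd1 <- c("XXY (mat harry)", "XXY(jim bob)", "CAMP (joe blow)", "ALP(max jack)")

# Drop the closing parenthesis at the end of each string, then split on "("
# together with any whitespace that precedes it
parts <- strsplit(sub("\\)$", "", dd1), "\\s*\\(")

# Every element now holds two pieces; stack them into a two-column matrix
m <- do.call(rbind, parts)

dd2 <- data.frame(xx = m[, 1], yy = m[, 2], stringsAsFactors = FALSE)
print(dd2)
```

This assumes every string contains exactly one "(" so that each split yields two pieces; strings that break that assumption would need to be checked before the rbind step.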