Jun Shen
2016-Sep-07 01:20 UTC
[R] element wise pattern recognition and string substitution
Hi Jeff,

Thanks for the reply. I tried your suggestion and it doesn't seem to work.
I then tried a simple pattern, as follows, and it works as expected:

sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\1', "3.mg.kg.>50-70.kg.P05")
[1] "3.mg.kg"

sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\2', "3.mg.kg.>50-70.kg.P05")
[1] ">50-70.kg"

sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\3', "3.mg.kg.>50-70.kg.P05")
[1] "P05"

My problem is that the pattern has to be constructed dynamically from the
input data of the function I am writing. It's actually not too difficult to
assemble the final.pattern with some code like the following:

sort.var <- c('TX','WTCUT')
combn.sort.var <- do.call(expand.grid, lapply(sort.var,
    function(x) paste('(', gsub('\\.', '\\\\.', unlist(unique(all.exposure[x]))), ')', sep='')))
all.patterns <- do.call(paste, c(combn.sort.var, '(.*)', sep='\\.'))
final.pattern <- paste0(all.patterns, collapse='|')

You cannot run the code directly since the data object "all.exposure" is
not provided here.

Jun

On Tue, Sep 6, 2016 at 8:18 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> I am not near my computer today, but each parenthesis gets its own result
> number, so you should put the parenthesis around the whole pattern of
> alternatives instead of having many parentheses.
>
> I recommend thinking in terms of what common information you expect to
> find in these various strings, and place your parentheses to capture that
> information. There is no other reason to put parentheses in the pattern...
> they are not grouping symbols.
> --
> Sent from my phone. Please excuse my brevity.
>
> On September 6, 2016 5:01:04 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>> Jun:
>>
>> 1. Tell us your desired result from your test vector and maybe someone
>> will help.
>>
>> 2. As we played this game once already (you couldn't do it; I showed
>> you how), this seems to be a function of your limitations with regular
>> expressions. I'm probably not much better, but in any case, I don't
>> intend to be your consultant. See if you can find someone locally to
>> help you if you do not receive a satisfactory reply from the list.
>> There are many people here who are pretty good at this sort of thing,
>> but I don't know if they'll reply. Regex's are certainly complex. PERL
>> people tend to be pretty good at them, I believe. There are numerous
>> web sites and books on them if you need to acquire expertise for your
>> work.
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Tue, Sep 6, 2016 at 3:59 PM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>>> Hi Bert,
>>>
>>> I still couldn't make the multiple patterns work. Here is an example.
>>> I make the pattern as follows:
>>>
>>> final.pattern <- paste0(
>>>   "(240\\.m\\.g)\\.(>50-70\\.kg)\\.(.*)|",
>>>   "(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)|",
>>>   "(240\\.m\\.g)\\.(>70-90\\.kg)\\.(.*)|",
>>>   "(3\\.mg\\.kg)\\.(>70-90\\.kg)\\.(.*)|",
>>>   "(240\\.m\\.g)\\.(>90-110\\.kg)\\.(.*)|",
>>>   "(3\\.mg\\.kg)\\.(>90-110\\.kg)\\.(.*)|",
>>>   "(240\\.m\\.g)\\.(50\\.kg\\.or\\.less)\\.(.*)|",
>>>   "(3\\.mg\\.kg)\\.(50\\.kg\\.or\\.less)\\.(.*)|",
>>>   "(240\\.m\\.g)\\.(>110\\.kg)\\.(.*)|",
>>>   "(3\\.mg\\.kg)\\.(>110\\.kg)\\.(.*)")
>>>
>>> test.string <- c('240.m.g.>110.kg.geo.mean', '3.mg.kg.>110.kg.P05',
>>>                  '240.m.g.>50-70.kg.geo.mean')
>>>
>>> sub(final.pattern, '\\1', test.string)
>>> sub(final.pattern, '\\2', test.string)
>>> sub(final.pattern, '\\3', test.string)
>>>
>>> Only the third string has been correctly parsed, which matches the
>>> first pattern. It seems the rest of the patterns are not called.
>>>
>>> Jun
>>>
>>> On Mon, Sep 5, 2016 at 10:21 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>> Just noticed: My clumsy do.call() line in my previously posted code
>>>> below should be replaced with:
>>>> pat <- paste(pat,collapse = "|")
>>>>
>>>> > pat <- c(pat1,pat2)
>>>> > paste(pat,collapse="|")
>>>> [1] "a+\\.*a+|b+\\.*b+"
>>>>
>>>> ************ replace this **************************
>>>> > pat <- do.call(paste,c(as.list(pat), sep="|"))
>>>> ********************************************
>>>> > sub(paste0("^[^b]*(",pat,").*$"),"\\1",z)
>>>> [1] "a.a" "bb" "b.bbb"
>>>>
>>>> -- Bert
>>>>
>>>> On Mon, Sep 5, 2016 at 12:11 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>>> Jun:
>>>>>
>>>>> You need to provide a clear specification via regular expressions of
>>>>> the patterns you wish to match -- at least for me to decipher it.
>>>>> Others may be smarter than I, though...
>>>>>
>>>>> Jeff: Thanks. I have now convinced myself that it can be done (a
>>>>> "proof" of sorts): If pat1, pat2,..., patm are m different patterns
>>>>> (in a vector of patterns) to be matched in a vector of n strings,
>>>>> where only one of the patterns will match in any string, then use
>>>>> paste() (probably via do.call()) or otherwise to paste them together
>>>>> separated by "|" to form the concatenated pattern, pat. Then
>>>>>
>>>>> sub(paste0("^.*(",pat, ").*$"),"\\1",thevector)
>>>>>
>>>>> should extract the matching pattern in each (perhaps with a little
>>>>> fiddling due to precedence rules); e.g.
>>>>>
>>>>> > z <-c(".fg.h.g.a.a", "bb..dd.ef.tgf.", "foo...b.bbb.tgy")
>>>>>
>>>>> > pat1 <- "a+\\.*a+"
>>>>> > pat2 <-"b+\\.*b+"
>>>>> > pat <- c(pat1,pat2)
>>>>>
>>>>> > pat <- do.call(paste,c(as.list(pat), sep="|"))
>>>>> > pat
>>>>> [1] "a+\\.*a+|b+\\.*b+"
>>>>>
>>>>> > sub(paste0("^[^b]*(",pat,").*$"), "\\1", z)
>>>>> [1] "a.a" "bb" "b.bbb"
>>>>>
>>>>> Cheers,
>>>>> Bert
>>>>>
>>>>> On Mon, Sep 5, 2016 at 9:56 AM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>>>>>> Thanks for the reply, Bert.
>>>>>>
>>>>>> Your solution solves the example. I actually have a more general
>>>>>> situation where I have this dot-concatenated string built from
>>>>>> multiple variables. The problem is that those variables may have
>>>>>> values with dots in them, and the number of dots is not consistent
>>>>>> across the values of a variable. So I am thinking of defining a
>>>>>> vector of patterns for the vector of strings, and hopefully finding
>>>>>> a way to use one pattern from the pattern vector for each value of
>>>>>> the string vector. The only way I can think of is a "for" loop,
>>>>>> which can be slow. Also, this is happening in a function I am
>>>>>> writing. I just wonder if there is another, more efficient way.
>>>>>> Thanks a lot.
>>>>>>
>>>>>> Jun
>>>>>>
>>>>>> On Mon, Sep 5, 2016 at 1:41 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>>>>> Well, he did provide an example, and...
>>>>>>>
>>>>>>> > z <- c('TX.WT.CUT.mean','mg.tx.cv')
>>>>>>>
>>>>>>> > sub("^.+?\\.(.+)\\.[^.]+$","\\1",z)
>>>>>>> [1] "WT.CUT" "tx"
>>>>>>>
>>>>>>> ## seems to do what was requested.
>>>>>>>
>>>>>>> Jeff would have to amplify on his initial statement however: do you
>>>>>>> mean that separate patterns can always be combined via "|" ? Or
>>>>>>> something deeper?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bert
>>>>>>>
>>>>>>> On Sun, Sep 4, 2016 at 9:30 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>>>>>> Your opening assertion is false.
>>>>>>>>
>>>>>>>> Provide a reproducible example and someone will demonstrate.
>>>>>>>>
>>>>>>>> On September 4, 2016 9:06:59 PM PDT, Jun Shen <jun.shen.ut at gmail.com> wrote:
>>>>>>>>> Dear list,
>>>>>>>>>
>>>>>>>>> I have a vector of strings that cannot be described by one
>>>>>>>>> pattern. So let's say I construct a vector of patterns of the
>>>>>>>>> same length as the vector of strings. Can I do element-wise
>>>>>>>>> pattern recognition and string substitution?
>>>>>>>>>
>>>>>>>>> For example,
>>>>>>>>>
>>>>>>>>> pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
>>>>>>>>> pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
>>>>>>>>>
>>>>>>>>> patterns <- c(pattern1,pattern2)
>>>>>>>>> strings <- c('TX.WT.CUT.mean','mg.tx.cv')
>>>>>>>>>
>>>>>>>>> Say I want to extract "WT.CUT" from the first string and "tx"
>>>>>>>>> from the second string. If I do
>>>>>>>>>
>>>>>>>>> sub(patterns, '\\2', strings), only the first pattern will be used.
>>>>>>>>>
>>>>>>>>> Looping over the patterns doesn't work the way I want. Appreciate
>>>>>>>>> any comments.
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Jun
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
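Editorial note on the advice quoted above: capture groups are numbered left to right across the whole pattern, so in a ten-way alternation with three groups per branch, '\\1' to '\\3' refer only to the groups of the first branch, and sub() supports back-references only up to '\\9'. The sketch below puts the alternatives inside one group per variable, which is one reading of Jeff's suggestion. The vectors tx and wtcut and the helpers esc() and alt() are hypothetical stand-ins, since all.exposure was not posted.

```r
# Hypothetical stand-ins for the unique TX and WTCUT values in all.exposure
tx    <- c("240.m.g", "3.mg.kg")
wtcut <- c(">50-70.kg", ">70-90.kg", ">90-110.kg", "50.kg.or.less", ">110.kg")

# Escape literal dots so they are not read as "any character"
esc <- function(x) gsub("\\.", "\\\\.", x)

# One capture group per variable: "(value1|value2|...)"
alt <- function(x) paste0("(", paste(esc(x), collapse = "|"), ")")

# Exactly three groups in total, whichever values happen to match
final.pattern <- paste0("^", alt(tx), "\\.", alt(wtcut), "\\.(.*)$")

test.string <- c("240.m.g.>110.kg.geo.mean", "3.mg.kg.>110.kg.P05",
                 "240.m.g.>50-70.kg.geo.mean")

tx.part   <- sub(final.pattern, "\\1", test.string)
wt.part   <- sub(final.pattern, "\\2", test.string)
stat.part <- sub(final.pattern, "\\3", test.string)
print(rbind(tx.part, wt.part, stat.part))
```

With this layout the group numbers stay fixed at 1, 2 and 3 however many values each variable has. If one value could be a prefix of another, the alternatives may need more care than this sketch takes.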
Ista Zahn
2016-Sep-07 01:44 UTC
[R] element wise pattern recognition and string substitution
If you want to match each element of 'strings' to a different regex, do
it. Here are three ways, using your original example.

pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"

patterns <- c(pattern1,pattern2)
strings <- c('TX.WT.CUT.mean','mg.tx.cv')

for(i in seq_along(strings)) print(sub(patterns[i], "\\2", strings[i]))

mapply(sub, pattern = patterns, x = strings, MoreArgs=list(replacement = "\\2"))

library(stringi)
stri_replace_all_regex(strings, patterns, "$2")

Best,
Ista
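Editorial note: a small sketch of the mapply() route from this message, using the same example data. mapply() names its result after the first vectorised argument (here the patterns), so USE.NAMES = FALSE is added to get a plain character vector. The vapply() variant and the names middle and middle2 are illustrative additions, not part of the original post.

```r
pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
patterns <- c(pattern1, pattern2)
strings  <- c("TX.WT.CUT.mean", "mg.tx.cv")

# Pair patterns[i] with strings[i]; drop the pattern-derived names
middle <- mapply(sub, pattern = patterns, x = strings,
                 MoreArgs = list(replacement = "\\2"), USE.NAMES = FALSE)
print(middle)

# Same pairing with an explicit index and a type-checked result
middle2 <- vapply(seq_along(strings),
                  function(i) sub(patterns[i], "\\2", strings[i]),
                  character(1))
print(middle2)
```

Both calls still run one sub() per element, so they are a tidier loop rather than a faster one.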
Jun Shen
2016-Sep-07 03:59 UTC
[R] element wise pattern recognition and string substitution
Hi Ista, Thanks for the suggestion. I didn't know mapply can be used this way! Let me take one more step. Instead of defining a pattern for each string, I would like to define a set of patterns from all the possible combination of the unique values of those variables. Then I need each string to find a pattern for itself. I know this is getting a little stretching. Thanks for all the suggestion/comments from everyone. Jun On Tue, Sep 6, 2016 at 9:44 PM, Ista Zahn <istazahn at gmail.com> wrote:> If you want to mach each element of 'strings' to a different regex, do > it. Here are three ways, using your original example. > > pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)" > pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)" > > patterns <- c(pattern1,pattern2) > strings <- c('TX.WT.CUT.mean','mg.tx.cv') > > for(i in seq(strings)) print(sub(patterns[i], "\\2", strings[i])) > > mapply(sub, pattern = patterns, x = strings, MoreArgs=list(replacement > "\\2")) > > library(stringi) > stri_replace_all_regex(strings, patterns, "$2") > > Best, > Ista > On Tue, Sep 6, 2016 at 9:20 PM, Jun Shen <jun.shen.ut at gmail.com> wrote: > > Hi Jeff, > > > > Thanks for the reply. I tried your suggestion and it doesn't seem to work > > and I tried a simple pattern as follows and it works as expected > > > > sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\1', "3.mg.kg > .>50-70.kg.P05") > > [1] "3.mg.kg" > > > > sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\2', "3.mg.kg > .>50-70.kg.P05") > > [1] ">50-70.kg" > > > > sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\3', "3.mg.kg > .>50-70.kg.P05") > > [1] "P05" > > > > My problem is the pattern has to be dynamically constructed on the input > > data of the function I am writing. It's actually not too difficult to > > assemble the final.pattern with some code like the following > > > > sort.var <- c('TX','WTCUT') > > combn.sort.var <- do.call(expand.grid, lapply(sort.var, > > function(x)paste('(',gsub('\\.','\\\\.',unlist(unique(all. 
> exposure[x]))), > > ')', sep=''))) > > all.patterns <- do.call(paste, c(combn.sort.var, '(.*)', sep='\\.')) > > final.pattern <- paste0(all.patterns, collapse='|') > > > > You cannot run the code directly since the data object "all.exposure" is > > not provided here. > > > > Jun > > > > > > > > On Tue, Sep 6, 2016 at 8:18 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us > > > > wrote: > > > >> I am not near my computer today, but each parenthesis gets its own > result > >> number, so you should put the parenthesis around the whole pattern of > >> alternatives instead of having many parentheses. > >> > >> I recommend thinking in terms of what common information you expect to > >> find in these various strings, and place your parentheses to capture > that > >> information. There is no other reason to put parentheses in the > pattern... > >> they are not grouping symbols. > >> -- > >> Sent from my phone. Please excuse my brevity. > >> > >> On September 6, 2016 5:01:04 PM PDT, Bert Gunter < > bgunter.4567 at gmail.com> > >> wrote: > >> >Jun: > >> > > >> >1. Tell us your desired result from your test vector and maybe someone > >> >will help. > >> > > >> >2. As we played this game once already (you couldn't do it; I showed > >> >you how), this seems to be a function of your limitations with regular > >> >expressions. I'm probably not much better, but in any case, I don't > >> >intend to be your consultant. See if you can find someone locally to > >> >help you if you do not receive a satisfactory reply from the list. > >> >There are many people here who are pretty good at this sort of thing, > >> >but I don't know if they'll reply. Regex's are certainly complex. PERL > >> >people tend to be pretty good at them, I believe. There are numerous > >> >web sites and books on them if you need to acquire expertise for your > >> >work. 
> >> > > >> >Cheers, > >> >Bert > >> >Bert Gunter > >> > > >> >"The trouble with having an open mind is that people keep coming along > >> >and sticking things into it." > >> >-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > >> > > >> > > >> >On Tue, Sep 6, 2016 at 3:59 PM, Jun Shen <jun.shen.ut at gmail.com> > wrote: > >> >> Hi Bert, > >> >> > >> >> I still couldn't make the multiple patterns to work. Here is an > >> >example. I > >> >> make the pattern as follows > >> >> > >> >> final.pattern <- > >> >> > >> >"(240\\.m\\.g)\\.(>50-70\\.kg)\\.(.*)|(3\\.mg\\.kg)\\.(> > >> 50-70\\.kg)\\.(.*)|(240\\.m\\.g)\\.(>70-90\\.kg)\\.(.*)|(3\\ > >> .mg\\.kg)\\.(>70-90\\.kg)\\.(.*)|(240\\.m\\.g)\\.(>90-110\\. > >> kg)\\.(.*)|(3\\.mg\\.kg)\\.(>90-110\\.kg)\\.(.*)|(240\\.m\\ > >> .g)\\.(50\\.kg\\.or\\.less)\\.(.*)|(3\\.mg\\.kg)\\.(50\\.kg\ > >> \.or\\.less)\\.(.*)|(240\\.m\\.g)\\.(>110\\.kg)\\.(.*)|(3\\. > >> mg\\.kg)\\.(>110\\.kg)\\.(.*)" > >> >> > >> >> test.string <- c('240.m.g.>110.kg.geo.mean', '3.mg.kg.>110.kg.P05', > >> >> '240.m.g.>50-70.kg.geo.mean') > >> >> > >> >> sub(final.pattern, '\\1', test.string) > >> >> sub(final.pattern, '\\2', test.string) > >> >> sub(final.pattern, '\\3', test.string) > >> >> > >> >> Only the third string has been correctly parsed, which matches the > >> >first > >> >> pattern. It seems the rest of the patterns are not called. 
> >> >> > >> >> Jun > >> >> > >> >> > >> >> On Mon, Sep 5, 2016 at 10:21 PM, Bert Gunter <bgunter.4567 at gmail.com > > > >> >wrote: > >> >>> > >> >>> Just noticed: My clumsy do.call() line in my previously posted code > >> >>> below should be replaced with: > >> >>> pat <- paste(pat,collapse = "|") > >> >>> > >> >>> > >> >>> > pat <- c(pat1,pat2) > >> >>> > paste(pat,collapse="|") > >> >>> [1] "a+\\.*a+|b+\\.*b+" > >> >>> > >> >>> ************ replace this ************************** > >> >>> > pat <- do.call(paste,c(as.list(pat), sep="|")) > >> >>> ******************************************** > >> >>> > sub(paste0("^[^b]*(",pat,").*$"),"\\1",z) > >> >>> [1] "a.a" "bb" "b.bbb" > >> >>> > >> >>> > >> >>> -- Bert > >> >>> Bert Gunter > >> >>> > >> >>> "The trouble with having an open mind is that people keep coming > >> >along > >> >>> and sticking things into it." > >> >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > >> >>> > >> >>> > >> >>> On Mon, Sep 5, 2016 at 12:11 PM, Bert Gunter > >> ><bgunter.4567 at gmail.com> > >> >>> wrote: > >> >>> > Jun: > >> >>> > > >> >>> > You need to provide a clear specification via regular expressions > >> >of > >> >>> > the patterns you wish to match -- at least for me to decipher it. > >> >>> > Others may be smarter than I, though... > >> >>> > > >> >>> > Jeff: Thanks. I have now convinced myself that it can be done (a > >> >>> > "proof" of sorts): If pat1, pat2,..., patn are m different > >> >patterns > >> >>> > (in a vector of patterns) to be matched in a vector of n strings, > >> >>> > where only one of the patterns will match in any string, then use > >> >>> > paste() (probably via do.call()) or otherwise to paste them > >> >together > >> >>> > separated by "|" to form the concatenated pattern, pat. 
Then
> >> >>> >
> >> >>> > sub(paste0("^.*(",pat, ").*$"),"\\1",thevector)
> >> >>> >
> >> >>> > should extract the matching pattern in each (perhaps with a little
> >> >>> > fiddling due to precedence rules); e.g.
> >> >>> >
> >> >>> >> z <-c(".fg.h.g.a.a", "bb..dd.ef.tgf.", "foo...b.bbb.tgy")
> >> >>> >
> >> >>> >> pat1 <- "a+\\.*a+"
> >> >>> >> pat2 <-"b+\\.*b+"
> >> >>> >> pat <- c(pat1,pat2)
> >> >>> >
> >> >>> >> pat <- do.call(paste,c(as.list(pat), sep="|"))
> >> >>> >> pat
> >> >>> > [1] "a+\\.*a+|b+\\.*b+"
> >> >>> >
> >> >>> >> sub(paste0("^[^b]*(",pat,").*$"), "\\1", z)
> >> >>> > [1] "a.a" "bb" "b.bbb"
> >> >>> >
> >> >>> > Cheers,
> >> >>> > Bert
> >> >>> >
> >> >>> >
> >> >>> > Bert Gunter
> >> >>> >
> >> >>> > "The trouble with having an open mind is that people keep coming along
> >> >>> > and sticking things into it."
> >> >>> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >> >>> >
> >> >>> >
> >> >>> > On Mon, Sep 5, 2016 at 9:56 AM, Jun Shen <jun.shen.ut at gmail.com> wrote:
> >> >>> >> Thanks for the reply, Bert.
> >> >>> >>
> >> >>> >> Your solution solves the example. I actually have a more general
> >> >>> >> situation where I have this dot concatenated string from multiple
> >> >>> >> variables. The problem is those variables may have values with dots
> >> >>> >> in there. The number of dots are not consistent for all values of a
> >> >>> >> variable. So I am thinking to define a vector of patterns for the
> >> >>> >> vector of the string and hopefully to find a way to use a pattern
> >> >>> >> from the pattern vector for each value of the string vector. The
> >> >>> >> only way I can think of is "for" loop, which can be slow.
> >> >>> >> Also these are happening in a function I am writing. Just wonder if
> >> >>> >> there is another more efficient way. Thanks a lot.
> >> >>> >>
> >> >>> >> Jun
> >> >>> >>
> >> >>> >> On Mon, Sep 5, 2016 at 1:41 AM, Bert Gunter <bgunter.4567 at gmail.com>
> >> >>> >> wrote:
> >> >>> >>>
> >> >>> >>> Well, he did provide an example, and...
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> > z <- c('TX.WT.CUT.mean','mg.tx.cv')
> >> >>> >>>
> >> >>> >>> > sub("^.+?\\.(.+)\\.[^.]+$","\\1",z)
> >> >>> >>> [1] "WT.CUT" "tx"
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> ## seems to do what was requested.
> >> >>> >>>
> >> >>> >>> Jeff would have to amplify on his initial statement however: do you
> >> >>> >>> mean that separate patterns can always be combined via "|" ? Or
> >> >>> >>> something deeper?
> >> >>> >>>
> >> >>> >>> Cheers,
> >> >>> >>> Bert
> >> >>> >>> Bert Gunter
> >> >>> >>>
> >> >>> >>> "The trouble with having an open mind is that people keep coming along
> >> >>> >>> and sticking things into it."
> >> >>> >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> On Sun, Sep 4, 2016 at 9:30 PM, Jeff Newmiller
> >> >>> >>> <jdnewmil at dcn.davis.ca.us> wrote:
> >> >>> >>> > Your opening assertion is false.
> >> >>> >>> >
> >> >>> >>> > Provide a reproducible example and someone will demonstrate.
> >> >>> >>> > --
> >> >>> >>> > Sent from my phone. Please excuse my brevity.
> >> >>> >>> >
> >> >>> >>> > On September 4, 2016 9:06:59 PM PDT, Jun Shen
> >> >>> >>> > <jun.shen.ut at gmail.com> wrote:
> >> >>> >>> >>Dear list,
> >> >>> >>> >>
> >> >>> >>> >>I have a vector of strings that cannot be described by one pattern.
> >> >>> >>> >>So let's say I construct a vector of patterns in the same length as
> >> >>> >>> >>the vector of strings, can I do the element wise pattern recognition
> >> >>> >>> >>and string substitution.
> >> >>> >>> >>
> >> >>> >>> >>For example,
> >> >>> >>> >>
> >> >>> >>> >>pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
> >> >>> >>> >>pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
> >> >>> >>> >>
> >> >>> >>> >>patterns <- c(pattern1,pattern2)
> >> >>> >>> >>strings <- c('TX.WT.CUT.mean','mg.tx.cv')
> >> >>> >>> >>
> >> >>> >>> >>Say I want to extract "WT.CUT" from the first string and "tx" from
> >> >>> >>> >>the second string. If I do
> >> >>> >>> >>
> >> >>> >>> >>sub(patterns, '\\2', strings), only the first pattern will be used.
> >> >>> >>> >>
> >> >>> >>> >>looping the patterns doesn't work the way I want. Appreciate any
> >> >>> >>> >>comments.
> >> >>> >>> >>Thanks.
> >> >>> >>> >>
> >> >>> >>> >>Jun
> >> >>> >>> >>
> >> >>> >>> >> [[alternative HTML version deleted]]
> >> >>> >>> >>
> >> >>> >>> >>______________________________________________
> >> >>> >>> >>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> >>> >>> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >> >>> >>> >>PLEASE do read the posting guide
> >> >>> >>> >>http://www.R-project.org/posting-guide.html
> >> >>> >>> >>and provide commented, minimal, self-contained, reproducible code.
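The point in the quoted exchange that "each parenthesis gets its own result number" can be seen on a toy case. This is only a minimal sketch with made-up strings; the names `x`, `separate` and `shared` are hypothetical and do not come from the thread.

```r
# Hypothetical toy data: a prefix, a dot, then a remainder.
x <- c("a.1", "b.2")

# Each alternative carries its own parentheses, so the groups are numbered
# 1 to 4 across the whole pattern. "\\1" belongs to the first alternative
# only and is empty whenever another alternative is the one that matched.
separate <- "(a)\\.(.*)|(b)\\.(.*)"
res_separate <- sub(separate, "\\1", x)   # "a" ""

# One pair of parentheses around the alternatives keeps the numbering
# the same no matter which alternative matched.
shared <- "(a|b)\\.(.*)"
res_shared <- sub(shared, "\\1", x)       # "a" "b"
```

This appears to be the same effect as in the long `final.pattern`, where only the string matching the first alternative was parsed as intended.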
Jeff Newmiller
2016-Sep-07 07:04 UTC
[R] element wise pattern recognition and string substitution
Here are some suggestions:

test.string <- c( '240.m.g.>110.kg.geo.mean'
                , '3.mg.kg.>110.kg.P05'
                , '240.m.g.>50-70.kg.geo.mean'
                )
# based on your literal idea
suggested.pattern1 <- "(240\\.m\\.g|3\\.mg\\.kg)\\.(>50-70\\.kg|>70-90\\.kg|>90-110\\.kg|50\\.kg\\.or\\.less|>110\\.kg)\\.(.*)"

resultL <- strsplit( sub( suggested.pattern1
                        , "\\1\t\\2\t\\3"
                        , test.string )
                   , split = "\t"
                   )

# equivalent based on apparent repetitive patterns in your sample data
suggested.pattern2 <- "(.*?m\\.g|kg)\\.(.*?kg|.*?less)\\.(.*)"

resultL2 <- strsplit( sub( suggested.pattern2
                         , "\\1\t\\2\t\\3"
                         , test.string
                         )
                    , split = "\t"
                    )

# put results into an organized table
DF <- setNames( data.frame( do.call( rbind, resultL ) )
              , c( "First", "Second", "Third" )
              )

By the way... please aim to make your examples reproducible. It would have
been easy for you to define the necessary variables in example form rather
than sending a non-reproducible example.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
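As a possible variation on the sub()/strsplit() approach above, the capture groups can also be pulled out directly with base R's regexec() and regmatches(), which avoids choosing a separator character such as "\t". This is a sketch, not part of the original reply; the object names `pattern`, `m`, `parts` and `DF2` are chosen here for illustration.

```r
test.string <- c("240.m.g.>110.kg.geo.mean",
                 "3.mg.kg.>110.kg.P05",
                 "240.m.g.>50-70.kg.geo.mean")

# Same literal pattern as suggested.pattern1 in the message above.
pattern <- paste0("(240\\.m\\.g|3\\.mg\\.kg)\\.",
                  "(>50-70\\.kg|>70-90\\.kg|>90-110\\.kg|50\\.kg\\.or\\.less|>110\\.kg)\\.",
                  "(.*)")

# regexec() records the position of the full match and of every capture
# group; regmatches() turns those positions into substrings.
m <- regmatches(test.string, regexec(pattern, test.string))

# Drop the first element of each result (the full match) and stack the
# remaining capture groups row-wise.
parts <- do.call(rbind, lapply(m, function(g) g[-1]))
colnames(parts) <- c("First", "Second", "Third")
DF2 <- as.data.frame(parts, stringsAsFactors = FALSE)
```

One difference worth keeping in mind: a string that does not match yields an empty result from regmatches(), whereas sub() returns the input unchanged, so non-matching inputs would need separate handling in either approach.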
Jun Shen
2016-Sep-10 04:06 UTC
[R] element wise pattern recognition and string substitution
Hi Jeff,

I have been trying different methods and found that your approach is the most
efficient. I am now able to resolve the string-parsing problem, so let me
report back to the group. The following example shows what I was trying to
achieve. melt.results is where the strings reside, testdata is a snippet of
the data from which the unique values are derived, and replace.metaChar is a
function I defined. Thanks to everyone for the help; I would appreciate any
comments.

Jun

################################################################

melt.results <- structure(list(param = c("Cmin1", "Cminss", "Cmaxss", "Cmin1",
"Cminss", "Cmin1", "Cminss", "Cmaxss", "Cmin1", "Cminss"), variable =
structure(c(1L, 5L, 9L, 14L, 18L, 21L, 25L, 29L, 34L, 38L), .Label =
c("240.mg.>110.kg.geo.mean", "240.mg.>110.kg.cv", "240.mg.>110.kg.P05",
"240.mg.>110.kg.P95", "3.mg.kg.>110.kg.geo.mean", "3.mg.kg.>110.kg.cv",
"3.mg.kg.>110.kg.P05", "3.mg.kg.>110.kg.P95", "240.mg.>50-70.kg.geo.mean",
"240.mg.>50-70.kg.cv", "240.mg.>50-70.kg.P05", "240.mg.>50-70.kg.P95",
"3.mg.kg.>50-70.kg.geo.mean", "3.mg.kg.>50-70.kg.cv", "3.mg.kg.>50-70.kg.P05",
"3.mg.kg.>50-70.kg.P95", "240.mg.50.kg.or.less.geo.mean",
"240.mg.50.kg.or.less.cv", "240.mg.50.kg.or.less.P05",
"240.mg.50.kg.or.less.P95", "3.mg.kg.50.kg.or.less.geo.mean",
"3.mg.kg.50.kg.or.less.cv", "3.mg.kg.50.kg.or.less.P05",
"3.mg.kg.50.kg.or.less.P95", "240.mg.>70-90.kg.geo.mean",
"240.mg.>70-90.kg.cv", "240.mg.>70-90.kg.P05", "240.mg.>70-90.kg.P95",
"3.mg.kg.>70-90.kg.geo.mean", "3.mg.kg.>70-90.kg.cv", "3.mg.kg.>70-90.kg.P05",
"3.mg.kg.>70-90.kg.P95", "240.mg.>90-110.kg.geo.mean", "240.mg.>90-110.kg.cv",
"240.mg.>90-110.kg.P05", "240.mg.>90-110.kg.P95",
"3.mg.kg.>90-110.kg.geo.mean", "3.mg.kg.>90-110.kg.cv",
"3.mg.kg.>90-110.kg.P05", "3.mg.kg.>90-110.kg.P95"), class = "factor"),
value = c(97L, 144L, 76L, 137L, 18L, 104L, 92L, 87L, 111L, 41L)), .Names =
c("param", "variable", "value"), row.names = c(1L, 14L, 27L, 40L, 53L, 61L,
74L, 87L, 100L, 113L), class = "data.frame")

testdata <- structure(list(TX = c("240.mg", "3.mg.kg", "240.mg", "3.mg.kg",
"240.mg", "3.mg.kg", "240.mg", "3.mg.kg", "240.mg", "3.mg.kg"), WTCUT =
c(">50-70.kg", ">50-70.kg", ">70-90.kg", ">70-90.kg", ">90-110.kg",
">90-110.kg", "50.kg.or.less", "50.kg.or.less", ">110.kg", ">110.kg")),
.Names = c("TX", "WTCUT"), row.names = c(1L, 2L, 7L, 8L, 19L, 20L, 21L, 22L,
129L, 130L), class = "data.frame")

# escape regex metacharacters so that data values are matched literally
replace.metaChar <- function(string)
{
  metaChar <- c("\\$","\\*","\\+","\\.","\\?","\\[","\\]","\\^","\\{","\\}","\\|","\\(","\\)","\\\\")
  metaReplace <- paste('\\',metaChar, sep='')
  for(r in seq(metaChar)) gsub(metaChar[r], metaReplace[r], string) -> string
  return(string)
}

sort.var <- c('TX','WTCUT')

# one capture group per sort variable, each an alternation of its unique
# (escaped) values, followed by a final group for the statistic name
one.pattern <- paste('\\b',
                     paste(sapply(sapply(sort.var, function(x) replace.metaChar(unique(testdata[[x]]))),
                                  function(y) paste('(', paste(y, collapse='|'), ')', sep='')),
                           collapse='\\.'),
                     '\\.(.*)', sep='')

n.sort.var <- length(sort.var)

# replacement of the form "\\1\t\\2\t\\3", built for any number of groups
one.replacement <- paste('\\', seq(n.sort.var+1), collapse='\t', sep='')

one.results <- strsplit(sub(one.pattern, one.replacement, melt.results$variable), split='\t')

melt.results[c(sort.var,'STATS')] <- as.data.frame(do.call(rbind, one.results))
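The pattern-building step described above could also be written more compactly. The following is only a sketch under stated assumptions: `escape_regex` and `build_pattern` are hypothetical helper names, `toy` and `labels` are made-up data in the same shape as testdata and melt.results$variable, the pattern is anchored with `^` and `$` rather than `\\b`, and the inner step uses `lapply`-style iteration so that the per-variable result does not depend on how many unique values each column has.

```r
# Hypothetical helper: put a backslash in front of every regex metacharacter
# so that data values are matched literally (a single gsub() call).
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Hypothetical helper: one capture group per variable, whose alternatives
# are the unique values found in that column, plus a trailing catch-all.
build_pattern <- function(data, vars) {
  groups <- vapply(vars, function(v) {
    alts <- escape_regex(unique(as.character(data[[v]])))
    paste0("(", paste(alts, collapse = "|"), ")")
  }, character(1))
  paste0("^", paste(groups, collapse = "\\."), "\\.(.*)$")
}

# Made-up data for illustration only.
toy <- data.frame(TX    = c("240.mg", "3.mg.kg"),
                  WTCUT = c(">110.kg", "50.kg.or.less"),
                  stringsAsFactors = FALSE)
labels <- c("240.mg.>110.kg.geo.mean", "3.mg.kg.50.kg.or.less.P05")

pat <- build_pattern(toy, c("TX", "WTCUT"))

# Extract the capture groups, dropping the full match.
parts <- do.call(rbind,
                 lapply(regmatches(labels, regexec(pat, labels)),
                        function(g) g[-1]))
```

With the toy data, `parts` has one row per label and one column each for the TX value, the WTCUT value and the statistic name.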