Tuszynski, Jaroslaw W.
2005-Sep-15 13:01 UTC
[R] Splitting the string at the last sub-string
Hi,

I need to split a string into 2 strings, with the split point defined by the last occurrence of some substring. I came up with some convoluted code to do so:

    str = "Chance favors the prepared mind"
    sub = "e"
    y = unlist(strsplit(str,sub))
    z = cbind(paste(y[-length(y)], sub, sep="", collapse = ""), y[length(y)]);

    y
    z
    z[1]
    z[2]

Is there a simpler way to do so? I think ~8 function calls to do such a simple operation is overkill.

Jarek

Jarek Tuszynski, PhD.
Science Applications International Corporation
(703) 676-4192
Jaroslaw.W.Tuszynski at saic.com
    > regexpr(".*e", str)
    [1] 1
    attr(,"match.length")
    [1] 25

tells you you need

    > substring(str, c(1, 26), c(25,length(str)))
    [1] "Chance favors the prepare" "d mind"

to reproduce your answer (I don't know what you want to do with the substring, but you included it in the first string, which is not what split() does).

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
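The pattern ".*e" finds the last occurrence because ".*" is greedy: it consumes as much of the string as it can while still leaving an "e" to match, so the match runs from position 1 through the final "e". A minimal sketch of the same steps as a script, assuming nchar(str) as the end position of the second piece (any value at least as large as the string length should behave the same):

```r
str <- "Chance favors the prepared mind"
sub <- "e"

# Greedy ".*" followed by the substring matches from the start of the
# string through the LAST occurrence of the substring.
m  <- regexpr(paste0(".*", sub), str)
lp <- attr(m, "match.length")   # length of that match, i.e. the split point

# First piece: characters 1..lp (includes the substring).
# Second piece: characters lp+1..end of string.
parts <- substring(str, c(1, lp + 1), c(lp, nchar(str)))
print(parts)
# [1] "Chance favors the prepare" "d mind"
```

Note that `sub` is interpreted as a regular expression here, so a substring containing metacharacters such as "." or "(" would need escaping. If the substring does not occur at all, regexpr() reports a match length of -1, which this sketch does not handle.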
Tuszynski, Jaroslaw W.
2005-Sep-15 15:00 UTC
[R] Splitting the string at the last sub-string
Thanks for the suggestions. I suspect the "regexpr" version will be better than my version, since I use it to find a string towards the end of a large (up to ~30Mb) text/XML file.

Thanks again.

Jarek

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Prof Brian Ripley
Sent: Thursday, September 15, 2005 10:43 AM
To: Barry Rowlingson
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Splitting the string at the last sub-string

On Thu, 15 Sep 2005, Barry Rowlingson wrote:

> Prof Brian Ripley wrote:
>
>> > substring(str, c(1, 26), c(25,length(str)))
>
> nchar(str) surely?

Yes, or anything larger: I actually tested 10000.

> regexps can be rather slow though. Here's two functions:

But that's not the way to do this repeatedly for the same pattern. (It is normally compiling regexps that is slow, and regexpr is vectorized.) Not that I would call 300us `slow'.

> byRipley
> function(str,sub){
>   lp=attr(regexpr(paste(".*",sub,sep=""),str),'match.length')
>   return(substring(str, c(1, lp+1), c(lp,nchar(str))))
> }
>
> byJarek
> function(str,sub){
>   y = unlist(strsplit(str,sub))
>   return(cbind(paste(y[-length(y)], sub, sep="", collapse = ""),
>                y[length(y)]))
> }
>
> and a quick test:
>
> > system.time(for(i in 1:100000){byJarek(str,sub)})
> [1] 15.55 0.10 16.06 0.00 0.00
>
> > system.time(for(i in 1:100000){byRipley(str,sub)})
> [1] 30.28 0.07 31.86 0.00 0.00
>
> Baz
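The remark that regexpr is vectorized suggests calling it once on a whole character vector rather than once per string inside a loop. A rough sketch of what that might look like follows; split_at_last() is a hypothetical helper name, not something from the thread, and the handling of strings without a match (empty first piece, whole string as second piece) is an assumption about the desired behaviour.

```r
# Hypothetical helper: split every element of x after the last match of `sub`.
# `sub` is treated as a regular expression, as in the thread.
split_at_last <- function(x, sub) {
  # One regexpr() call handles the whole vector; the pattern is built once.
  m  <- regexpr(paste0(".*", sub), x)
  # match.length is -1 where nothing matched; clamp to 0 so that the first
  # piece becomes "" and the second piece is the unchanged string.
  lp <- pmax(attr(m, "match.length"), 0L)
  cbind(head = substring(x, 1, lp),
        tail = substring(x, lp + 1, nchar(x)))
}

x <- c("Chance favors the prepared mind", "no match", "eee")
res <- split_at_last(x, "e")
print(res)
```

Each row of the result corresponds to one input string, with the text up to and including the last match in the first column and the remainder in the second. Timings are not shown here because they depend on the machine and R version.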