Hello,

Is there any way I can use the gregexpr function (or a different function) so that it also returns overlapping (i.e. non-disjoint) matches of a regular expression?

For instance, when running gregexpr("AAA","AAAAAA"), I get two matches, one at position 1 and one at position 4. I'd like to receive 4 matches, at positions 1, 2, 3 and 4.

Thanks,

Schraga Schwartz
Department of Human Molecular Genetics and Biochemistry,
Tel Aviv University Medical School,
Tel Aviv 69978, Israel.
Tel: +972-3-640 6894
email: schragas@post.tau.ac.il
Try this:

    gregexpr("A(?=AA)", "AAAAAA", perl = TRUE)

Read about zero-width lookahead assertions at ?regex.

On Mon, May 5, 2008 at 10:16 AM, Schraga Schwartz <Schragas at post.tau.ac.il> wrote:
> Is there any way I can use the gregexpr functions (or a different function)
> in a manner that will also return overlapping (i.e. non disjoint) regular
> expressions?
>
> For instance, when running gregexpr("AAA","AAAAAA"), I get two matches, one
> at position 1 and one at position 4. I'd like to receive 4 matches at
> positions 1, 2, 3 and 4.
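To make the lookahead suggestion concrete, here is a minimal sketch of reading the result back. The pattern consumes only the first "A" and merely checks that "AA" follows, so the next search resumes one character later. The variable names and the fixed width of 3 used to recover the matched text are assumptions for this particular example, not part of the original reply.

```r
# Overlapping occurrences of "AAA" found through a zero-width lookahead.
x <- "AAAAAA"

# "A(?=AA)" matches one "A" only where two more "A"s follow it.
m <- gregexpr("A(?=AA)", x, perl = TRUE)[[1]]

# Start positions of every (overlapping) occurrence.
starts <- as.integer(m)

# Reported match lengths cover only the consumed "A", not the lookahead part.
lens <- attr(m, "match.length")

# Recover the full text of each hit; the width 3 is assumed from the
# target "AAA" and has to be supplied by hand.
hits <- substring(x, starts, starts + 2)

print(starts)
print(lens)
print(hits)
```

Note that the reported match.length describes only the consumed character, so the length of the real target has to be tracked separately, as done for `hits` here.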
For such simple expressions you can use 'beheading': remove the current match with substr() and run regexpr() again. That does not work for some regexps which depend on e.g. word boundaries, though.

Alternatively you can modify the source code.

On Mon, 5 May 2008, Schraga Schwartz wrote:
> Is there any way I can use the gregexpr functions (or a different function)
> in a manner that will also return overlapping (i.e. non disjoint) regular
> expressions?
>
> For instance, when running gregexpr("AAA","AAAAAA"), I get two matches, one
> at position 1 and one at position 4. I'd like to receive 4 matches at
> positions 1, 2, 3 and 4.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
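One possible reading of the 'beheading' idea, sketched under assumptions: search with regexpr(), record the position, cut the string just past the first character of that match, and search again. The helper name find_overlapping() is hypothetical and not part of base R, and this is only one way to do the cutting.

```r
# Hypothetical helper (not in base R): start positions of all matches of
# `pattern` in `text`, including overlapping ones, by repeated truncation.
find_overlapping <- function(pattern, text, ...) {
  starts <- integer(0)
  offset <- 0L                          # characters already cut from the front
  while (offset < nchar(text)) {
    rest <- substr(text, offset + 1L, nchar(text))
    pos <- regexpr(pattern, rest, ...)  # first match in the remaining text
    if (pos[1] == -1L) break            # no further match
    start <- offset + as.integer(pos)   # position in the original string
    starts <- c(starts, start)
    offset <- start                     # drop everything up to the match start
  }
  starts
}

print(find_overlapping("AAA", "AAAAAA"))
print(find_overlapping("ABA", "ABABA"))
```

Because each search sees only the truncated remainder, patterns that depend on what precedes the match (anchors such as `^`, or word boundaries) can behave differently than they would on the full string, which is the limitation mentioned in the reply.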