Displaying 6 results from an estimated 6 matches for "namedcaptur".

2019 Feb 20
2
Bug: time complexity of substring is quadratic as string size and number of substrings increases
...many substrings to extract. For example, substring("AAAA", 1:4, 1:4) or, more generally, N=1000; substring(paste(rep("A", N), collapse=""), 1:N, 1:N). The problem I observe is that the time complexity is quadratic in N, as shown in this figure: https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png (source: https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R). I expected the time complexity to be linear in N. The example above may seem contrived/trivial, but it is indeed relevant to a number of packages (rex, rematch2,...
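The scaling claim in this snippet can be checked with a small base-R timing sketch. The helper name time_substring and the chosen sizes are illustrative assumptions, not taken from the original post. Whether quadratic growth shows up depends on the R version in use, so no expected timings are given.

```r
# Hypothetical helper (not from the original post): time the extraction of
# N one-character substrings from a string of N characters.
time_substring <- function(N) {
  subject <- paste(rep("A", N), collapse = "")
  elapsed <- system.time(chars <- substring(subject, 1:N, 1:N))[["elapsed"]]
  stopifnot(length(chars) == N)  # one substring per position
  elapsed
}

sizes <- c(1000, 2000, 4000, 8000)  # illustrative sizes
timings <- sapply(sizes, time_substring)

# Linear time: doubling N roughly doubles the timing.
# Quadratic time: doubling N roughly quadruples it.
print(data.frame(N = sizes, seconds = timings))
```

Doubling the sizes keeps the comparison easy to read: the ratio between successive rows is the quantity of interest, not the absolute timings.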
2019 Feb 22
1
Bug: time complexity of substring is quadratic as string size and number of substrings increases
On 2/20/19 7:55 PM, Toby Hocking wrote: > Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring. figure https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png source: https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R To me this is a clear indication of a bug in substring, but again it would be nice to have some feedback/confirmation before posting on bu...
2019 Feb 20
0
Bug: time complexity of substring is quadratic as string size and number of substrings increases
Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring. figure https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png source: https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R To me this is a clear indication of a bug in substring, but again it would be nice to have some feedback/confirmation before posting on bugzilla. Also this sugge...
2019 Feb 19
1
patch for gregexpr(perl=TRUE)
...tat.ethz.ch/pipermail/r-help/2008-October/178451.html I figured out the issue, which is fixed by changing 1 line of code in src/main/grep.c -- there is a strlen function call which is currently inside of the while loop over matches, and the patch moves it before the loop. https://github.com/tdhock/namedCapture-article/blob/master/linear-time-gregexpr-perl.patch I made some figures that show the quadratic time complexity before applying the patch, and the linear time complexity after applying the patch https://github.com/tdhock/namedCapture-article#19-feb-2019 I would have posted a bug report on bugs.r...
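The quadratic-versus-linear behaviour described in this snippet can be probed from R without touching the C source. The sketch below is a timing harness only; the patch itself lives in src/main/grep.c and is not reproduced here. The helper name time_gregexpr_perl, the pattern, and the sizes are illustrative assumptions, and the result depends on whether the installed R includes the fix, so no expected timings are shown.

```r
# Hypothetical helper: time gregexpr(perl = TRUE) on a subject of N characters
# in which every character is a match, so the match loop runs N times.
time_gregexpr_perl <- function(N) {
  subject <- paste(rep("a", N), collapse = "")
  elapsed <- system.time(
    matches <- gregexpr("a", subject, perl = TRUE)
  )[["elapsed"]]
  stopifnot(length(matches[[1]]) == N)  # one match per character
  elapsed
}

sizes <- c(2000, 4000, 8000, 16000)  # illustrative sizes
timings <- sapply(sizes, time_gregexpr_perl)

# If work proportional to the subject length is repeated for every match,
# doubling N roughly quadruples the timing; otherwise it roughly doubles.
print(data.frame(N = sizes, seconds = timings))
```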
2019 Aug 29
0
Feature request: non-dropping regmatches/strextract
If you want "to extract regex matches into a new column in a data.frame" then there are some package functions which do exactly that. Three examples are namedCapture::df_match_variable, rematch2::bind_re_match, and tidyr::extract. For a more detailed discussion see my R Journal submission (under review) about regular expression packages, https://raw.githubusercontent.com/tdhock/namedCapture-article/master/RJwrapper.pdf Comments/suggestions welcome. On Thu, Au...
2019 Aug 15
4
Feature request: non-dropping regmatches/strextract
A very common use case for regmatches is to extract regex matches into a new column in a data.frame (or data.table, etc.) or otherwise use the extracted strings alongside the input. However, the default behavior is to drop empty matches, which results in mismatches in column length if reassignment is done without subsetting. For consistency with other R functions and compatibility with this use
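The length mismatch described in this snippet can be reproduced in base R. The sketch below shows the dropping behaviour and one possible workaround that pads non-matches with NA. The input vector, the pattern, and the variable names are illustrative assumptions, and the workaround is not the interface proposed in the thread.

```r
x <- c("chr1:100", "no match here", "chr2:250")  # illustrative input, no NAs
pattern <- "chr[0-9]+"

m <- regexpr(pattern, x)     # -1 marks elements without a match
dropped <- regmatches(x, m)  # non-matching elements are dropped
length(dropped)              # 2, although length(x) is 3

# Workaround sketch: start from an all-NA vector of the input length and
# fill in only the positions that matched. Assumes x contains no NA values.
kept <- rep(NA_character_, length(x))
kept[m != -1] <- regmatches(x, m)

# The padded vector lines up with the input, so it can become a column.
df <- data.frame(input = x, match = kept)
print(df)
```

Assigning `dropped` directly as a column would fail here because it has two elements for three rows, whereas `kept` keeps one entry per input string.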