Displaying 20 results from an estimated 63 matches for "regmatch".
2019 Aug 15
4
Feature request: non-dropping regmatches/strextract
A very common use case for regmatches is to extract regex matches into a new column in a data.frame (or data.table, etc.) or otherwise use the extracted strings alongside the input. However, the default behavior is to drop empty matches, which results in mismatches in column length if reassignment is done without subsetting.
For con...
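A minimal sketch of the length mismatch described in this request, not code from the thread. The vector `x` and the names `dropped` and `kept` are made up for illustration; the NA pre-fill is just one possible workaround.

```r
# Hypothetical input: the middle element contains no digits.
x <- c("apple 12", "banana", "cherry 7")
m <- regexpr("[0-9]+", x)

# Default behavior: the non-matching element is dropped,
# so the result has length 2 rather than length(x) == 3.
dropped <- regmatches(x, m)

# Non-dropping workaround: pre-fill with NA and assign only
# at the positions where regexpr() reported a match.
kept <- rep(NA_character_, length(x))
kept[as.integer(m) != -1L] <- regmatches(x, m)
```

With this workaround `kept` has the same length as `x`, so it can be assigned as a new data.frame column without subsetting.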
2019 Aug 15
0
Feature request: non-dropping regmatches/strextract
Changing the default behavior of regmatches would break its use with gregexpr, where the number of matches per input element varies, so a zero-length character vector makes more sense than NA_character_.
> x <- c("John Doe", "e e cummings", "Juan de la Madrid")
> m <- gregexpr("[A-Z]", x...
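The snippet is cut off here, so the following is only a sketch of the point being made, reusing the same `x` and pattern. The name `res` is an assumption and the rest of the original session is not reproduced.

```r
x <- c("John Doe", "e e cummings", "Juan de la Madrid")
m <- gregexpr("[A-Z]", x)

# With gregexpr() the result is a list with one element per input string.
# An input with no match keeps its slot as character(0),
# so nothing is dropped and the list stays aligned with x.
res <- regmatches(x, m)
lengths(res)
```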
2019 Aug 29
0
Feature request: non-dropping regmatches/strextract
...(under review) about regular expression packages,
https://raw.githubusercontent.com/tdhock/namedCapture-article/master/RJwrapper.pdf
Comments/suggestions welcome.
On Thu, Aug 15, 2019 at 12:15 AM Cyclic Group Z_1 via R-devel <
r-devel at r-project.org> wrote:
> A very common use case for regmatches is to extract regex matches into a
> new column in a data.frame (or data.table, etc.) or otherwise use the
> extracted strings alongside the input. However, the default behavior is to
> drop empty matches, which results in mismatches in column length if
> reassignment is done without...
2019 Sep 02
2
Feature request: non-dropping regmatches/strextract
I think that's a good reason for not including this in regmatches; you're right, its name is somewhat suggestive of yielding matches. Also, that sounds like a great design for strcapture with an atomic prototype.
Best,
CG
2012 Aug 21
7
Regular Expressions in grep
Dear r-help members,
I have a number in the form of a string, say:
a<-"-01020.909200"
I'd like to extract "1020." as well as ".9092"
Front<-grep(pattern="[1-9]+[0-9]*\\.", value=TRUE, x=a, fixed=FALSE)
End<-grep(pattern="\\.[0-9]*[1-9]+", value=TRUE, x=a, fixed=FALSE)
However, both strings give "-01020.909200", exactly
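grep(..., value=TRUE) returns the whole element that contains a match, not the matched substring, which is why both calls give the full string. One way to pull out just the matched part, shown here as a sketch and not necessarily the answer given in the thread, is to combine regexpr() with regmatches(). The names `front` and `end_part` are illustrative.

```r
a <- "-01020.909200"

# regexpr() locates the first match; regmatches() extracts that substring.
front    <- regmatches(a, regexpr("[1-9]+[0-9]*\\.", a))
end_part <- regmatches(a, regexpr("\\.[0-9]*[1-9]+", a))

front     # "1020."
end_part  # ".9092"
```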
2019 Aug 29
2
Feature request: non-dropping regmatches/strextract
Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches both for sake of consistency with the rest of the language (missing data does not yield a dropped index in other sorts of R functions, and an empty match conceptually corresponds with missing data) and facil...
2019 Sep 02
0
Feature request: non-dropping regmatches/strextract
After some discussion within R core, we decided that a "nomatch"
argument on regmatches() may be a good initial step. We might add a
new function later that combines the regexpr() and regmatches() steps.
The gregexpr() and regexec() inputs are both lists so it's not clear
whether a "nomatch" value would be relevant (the elements are empty)
in those cases.
On Mon, Sep...
2019 Aug 29
0
Feature request: non-dropping regmatches/strextract
...chael
On Thu, Aug 29, 2019 at 2:20 PM Cyclic Group Z_1 via R-devel
<r-devel at r-project.org> wrote:
>
> Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches both for sake of consistency with the rest of the language (missing data does not yield a dropped index in other sorts of R functions, and an empty match conceptually corresponds with missing data) and facil...
2019 Aug 15
1
Feature request: non-dropping regmatches/strextract
...s=character(),
stringsAsFactors=FALSE))
     Name             Address
1 Groucho groucho at marx.com
2           chico at marx.com
3   Harpo
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Aug 15, 2019 at 1:04 PM William Dunlap <wdunlap at tibco.com> wrote:
> I don't care much for regmatches and haven't tried strextract, but I think
> replacing the character(0) by NA_character_ is almost always inappropriate
> if the match information comes from gregexpr.
>
> I think strcapture() does a pretty good job of what I think you are trying
> to do. Perhaps adding an argu...
2019 Aug 29
2
Feature request: non-dropping regmatches/strextract
Thank you! I greatly appreciate your consideration, though of course it is up to you. I think many people switch to stringr/stringi simply because functions in those packages have some consistent design choices, for example, they do not drop empty/missing matches, which facilitates array-based programming. For example, in the cases where one needs to make a new column in a data.frame (data.table,
2019 Aug 15
0
Feature request: non-dropping regmatches/strextract
I don't care much for regmatches and haven't tried strextract, but I think
replacing the character(0) by NA_character_ is almost always inappropriate
if the match information comes from gregexpr.
I think strcapture() does a pretty good job of what I think you are trying
to do. Perhaps adding an argument to map no match to...
2019 Aug 30
0
Feature request: non-dropping regmatches/strextract
Just started thinking about this. The name of regmatches() suggests
that it will only extract the matches but not return anything for the
non-matches. We might need another function that returns a value for
non-matches. Perhaps the value should be the empty string for
non-matches and NA for matches to NA. The rationale is that we
delegate to regexpr()...
2009 Feb 24
8
[Bug 20298] New: Nouveau doesn't allow my modeline because of hardcoded value
http://bugs.freedesktop.org/show_bug.cgi?id=20298
Summary: Nouveau doesn't allow my modeline because of hardcoded
value
Product: xorg
Version: git
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: Driver/nouveau
AssignedTo: nouveau at
2019 Aug 15
2
Feature request: non-dropping regmatches/strextract
I do think keeping the default behavior is desirable for backwards compatibility; my suggestion is not to change default behavior but to add an optional argument that allows a different behavior. Although this can be implemented in a user-defined function, retaining empty matches facilitates programmatic use, and seems to be something that should be available in base R. It is available, for
2007 Jun 24
2
problem gsub in the locale of CP932 and SJIS (PR#9751)
...and SJIS.
The inconvenient character code which used 0x5c after the first byte.
--- R-2.5.0.orig/src/main/character.c 2007-04-03 11:05:05.000000000 +0900
+++ R-2.5.0/src/main/character.c 2007-06-24 22:31:06.000000000 +0900
@@ -986,6 +986,17 @@
char *p = repl;
n = strlen(repl) - (regmatch[0].rm_eo - regmatch[0].rm_so);
while (*p) {
+#ifdef SUPPORT_MBCS
+ if(mbcslocale){
+ int clen;
+ mbstate_t mb_st;
+ mbs_init(&mb_st);
+ if((clen = Mbrtowc(NULL, p, MB_CUR_MAX, &mb_st)) > 1){
+ p+=clen;
+ cont...
2000 Feb 07
4
Segmentation fault, devPS.c, 0.99.0 (PR#413)
Full_Name: Roger Bivand
Version: 0.99.0
OS: RH Linux 6.1
Submission from: (NULL) (158.37.60.152)
I am working on an interface between R and the GRASS geographical information system, written in R, with no dynamically loaded code. I have written full examples, and tested them under R 0.90.1, both by entering example() for each function and R CMD check, both of which worked without problem.
Under
2002 May 08
0
embedded R regexec returning nonsense
...y interaction between the R regex functions and something in
python.
With R 1.4.1, R starts up smoothly under rpy and RSPython and only certain
functions elicit the error. For instance, doing "bitmap('file.bmp')" will
cause a segfault in do_strsplit (character.c:260) because the regmatch
structure contains the nonsense values:
(gdb) print regmatch[0]
$1 = {rm_so = 9257728, rm_eo = 9257729}
With R 1.5.0, R fails to complete startup initialization under both rpy and
RSPython, segfaulting in do_readDCF (dcf.c:109) with similar nonsense values
in the regmatch structure:
(gdb) pr...
2012 Nov 02
2
backreferences in gregexpr
Hi Folks,
I'm trying to extract just the backreferences from a regex.
> temp = "abcd1234abcd1234"
> regmatches(temp, gregexpr("(?:abcd)(1234)", temp))
[[1]]
[1] "abcd1234" "abcd1234"
What I would like is:
[1] "1234" "1234"
Note: I know I can just match 1234 here, but the actual example is
complicated enough that I have to match a larger string, and just...
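One possible approach, sketched here under the assumption that the full pattern can be reapplied to each match (this is not necessarily what the thread settled on): extract the full matches first, then use sub() with a backreference to keep only the captured group. The names `full` and `groups` are illustrative.

```r
temp <- "abcd1234abcd1234"
pattern <- "(?:abcd)(1234)"

# Step 1: all full matches of the larger pattern.
full <- regmatches(temp, gregexpr(pattern, temp, perl = TRUE))[[1]]

# Step 2: reduce each full match to its first capture group.
groups <- sub(pattern, "\\1", full, perl = TRUE)
groups
```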
2012 Sep 24
5
Memory usage in R grows considerably while calculating word frequencies
...o do it.
R program:
# Read in the entire file and convert all words in text to lower case
words.txt<-tolower(scan("text_file","character",sep="\n"))
# Extract words
pattern <- "(\\b[A-Za-z]+\\b)"
match <- gregexpr(pattern,words.txt)
words.txt <- regmatches(words.txt,match)
# Create a vector from the list of words
words.txt<-unlist(words.txt)
# Calculate word frequencies
words.txt<-table(words.txt,dnn="words")
# Sort by frequency, not alphabetically
words.txt<-sort(words.txt,decreasing=TRUE)
# Put into some readable form, "...
2008 Feb 19
8
[Bug 14567] New: Randr 1.2 fails on nv17 lvds in a Dell Inspiron 8100 ( continued from 14491)
http://bugs.freedesktop.org/show_bug.cgi?id=14567
Summary: Randr 1.2 fails on nv17 lvds in a Dell Inspiron 8100
(continued from 14491)
Product: xorg
Version: unspecified
Platform: Other
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: Driver/nouveau