search for: regmatch

Displaying 20 results from an estimated 63 matches for "regmatch".

2019 Aug 15
4
Feature request: non-dropping regmatches/strextract
A very common use case for regmatches is to extract regex matches into a new column in a data.frame (or data.table, etc.) or otherwise use the extracted strings alongside the input. However, the default behavior is to drop empty matches, which results in mismatches in column length if reassignment is done without subsetting. For con...
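For illustration, the length mismatch described in this post can be reproduced with a few lines of base R. This is only a sketch: the data frame `df`, its columns, and the pattern are made up for the example and do not come from the thread.

```r
# Hypothetical example data (not from the original post)
df <- data.frame(id = c("apple 12", "banana", "cherry 7"),
                 stringsAsFactors = FALSE)

m <- regexpr("[0-9]+", df$id)   # -1 marks elements with no match
hits <- regmatches(df$id, m)    # the non-matching element is dropped

length(hits)                    # shorter than nrow(df)

# Assigning hits straight to a new column would fail on length,
# so fill with NA first and assign only where a match was found.
df$num <- NA_character_
df$num[m != -1] <- hits
```

The subsetting step at the end is the extra work the request is trying to avoid.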
2019 Aug 15
0
Feature request: non-dropping regmatches/strextract
Changing the default behavior of regmatches would break its use with gregexpr, where the number of matches per input element varies, so a zero-length character vector makes more sense than NA_character_. > x <- c("John Doe", "e e cummings", "Juan de la Madrid") > m <- gregexpr("[A-Z]", x...
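A short sketch of the point about gregexpr, reusing the input vector quoted in the snippet. The variable name `caps` is an assumption added here; the rest of the original example is truncated and is not reconstructed.

```r
x <- c("John Doe", "e e cummings", "Juan de la Madrid")
m <- gregexpr("[A-Z]", x)

# With gregexpr() input, regmatches() returns a list with one element
# per input string, so the result always lines up with x.
caps <- regmatches(x, m)

lengths(caps)   # number of capital letters found in each string
caps[[2]]       # no match: a zero-length character vector, not NA
```

Because the result is already a list of the same length as `x`, an empty element is a natural way to say "no matches here".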
2019 Aug 29
0
Feature request: non-dropping regmatches/strextract
...(under review) about regular expression packages, https://raw.githubusercontent.com/tdhock/namedCapture-article/master/RJwrapper.pdf Comments/suggestions welcome. On Thu, Aug 15, 2019 at 12:15 AM Cyclic Group Z_1 via R-devel < r-devel at r-project.org> wrote: > A very common use case for regmatches is to extract regex matches into a > new column in a data.frame (or data.table, etc.) or otherwise use the > extracted strings alongside the input. However, the default behavior is to > drop empty matches, which results in mismatches in column length if > reassignment is done without...
2019 Sep 02
2
Feature request: non-dropping regmatches/strextract
I think that's a good reason for not including this in regmatches; you're right, its name is somewhat suggestive of yielding matches. Also, that sounds like a great design for strcapture with an atomic prototype. Best, CG
2012 Aug 21
7
Regular Expressions in grep
Dear r-help members, I have a number in the form of a string, say: a<-"-01020.909200" I'd like to extract "1020." as well as ".9092" Front<-grep(pattern="[1-9]+[0-9]*\\.", value=TRUE, x=a, fixed=FALSE) End<-grep(pattern="\\.[0-9]*[1-9]+", value=TRUE, x=a, fixed=FALSE) However, both strings give "-01020.909200", exactly
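One possible way to get the two pieces, sketched here with the patterns from the question: grep(value=TRUE) returns the whole element that contains a match, whereas regexpr() combined with regmatches() returns only the matched substring. The names `front` and `end_part` are chosen for this sketch.

```r
a <- "-01020.909200"

# grep(..., value = TRUE) returns the full element, which is why
# both calls in the question gave back "-01020.909200".
grep("[1-9]+[0-9]*\\.", a, value = TRUE)

# regexpr() locates the first match; regmatches() extracts just that part.
front    <- regmatches(a, regexpr("[1-9]+[0-9]*\\.", a))
end_part <- regmatches(a, regexpr("\\.[0-9]*[1-9]+", a))

front
end_part
```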
2019 Aug 29
2
Feature request: non-dropping regmatches/strextract
Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches both for sake of consistency with the rest of the language (missing data does not yield a dropped index in other sorts of R functions, and an empty match conceptually corresponds with missing data) and facil...
2019 Sep 02
0
Feature request: non-dropping regmatches/strextract
After some discussion within R core, we decided that a "nomatch" argument on regmatches() may be a good initial step. We might add a new function later that combines the regexpr() and regmatches() steps. The gregexpr() and regexec() inputs are both lists so it's not clear whether a "nomatch" value would be relevant (the elements are empty) in those cases. On Mon, Sep...
2019 Aug 29
0
Feature request: non-dropping regmatches/strextract
...chael On Thu, Aug 29, 2019 at 2:20 PM Cyclic Group Z_1 via R-devel <r-devel at r-project.org> wrote: > > Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches both for sake of consistency with the rest of the language (missing data does not yield a dropped index in other sorts of R functions, and an empty match conceptually corresponds with missing data) and facil...
2019 Aug 15
1
Feature request: non-dropping regmatches/strextract
...s=character(), stringsAsFactors=FALSE)) Name Address 1 Groucho groucho at marx.com 2 chico at marx.com 3 Harpo Bill Dunlap TIBCO Software wdunlap tibco.com On Thu, Aug 15, 2019 at 1:04 PM William Dunlap <wdunlap at tibco.com> wrote: > I don't care much for regmatches and haven't tried strextract, but I think > replacing the character(0) by NA_character_ is almost always inappropriate > if the match information comes from gregexpr. > > I think strcapture() does a pretty good job of what I think you are trying > to do. Perhaps adding an argu...
2019 Aug 29
2
Feature request: non-dropping regmatches/strextract
Thank you! I greatly appreciate your consideration, though of course it is up to you. I think many people switch to stringr/stringi simply because functions in those packages have some consistent design choices, for example, they do not drop empty/missing matches, which facilitates array-based programming. For example, in the cases where one needs to make a new column in a data.frame (data.table,
2019 Aug 15
0
Feature request: non-dropping regmatches/strextract
I don't care much for regmatches and haven't tried strextract, but I think replacing the character(0) by NA_character_ is almost always inappropriate if the match information comes from gregexpr. I think strcapture() does a pretty good job of what I think you are trying to do. Perhaps adding an argument to map no match to...
2019 Aug 30
0
Feature request: non-dropping regmatches/strextract
Just started thinking about this. The name of regmatches() suggests that it will only extract the matches but not return anything for the non-matches. We might need another function that returns a value for non-matches. Perhaps the value should be the empty string for non-matches and NA for matches to NA. The rationale is that we delegate to regexpr()...
2009 Feb 24
8
[Bug 20298] New: Nouveau doesn't allow my modeline because of hardcoded value
http://bugs.freedesktop.org/show_bug.cgi?id=20298 Summary: Nouveau doesn't allow my modeline because of hardcoded value Product: xorg Version: git Platform: All OS/Version: All Status: NEW Severity: normal Priority: medium Component: Driver/nouveau AssignedTo: nouveau at
2019 Aug 15
2
Feature request: non-dropping regmatches/strextract
I do think keeping the default behavior is desirable for backwards compatibility; my suggestion is not to change default behavior but to add an optional argument that allows a different behavior. Although this can be implemented in a user-defined function, retaining empty matches facilitates programmatic use, and seems to be something that should be available in base R. It is available, for
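The user-defined function mentioned in this post might look roughly like the sketch below. The function name `extract_first` and its `nomatch` parameter are hypothetical, invented for this example; they are not part of base R's regmatches() interface.

```r
# Hypothetical helper: returns one value per input element,
# using `nomatch` where the pattern does not match.
extract_first <- function(pattern, x, nomatch = NA_character_, ...) {
  m <- regexpr(pattern, x, ...)
  out <- rep(nomatch, length(x))
  hit <- !is.na(m) & m != -1      # elements where a match was found
  out[hit] <- regmatches(x, m)    # regmatches() returns only the hits
  out
}

extract_first("[0-9]+", c("a1", "b", "c22"))
```

Since the result has the same length as the input, it can be assigned directly to a new column.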
2007 Jun 24
2
problem gsub in the locale of CP932 and SJIS (PR#9751)
...and SJIS. The inconvenient character code which used 0x5c after the first byte. --- R-2.5.0.orig/src/main/character.c 2007-04-03 11:05:05.000000000 +0900 +++ R-2.5.0/src/main/character.c 2007-06-24 22:31:06.000000000 +0900 @@ -986,6 +986,17 @@ char *p = repl; n = strlen(repl) - (regmatch[0].rm_eo - regmatch[0].rm_so); while (*p) { +#ifdef SUPPORT_MBCS + if(mbcslocale){ + int clen; + mbstate_t mb_st; + mbs_init(&mb_st); + if((clen = Mbrtowc(NULL, p, MB_CUR_MAX, &mb_st)) > 1){ + p+=clen; + cont...
2000 Feb 07
4
Segmentation fault, devPS.c, 0.99.0 (PR#413)
Full_Name: Roger Bivand Version: 0.99.0 OS: RH Linux 6.1 Submission from: (NULL) (158.37.60.152) I am working on an interface between R and the GRASS geographical information system, written in R, with no dynamically loaded code. I have written full examples, and tested them under R 0.90.1, both by entering example() for each function and R CMD check, both of which worked without problem. Under
2002 May 08
0
embedded R regexec returning nonsense
...y interaction between the R regex functions and something in python. With R 1.4.1, R starts up smoothly under rpy and RSPython and only certain functions elicit the error. For instance, doing "bitmap('file.bmp')" will cause a segfault in do_strsplit (character.c:260) because the regmatch structure contains the nonsense values: (gdb) print regmatch[0] $1 = {rm_so = 9257728, rm_eo = 9257729} With R 1.5.0, R fails to complete startup initialization under both rpy and RSPython, segfaulting in do_readDCF (dcf.c:109) with similar nonsense values in the regmatch structure: (gdb) pr...
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to match a larger string, and just...
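One way this could be approached in base R, sketched under the assumption that PCRE is acceptable: with perl = TRUE, gregexpr() attaches "capture.start" and "capture.length" attributes, and these can be passed to substring() to pull out only the captured group. The variable names are chosen for this sketch, and this is not necessarily the answer given in the thread.

```r
temp <- "abcd1234abcd1234"

# perl = TRUE makes gregexpr() record the position of each capture group.
m <- gregexpr("(?:abcd)(1234)", temp, perl = TRUE)[[1]]

# One row per match, one column per capturing group; (?:...) does not capture.
starts <- attr(m, "capture.start")[, 1]
lens   <- attr(m, "capture.length")[, 1]

groups <- substring(temp, starts, starts + lens - 1)
groups
```

The same indexing works for patterns with several capture groups by selecting a different column.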
2012 Sep 24
5
Memory usage in R grows considerably while calculating word frequencies
...o do it. R program: # Read in the entire file and convert all words in text to lower case words.txt<-tolower(scan("text_file","character",sep="\n")) # Extract words pattern <- "(\\b[A-Za-z]+\\b)" match <- gregexpr(pattern,words.txt) words.txt <- regmatches(words.txt,match) # Create a vector from the list of words words.txt<-unlist(words.txt) # Calculate word frequencies words.txt<-table(words.txt,dnn="words") # Sort by frequency, not alphabetically words.txt<-sort(words.txt,decreasing=TRUE) # Put into some readable form, "...
2008 Feb 19
8
[Bug 14567] New: Randr 1.2 fails on nv17 lvds in a Dell Inspiron 8100 ( continued from 14491)
http://bugs.freedesktop.org/show_bug.cgi?id=14567 Summary: Randr 1.2 fails on nv17 lvds in a Dell Inspiron 8100 (continued from 14491) Product: xorg Version: unspecified Platform: Other OS/Version: All Status: NEW Severity: normal Priority: medium Component: Driver/nouveau