thr3ads.net - similar to: "Feature request: non-dropping regmatches/strextract"

Displaying 20 results from an estimated 1000 matches similar to: "Feature request: non-dropping regmatches/strextract"

Feature request: non-dropping regmatches/strextract

2019 Aug 29

Feature request: non-dropping regmatches/strextract

If you want "to extract regex matches into a new column in a data.frame" then there are some package functions which do exactly that. Three examples are namedCapture::df_match_variable, rematch2::bind_re_match, and tidyr::extract. For a more detailed discussion see my R journal submission (under review) about regular expression packages,

Feature request: non-dropping regmatches/strextract

2019 Aug 29

Feature request: non-dropping regmatches/strextract

Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches both for the sake of consistency with the rest of the language (missing data does not yield a dropped index in other
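The dropping behavior described in this message can be sketched in base R as follows. This is only an illustration: the helper `extract_keep` and its `nomatch` argument are hypothetical names invented for this example and are not part of base R.

```r
x <- c("apple 12", "banana", "cherry 7")
m <- regexpr("[0-9]+", x)

# regmatches() with regexpr() output returns only the matched elements,
# so the result is shorter than x when some elements do not match.
regmatches(x, m)

# Hypothetical helper (an assumption of this sketch, not a base R function):
# returns one value per input element, using `nomatch` where nothing matched.
extract_keep <- function(pattern, x, nomatch = NA_character_, ...) {
  m <- regexpr(pattern, x, ...)
  out <- rep(nomatch, length(x))
  hit <- !is.na(m) & m != -1L   # regexpr() gives -1 for no match, NA for NA input
  out[hit] <- regmatches(x, m)
  out
}

extract_keep("[0-9]+", x)
```

Because the result has the same length as the input, it can be assigned directly to a new data.frame column.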

Feature request: non-dropping regmatches/strextract

2019 Aug 15

Feature request: non-dropping regmatches/strextract

Changing the default behavior of regmatches would break its use with gregexpr, where the number of matches per input element varies, so a zero-length character vector makes more sense than NA_character_.
> x <- c("John Doe", "e e cummings", "Juan de la Madrid")
> m <- gregexpr("[A-Z]", x)
> regmatches(x,m)
[[1]]
[1] "J"
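A small sketch of the point being made about gregexpr: the result is a list with one element per input, and each element can hold zero or more matches. The pattern here uses `[[:upper:]]` rather than `[A-Z]`; that substitution is an assumption of this sketch, made to avoid locale-dependent character ranges.

```r
x <- c("John Doe", "e e cummings", "Juan de la Madrid")

# gregexpr() finds all matches per element, so regmatches() returns a list.
m <- gregexpr("[[:upper:]]", x)
res <- regmatches(x, m)

# The number of matches differs per input element.
n_matches <- lengths(res)

# An element with no match is a zero-length character vector, not NA.
no_match <- res[[2]]
```

Since each list element already has a natural "empty" value, substituting NA_character_ would make an unmatched input look like it had one (missing) match.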

Feature request: non-dropping regmatches/strextract

2019 Aug 15

Feature request: non-dropping regmatches/strextract

I do think keeping the default behavior is desirable for backwards compatibility; my suggestion is not to change default behavior but to add an optional argument that allows a different behavior. Although this can be implemented in a user-defined function, retaining empty matches facilitates programmatic use, and seems to be something that should be available in base R. It is available, for

Feature request: non-dropping regmatches/strextract

2019 Aug 29

Feature request: non-dropping regmatches/strextract

Thank you! I greatly appreciate your consideration, though of course it is up to you. I think many people switch to stringr/stringi simply because functions in those packages have some consistent design choices, for example, they do not drop empty/missing matches, which facilitates array-based programming. For example, in the cases where one needs to make a new column in a data.frame (data.table,

Feature request: non-dropping regmatches/strextract

2019 Aug 15

Feature request: non-dropping regmatches/strextract

Using a non-capturing group, "(?:...)" instead of "(...)", simplifies my example a bit
> x <- c("Groucho <groucho at marx.com>", "<chico at marx.com>", "Harpo")
> strcapture("([[:alpha:]]+)?(?: *<([[:alpha:]. ]+@[[:alpha:]. ]+)>)?", x, proto=data.frame(Name=character(), Address=character(),
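For readers unfamiliar with utils::strcapture, here is a simplified sketch of how it returns one row per input element. It does not reproduce the truncated example above: the pattern is a plainer one chosen for illustration, and the input strings and addresses are made up.

```r
# Illustrative inputs (assumed for this sketch, not taken from the thread).
x <- c("Groucho <groucho@example.com>", "Harpo")

# The prototype defines the names and types of the output columns.
proto <- data.frame(Name = character(), Address = character(),
                    stringsAsFactors = FALSE)

# One capture group per prototype column.
res <- strcapture("([[:alpha:]]+) <([^>]+)>", x, proto = proto)

# res has one row per element of x; the row for the non-matching
# "Harpo" is filled with NA rather than being dropped.
res
```

This row-per-input behavior is what makes strcapture convenient for building data.frame columns from regex captures.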

Feature request: non-dropping regmatches/strextract

2019 Sep 02

Feature request: non-dropping regmatches/strextract

I think that's a good reason for not including this in regmatches; you're right, its name is somewhat suggestive of yielding matches. Also, that sounds like a great design for strcapture with an atomic prototype. Best, CG

Feature request: non-dropping regmatches/strextract

2019 Aug 29

Feature request: non-dropping regmatches/strextract

I'd be happy to entertain patches or at least more specific suggestions to improve strextract() and strcapture(). I hadn't exported strextract(), because I wasn't quite sure how it should behave. This feedback should be helpful.
Thanks,
Michael
On Thu, Aug 29, 2019 at 2:20 PM Cyclic Group Z_1 via R-devel <r-devel at r-project.org> wrote:
> > Thank you, I am aware that

Feature request: non-dropping regmatches/strextract

2019 Aug 30

Feature request: non-dropping regmatches/strextract

Just started thinking about this. The name of regmatches() suggests that it will only extract the matches but not return anything for the non-matches. We might need another function that returns a value for non-matches. Perhaps the value should be the empty string for non-matches and NA for matches to NA. The rationale is that we delegate to regexpr() (at least conceptually), and it returns a

Feature request: non-dropping regmatches/strextract

2019 Aug 15

Feature request: non-dropping regmatches/strextract

I don't care much for regmatches and haven't tried strextract, but I think replacing the character(0) by NA_character_ is almost always inappropriate if the match information comes from gregexpr. I think strcapture() does a pretty good job of what I think you are trying to do. Perhaps adding an argument to map no match to NA instead of "" would give you just what you wanted.

Feature request: non-dropping regmatches/strextract

2019 Sep 02

Feature request: non-dropping regmatches/strextract

After some discussion within R core, we decided that a "nomatch" argument on regmatches() may be a good initial step. We might add a new function later that combines the regexpr() and regmatches() steps. The gregexpr() and regexec() inputs are both lists so it's not clear whether a "nomatch" value would be relevant (the elements are empty) in those cases. On Mon, Sep 2,

error handling in strcapture

2016 Oct 04

error handling in strcapture

I noticed a problem in the strcapture from R-devel (2016-09-27 r71386), when the text contains a missing value and perl=TRUE.
{ # NA in text input should map to row of NA's in output, without warning
r9p <- strcapture(perl = TRUE, "(.).* ([[:digit:]]+)", c("One 1", NA, "Fifty 50"), data.frame(Initial=factor(), Number=numeric()))
e9p <-

error handling in strcapture

2016 Oct 04

error handling in strcapture

It is also not catching the cases where the number of capture expressions does not match the number of entries in proto. I think all of the following should give an error about the mismatch.
> strcapture("(.)(.)", c("ab", "cde", "fgh", "ij", "lm"), proto=list(A="",B="",C=""))
  A B  C
1 a b cd
2 d

error handling in strcapture

2016 Sep 21

error handling in strcapture

If there are any matches then strcapture can see if the pattern has the same number of capture expressions as the prototype has columns and give an error if not. That seems appropriate. If there are no matches, then there is no easy way to see if the prototype is compatible with the pattern, so should strcapture just assume the best and fill in the prototype with NA's? Should there be

Extracting numeric part from a string

2017 Aug 02

Extracting numeric part from a string

Hi again, I am struggling to extract the numeric part from the string below: "\"cm_ffm\":\"563.77\"" Basically, I need to extract 563.77 from the above. The underlying number can be a whole number, and there could be a comma separator as well. So far I have tried the following:
> library(stringr)
> str_extract("\"cm_ffm\":\"563.77\"",

Question

2019 Sep 23

Question

Good afternoon, everyone: I had R version 3.6 and was using the pdftools package to extract information from PDF files. I updated to version 3.6.1 and it no longer recognizes the package. Can anyone help me? It practically does not recognize the pdftools functions. library(pdftools) library(stringr) library(NLP) library(tm) library(tesseract) library(magick)

Sorting a Data Frame by hybrid string / number key

2011 Feb 03

Sorting a Data Frame by hybrid string / number key

Hi, I'm trying to present a table of some experimental data, and I want to order the rows by the instance names. The issue I've got is that there are a variety of conventions for the instance names (e.g. competition01, competition13, small_1, big_20, med_9). What I want is to be able to sort them first in category order, so: competition < small < med < big, and then perform the

Question

2019 Sep 24

Question

Emilio, now when I try to install the packages pdftools, magick, and others, I get the following error: WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding: https://cran.rstudio.com/bin/windows/Rtools/ Installing package into ‘C:/Users/bdominguez/Documents/R/win-library/3.6’ (as ‘lib’

Working with string

2011 Jul 07

Working with string

Hi there, I have to extract some relevant portion from a defined string, which is a mix of numbers and characters. The string follows this sequence: some string - some number - "c/C" (or "p/P") - then again some set of numbers. Examples of such strings are "fdahsdfcha163517253c463278643" or "fdahsdfcha163517253C463278643" or

Help with stringr

2013 Jul 15

Help with stringr

Hello everyone. I am a bit clumsy at handling text strings, so I am asking for your help. I have a text vector like this:
datos$tipo
 [1] m.1.p.Álava     m.1.p.Albacete
 [3] m.2.p.Alicante  m.1.p.Almería
 [5] m.3.p.Asturias  m.1.p.Ávila
 [7] m.1.p.Badajoz   m.1.p.Baleares (Illes)
 [9] m.1.p.Barcelona m.1.p.Burgos
[11] m.1.p.Cáceres   m.1.p.Cádiz
And I want to extract the
