search for: gregexpr

Displaying 20 results from an estimated 139 matches for "gregexpr".

2006 May 06
2
regular expression change in R version 2.3.0?
The interpretation of regular expressions with repetition quantifiers in the 'gregexpr' function seems to have changed between R Version 2.2.0 and 2.3.0. The 'gsub' function, however, gives the same results in R Versions 2.2.0 and 2.3.0. Below is an example that demonstrates the version differences of the 'gregexpr' function. I am not sure whether this new beh...
2007 Oct 10
4
gregexpr (PR#9965)
Full_Name: Peter Dolan Version: 2.5.1 OS: Windows Submission from: (NULL) (128.193.227.43) gregexpr does not find all matching substrings if the substrings overlap: > gregexpr("abab","ababab") [[1]] [1] 1 attr(,"match.length") [1] 4 It does work correctly in Version 2.3.1 under linux.
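One possible workaround for the overlapping case in the report above, sketched under the assumption that a PCRE lookahead is acceptable (this is not taken from the thread): wrapping the pattern in `(?=...)` makes every match zero-width, so the search resumes one character later and overlapping start positions are reported. The variable names are illustrative only.

```r
# Sketch only: find overlapping occurrences by matching a zero-width lookahead.
# perl = TRUE is required because lookahead is a PCRE feature.
m <- gregexpr("(?=abab)", "ababab", perl = TRUE)

# Start position of every occurrence, overlaps included.
# The reported match.length is 0 because the lookahead consumes no characters.
starts <- as.integer(m[[1]])
print(starts)
```

Since the match length is reported as zero, the length of the pattern has to be tracked separately if the matched text itself is needed.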
2011 Aug 17
2
question regarding gregexpr and read.table
Hi, I have a silly question regarding the usage of two commands: read.table and gregexpr: For read.table, if I read a matrix and set header = T, I found that all the dashes ("-") become dots (".") A = read.table("Matrix.txt", sep = "\t", header = F) A[1,1] # "A-B-C-D". A = read.table("Matrix.txt", sep = "\t", heade...
2006 Nov 07
1
Gregexpr - extract results with lapply
Hello, I need to extract sequences of three upper case letters in a string. In other words, in this string: str <-c("ABC", "this WOUld be gOOD") The result I'm looking for is ABC WOU OOD. With gregexpr, I can get the position and lengt...
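A minimal sketch of one way to get the substrings asked for above, assuming a version of R that provides `regmatches()` (it did not exist when this question was posted, so this is not the thread's own answer). The names `hits` and `triples` are illustrative.

```r
str <- c("ABC", "this WOUld be gOOD")

# [[:upper:]]{3} matches exactly three consecutive upper-case letters.
hits <- gregexpr("[[:upper:]]{3}", str)

# regmatches() turns the positions and lengths returned by gregexpr()
# into the matched substrings, one list element per input string.
triples <- unlist(regmatches(str, hits))
print(triples)
```

The POSIX class `[[:upper:]]` is used instead of `[A-Z]` because the interpretation of character ranges can depend on the locale.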
2009 Feb 25
1
Using gregexpr with multiple search elements
Dear list, I am trying to use gregexpr to see if entries in a dataframe have either of two possible values for a string. here's an example text<-c("fat", "rat", "cat", "dog", "log", "fish") If I just wanted to find if any one of the elements in text match the patter...
2010 Jul 08
2
strsplit("dia ma", "\\b") splits characterwise
...) [[1]] [1] "d" "i" "a" " " "m" "a" > strsplit("dia ma", "\\b", perl=TRUE) [[1]] [1] "d" "i" "a" " " "m" "a" How can that be? This is the output of 'gregexpr'. > gregexpr("\\b", "dia ma") [[1]] [1] 1 2 3 4 5 6 attr(,"match.length") [1] 0 0 0 0 0 0 > gregexpr("\\b", "dia ma", perl=TRUE) [[1]] [1] 1 4 5 7 attr(,"match.length") [1] 0 0 0 0 The output from gregexpr("\\b", "...
2010 Feb 08
2
the hat ^ in regular expression
An embedded text encoded in an unknown character set was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100208/52a6d080/attachment.pl>
2019 Feb 19
1
patch for gregexpr(perl=TRUE)
Hi all, Several people have noticed that gregexpr is very slow for large subject strings when perl=TRUE is specified. - https://stackoverflow.com/questions/31216299/r-faster-gregexpr-for-very-large-strings - http://r.789695.n4.nabble.com/strsplit-perl-TRUE-gregexpr-perl-TRUE-very-slow-for-long-strings-td4727902.html - https://stat.ethz.ch/pipermai...
2008 Dec 12
4
gregexpr - match overlap mishandled (PR#13391)
Full_Name: Reid Thompson Version: 2.8.0 RC (2008-10-12 r46696) OS: darwin9.5.0 Submission from: (NULL) (129.98.107.177) The gregexpr() function does NOT return a complete list of global matches as it should. This occurs when a pattern matches two overlapping portions of a string: only the first match is returned. The following function call demonstrates this error (although this is not how I initially discovered the problem):...
2006 Oct 07
2
gregexpr in R 2.3.0 != gregexpr in R 2.4.0
Hi all, I have a question regarding differences in the way gregexpr works in R 2.3.0 and R 2.4.0. In R 2.3.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T) [[1]] [1] 1 3 5 7 9 attr(,"match.length") [1] 5 5 5 5 5 ... while in R 2.4.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T) [[1]] [1] 1 7 attr(,"match.length") [1...
2010 Sep 27
7
Regular expressions: offsets of groups
Dear list! > gregexpr("a+(b+)", "abcdaabbc") [[1]] [1] 1 5 attr(,"match.length") [1] 2 4 What I want is the offsets of the matches for the group (b+), i.e. 2 and 7, not the offsets of the complete matches. Is there a way in R to get that? I know about gsubgn and strapply, but they only g...
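For the group offsets asked about above, a hedged sketch: in current versions of R, `gregexpr()` called with `perl = TRUE` attaches `capture.start` and `capture.length` attributes to each list element. This assumes such a version is in use; it was not necessarily available when the question was asked. The names `group_start` and `group_length` are illustrative.

```r
m <- gregexpr("a+(b+)", "abcdaabbc", perl = TRUE)

# With perl = TRUE each list element carries capture.start / capture.length
# matrices: one row per match, one column per capture group.
group_start  <- as.vector(attr(m[[1]], "capture.start"))
group_length <- as.vector(attr(m[[1]], "capture.length"))

print(group_start)   # offsets of the (b+) group in each match
print(group_length)  # lengths of the (b+) group in each match
```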
2012 Mar 30
1
How to use access results of gregexpr in data frames
...ng code successfully to find the first instance of "/". dframe <- data.frame(date=c("5/14/2011", "4/7/2011")) dframe$x1 <- regexpr("/", dframe[, 1]) dframe date x1 1 5/14/2011 2 2 4/7/2011 2 To find the second instance, I thought I'd try to use gregexpr to find all instances of "/" (there's always two per string). dframe$all <- gregexpr("/", dframe[, 1]) dframe date x1 all 1 5/14/2011 2 2,5 2 4/7/2011 2 2,4 So far so good. I then thought to index the second element of dframe$all. I tried both of the following u...
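Because `gregexpr()` returns a list with one vector of positions per input string, the second position has to be taken from each list element in turn. A minimal sketch of that idea follows; the column name `x2` and the variable `slashes` are hypothetical, and `stringsAsFactors = FALSE` is an added assumption so the dates stay character strings in any R version.

```r
dframe <- data.frame(date = c("5/14/2011", "4/7/2011"),
                     stringsAsFactors = FALSE)

# One integer vector of "/" positions per date string.
slashes <- gregexpr("/", dframe$date, fixed = TRUE)

# Pick the second position out of each list element.
dframe$x2 <- sapply(slashes, function(pos) pos[2])
print(dframe)
```

A string with fewer than two slashes would yield `NA` here, which may or may not be the desired behaviour.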
2008 Jan 31
1
segfault in gregexpr()
Hi, Tried with R 2.6 and R 2.7: > gregexpr("", "abc", fixed=TRUE) *** caught segfault *** address 0x1c09000, cause 'memory not mapped' Traceback: 1: gregexpr("", "abc", fixed = TRUE) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without...
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to match a larger string, and just want to pass out...
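One way to return only the wanted part, sketched under the assumption that the prefix has a fixed length (PCRE lookbehind requires that) and that this substitutes a lookbehind for the capture group in the question; it is not the thread's answer. The names `m` and `digits` are illustrative.

```r
temp <- "abcd1234abcd1234"

# (?<=abcd) is a lookbehind: "abcd" must precede the match
# but is not included in the matched text.
m <- gregexpr("(?<=abcd)1234", temp, perl = TRUE)

digits <- regmatches(temp, m)[[1]]
print(digits)
```

For prefixes of variable length this approach does not apply, and the capture attributes returned with `perl = TRUE` would be the more general route.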
2007 May 22
1
regexp bug in very recent r-devel
...cale: [...] attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" [7] "base" > regexpr("o", "foo", fixed = TRUE) [1] 2 attr(,"match.length") [1] 1 > gregexpr("o", "foo", fixed = FALSE) [[1]] [1] 2 3 attr(,"match.length") [1] 1 1 > gregexpr("o", "foo", fixed = TRUE) *** caught segfault *** address 0xc022fdab, cause 'memory not mapped' Traceback: 1: gregexpr("o", "foo",...
2008 Oct 31
1
gregexpr slow and increases exponentially with string length --> how to speed it up?
...s found. - the perl=T option slows down the search Any idea to speed this up would be greatly appreciated! Best, Emmanuel > for (i in c(10000, 50000, 100000, 500000)){ + aa = as.character(sample(1:9, i, replace=T)) + aa = paste(aa, collapse='') + print(i) + print(system.time(gregexpr("[367]2[1-9][129]",aa))) + } [1] 10000 user system elapsed 0.004 0.000 0.003 [1] 50000 user system elapsed 0.060 0.000 0.061 [1] 1e+05 user system elapsed 0.240 0.000 0.238 [1] 5e+05 user system elapsed 5.733 0.000 5.732 >
2006 Feb 01
1
Word boundaries and gregexpr in R 2.2.1
...xt<-c("This is a first example sentence.", "And this is a second example sentence.") If I now look for word boundaries with regexpr, this is what I get: > regexpr("\\b", text, perl=TRUE) [1] 1 1 attr(,"match.length") [1] 0 0 So far, so good. But with gregexpr I get: > gregexpr("\\b", text, perl=TRUE) Error: cannot allocate vector of size 524288 Kb In addition: Warning messages: 1: Reached total allocation of 1015Mb: see help(memory.size) 2: Reached total allocation of 1015Mb: see help(memory.size) Why don't I get the locations and e...
2017 Jan 06
0
strsplit(perl=TRUE), gregexpr(perl=TRUE) very slow for long strings
While doing some speed testing I noticed that in R-3.2.3 the perl=TRUE variants of strsplit() and gregexpr() took time proportional to the square of the number of pattern matches in their input strings. E.g., the attached test function times gsub, strsplit, and gregexpr, with perl TRUE (PCRE) and FALSE (TRE), when the input string contains 'n' matches to the given pattern. Notice the quadratic...
2006 Feb 01
1
Word boundaries and gregexpr in R 2.2.1 (PR#8547)
...<-c("This is a first example sentence.", "And this is a second example sentence.") If I now look for word boundaries with regexpr, this is what I get: > regexpr("\\b", text, perl=TRUE) [1] 1 1 attr(,"match.length") [1] 0 0 So far, so good. But with gregexpr I get: > gregexpr("\\b", text, perl=TRUE) Error: cannot allocate vector of size 524288 Kb In addition: Warning messages: 1: Reached total allocation of 1015Mb: see help(memory.size) 2: Reached total allocation of 1015Mb: see help(memory.size) Why don't I get the locations and ext...