Displaying 20 results from an estimated 2000 matches similar to: "Word boundaries and gregexpr in R 2.2.1"

2006 Feb 01
1
Word boundaries and gregexpr in R 2.2.1 (PR#8547)
Full_Name: Stefan Th. Gries Version: 2.2.1 OS: Windows XP (Home and Professional) Submission from: (NULL) (68.6.34.104) The problem is this: I have a vector of two character strings. > text<-c("This is a first example sentence.", "And this is a second example sentence.") If I now look for word boundaries with regexpr, this is what I get: >
2006 Oct 07
2
gregexpr in R 2.3.0 != gregexpr in R 2.4.0
Hi all, I have a question regarding differences in the way gregexpr works in R 2.3.0 and R 2.4.0. In R 2.3.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T) [[1]] [1] 1 3 5 7 9 attr(,"match.length") [1] 5 5 5 5 5 ... while in R 2.4.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T)
2005 Aug 26
3
parts of data frames: subset vs. [-c()]
Dear all, I have a problem with splitting up a data frame called ReVerb: > str(ReVerb) `data.frame': 92713 obs. of 16 variables: $ CHILD : Factor w/ 7 levels "ABE","ADA","EVE",..: 1 1 1 1 1 1 1 1 1 1 ... $ AGE : Factor w/ 484 levels "1;06.00","1;06.16",..: 43 43 43 99 99 99 99 99 99 99 ... $ AGE_Q : num 2.0 2.0 2.0 2.4 2.4
2012 Mar 30
1
How to use access results of gregexpr in data frames
Hello, I'm trying to figure out how to find the index of the second occurrence of "/" in a string (which happens to represent a date) within a data frame column. I've used the following code successfully to find the first instance of "/". dframe <- data.frame(date=c("5/14/2011", "4/7/2011")) dframe$x1 <- regexpr("/", dframe[, 1])
2006 Jul 23
3
RfW 2.3.1: regular expressions to detect pairs of identical word-final character sequences
Dear all I use R for Windows 2.3.1 on a fully updated Windows XP Home SP2 machine and I have two related regular expression problems. platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor
2008 Dec 12
4
gregexpr - match overlap mishandled (PR#13391)
Full_Name: Reid Thompson Version: 2.8.0 RC (2008-10-12 r46696) OS: darwin9.5.0 Submission from: (NULL) (129.98.107.177) The gregexpr() function does NOT return a complete list of global matches as it should. This occurs when a pattern matches two overlapping portions of a string: only the first match is returned. The following function call demonstrates this error (although this is not how I
2006 Nov 07
1
Gregexpr - extract results with lapply
Hello, I need to extract sequences of three upper case letters in a string. In other words, in this string: str <-c("ABC", "this WOUld be gOOD") The result I'm looking for is ABC WOU OOD. With gregexpr, I can get the position and length of the sequences gregexpr('[A-Z]{3}',str,perl=TRUE) [[1]] [1] 1
2006 Sep 18
2
404 HTTP not found
Hi, I wrote a script which retrieves links from websites and loads them with scan: ... website<-tolower(scan(current.pages[i], what="character", sep="\n", quiet=TRUE)) ... However, occasionally the script finds broken links, such as <http://www.google.com/test>. When the script tries to access such websites, the repeat loop breaks and I get the error message Error
2011 Aug 17
2
question regarding gregexpr and read.table
Hi, I have a silly question regarding the usage of two commands, read.table and gregexpr. For read.table, if I read a matrix and set header = T, I found that all the dashes ("-") become dots ("."): A = read.table("Matrix.txt", sep = "\t", header = F) A[1,1] # "A-B-C-D". A = read.table("Matrix.txt", sep = "\t", header = T)
2009 Feb 25
1
Using gregexpr with multiple search elements
Dear list, I am trying to use gregexpr to see if entries in a dataframe have either of two possible values for a string. Here's an example: text<-c("fat", "rat", "cat", "dog", "log", "fish") If I just wanted to find if any one of the elements in text matches the pattern "at" I would do gregexpr("\\at", text)
2019 Feb 19
1
patch for gregexpr(perl=TRUE)
Hi all, Several people have noticed that gregexpr is very slow for large subject strings when perl=TRUE is specified. - https://stackoverflow.com/questions/31216299/r-faster-gregexpr-for-very-large-strings - http://r.789695.n4.nabble.com/strsplit-perl-TRUE-gregexpr-perl-TRUE-very-slow-for-long-strings-td4727902.html - https://stat.ethz.ch/pipermail/r-help/2008-October/178451.html I figured out
2007 Oct 10
4
gregexpr (PR#9965)
Full_Name: Peter Dolan Version: 2.5.1 OS: Windows Submission from: (NULL) (128.193.227.43) gregexpr does not find all matching substrings if the substrings overlap: > gregexpr("abab","ababab") [[1]] [1] 1 attr(,"match.length") [1] 4 It does work correctly in Version 2.3.1 under linux.
2007 May 22
1
regexp bug in very recent r-devel
Completion is semi-broken in today's r-devel, and the reason seems to be some regular expression changes: > sessionInfo() R version 2.6.0 Under development (unstable) (2007-05-22 r41673) i686-pc-linux-gnu locale: [...] attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" [7]
2008 Jan 31
1
segfault in gregexpr()
Hi, Tried with R 2.6 and R 2.7: > gregexpr("", "abc", fixed=TRUE) *** caught segfault *** address 0x1c09000, cause 'memory not mapped' Traceback: 1: gregexpr("", "abc", fixed = TRUE) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to
2008 Oct 31
1
gregexpr slow and increases exponentially with string length --> how to speed it up?
Dear All, I have a long string and need to search for regular expressions in there. However it becomes horribly slow as the string length increases. Below is an example: when "i" increases by 5, the time spent increases by more! (my string is 11,000,000 letters long!) I also noticed that - the search time increases dramatically with the number of matches found. - the perl=T option
2007 Mar 08
2
Named backreferences in replacement patterns
Hi, I have a problem with substitutions involving named backreferences. I have a vector American.dates: > American.dates [1] "5/15/1976" "2.15.1970" "1.9.2006" which I want to change into British.dates: > British.dates [1] "15/5/1976" "15/2/1970" "9/1/2006" I know I can do it like this:
2017 Jun 28
1
regexec() bug in R 3.4.0
Hi, In R 3.4.0, the "Pattern Matching and Replacement" documentation that describes regexec(), gregexpr(), etc. states that the "text" argument to regexec is a character vector, "or an object which can be coerced by as.character to a character vector": regexec(pattern, text, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
2013 Mar 20
2
Pattern match
Hello again, in the help page of grep() function, it is written that pattern: character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. Coerced by as.character to a character string if possible. If a character vector of length 2 or more is supplied, the first element is used with a warning. Missing values are allowed