thr3ads.net - similar to: "regexp capturing group in R"

Displaying 20 results from an estimated 10000 matches similar to: "regexp capturing group in R"

2008 Nov 28

regexp help needed

Hello, I have a vector of dates and I would like to extract the year component from this vector (= all digits after the last punctuation character):
dates <- c("28.7.08","28.7.2008","28/7/08", "28/7/2008", "28/07/2008", "28-07-2008", "28-07-08")
The resulting vector should look like "08" "2008"
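One way to get "all digits after the last punctuation character" is to let a greedy `.*` consume everything up to the final punctuation mark and delete that prefix with sub(). This is only a minimal sketch in base R; the variable name `years` is introduced here for illustration.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008",
           "28/07/2008", "28-07-2008", "28-07-08")

# The greedy ".*" runs up to the last punctuation character, so only
# the trailing digits survive the substitution.
years <- sub(".*[[:punct:]]", "", dates)
print(years)
```

The result keeps two-digit and four-digit years as they were written; it does not normalise "08" to "2008".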

Equivalent to regMatchPos in R.

2011 Sep 27

R Experts: I am trying to isolate the numeric value of day from a string that might look like "Cycle 1 Day 8" or "Cycle 12 Day 15". In essence, what I need is a function that can look at a character string and tell me the location within the string where it matches with a given string. In this case, where within "Cycle 12 Day 15" is the text "Day"
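A possible approach, sketched with base R only: regexpr() reports where a match starts, and sub() with a capture group can pull out the digits that follow "Day". The names `day_pos` and `day_num` are made up for this example.

```r
x <- c("Cycle 1 Day 8", "Cycle 12 Day 15")

# regexpr() returns the start position of the first match (-1 if none).
day_pos <- as.integer(regexpr("Day", x, fixed = TRUE))

# Capture the digits that follow "Day " and convert them to integers.
day_num <- as.integer(sub(".*Day ([0-9]+).*", "\\1", x))

print(day_pos)  # 9 10
print(day_num)  # 8 15
```

Note that sub() returns its input unchanged when the pattern does not match, so strings without "Day" would need a separate check.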

extracting a matched string using regexpr

2010 May 05

I want to be able to extract the text matched by a regular expression from a piece of text. This apparently works, but is pretty ugly:
# some html
test<-"</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
# a pattern to extract 5 digits
pattern<-"[0-9]{5}"
#
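In R versions that provide regmatches() (2.14.0 and later, so newer than this post), the match can be pulled out without manual substring arithmetic. A minimal sketch reusing the `test` and `pattern` values from the post; `hit` is a name chosen here.

```r
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
pattern <- "[0-9]{5}"

# regexpr() locates the first match; regmatches() returns its text.
hit <- regmatches(test, regexpr(pattern, test))
print(hit)
```

Swapping regexpr() for gregexpr() would return every match per string as a list instead of only the first.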

how to implement string pattern extraction in R

2010 Aug 22

Hi, In Perl, getting a substring that matches a particular pattern can be done as in the following example: $x = "AAAA.txt"; if ($x=~ /(.*?)\.txt/){ $prefix = $1; } So how do I do the same thing in R? Can someone provide a code sample? Thanks much in advance. -- Waverley @ Palo Alto
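A rough R counterpart of the Perl snippet, as a sketch rather than the one canonical answer: sub() with a backreference plays the role of `$1`. The `^` and `$` anchors are an addition made here so that the whole string is replaced by the captured group.

```r
x <- "AAAA.txt"
pattern <- "^(.*?)\\.txt$"

# perl = TRUE selects PCRE, which follows Perl's syntax for the lazy "*?".
# sub() returns its input unchanged when nothing matches, so guard with grepl().
prefix <- if (grepl(pattern, x, perl = TRUE)) {
  sub(pattern, "\\1", x, perl = TRUE)
} else {
  NA_character_
}
print(prefix)
```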

regexp,grep: capturing more than one substring

2004 Oct 27

Hello, I would like to have a function that retrieves matching strings in the same way as java.util.regex (Java 1.4.2). Example: f('^.*(xx?)\\.([0-9]*)$','abcxx.785') => c('xx','785') First of all: is it possible to achieve this with grep(... perl=TRUE,value=TRUE )? As I would call this function very often with large data, I'm reluctant to use SJava
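In later R releases (2.14.0 onwards) regexec() together with regmatches() returns the full match followed by each parenthesised group, which gets close to the requested f(). The sketch below assumes that; `capture_groups` is a hypothetical helper name, not a base R function.

```r
# capture_groups() is a hypothetical helper, named here for illustration.
capture_groups <- function(pattern, x) {
  # regexec() records the full match plus each parenthesised group;
  # regmatches() turns those positions into text.
  m <- regmatches(x, regexec(pattern, x))[[1]]
  m[-1]  # drop the full match, keep only the groups
}

res <- capture_groups("^.*(xx?)\\.([0-9]*)$", "abcxx.785")
print(res)
```

How the text is divided between the leading `.*` and the `(xx?)` group depends on the matching rules of the regex engine in use, so the first element is worth checking against the expected 'xx' before relying on it.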

Regexp subexpression

2006 Mar 25

I can't get the PERL subexpression translated to R. Following, for example, B. Ripley's http://finzi.psych.upenn.edu/R/Rhelp02a/archive/58984.html I am using sub, but it looks like an ugly substitute. Assume I want to extract the first alpha part and the first numeric part, but only if they are in sequence. Do I really have to use the sub twice, first extracting the first variable, then

Named capture in regexp

2011 Feb 25

Dear R core developers, One feature from Python that I have been wanting in R is the ability to capture groups in regular expressions using names. Consider the following example in R. > notables <- c(" Ben Franklin and Jefferson Davis","\tMillard Fillmore") > name.rex <- "(?<first>[A-Z][a-z]+) (?<last>[A-Z][a-z]+)" > (parsed <-
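For readers on current R releases: regexpr() with perl = TRUE attaches "capture.start", "capture.length" and "capture.names" attributes to its result, which is enough to build a matrix of named groups. A minimal sketch using the `notables` and `name.rex` values from the post; the helper variables `starts`, `lens` and `nms` are names chosen here.

```r
notables <- c(" Ben Franklin and Jefferson Davis", "\tMillard Fillmore")
name.rex <- "(?<first>[A-Z][a-z]+) (?<last>[A-Z][a-z]+)"

m <- regexpr(name.rex, notables, perl = TRUE)
starts <- attr(m, "capture.start")
lens   <- attr(m, "capture.length")
nms    <- attr(m, "capture.names")

# One column per named group, one row per input string.
parsed <- sapply(seq_along(nms), function(j) {
  substring(notables, starts[, j], starts[, j] + lens[, j] - 1)
})
colnames(parsed) <- nms
print(parsed)
```

Only the first match per string is reported by regexpr(), so "Jefferson Davis" is not extracted here; gregexpr() would be needed for that.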

Finding multiple characters in the same string

2007 Aug 02

Hi, I have this problem where I need to find whether there are any numbers in a string. This is no problem if there's only one number per string: I would then simply use the regexpr() function together with the substring function to extract the number. But regexpr only picks one number per string, either from the beginning or the end, but not multiple. Can this be done? And how? For example My string <-

Manipulate Data (with regular expressions)

2008 Jul 08

Dear Everyone, I am trying to automatically manipulate the data of a variable (class = factor) like
x
220
220a
221
221b
B221
into two variables (class = numeric) like
x    y
220  0
220  1
221  0
221  1
221  1
y has to carry the information about the class (number or string) of the former x variable. I could do it by hand like x[x == "220a"] <- 220
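The manual recoding can probably be replaced by two vectorised regex calls: one that flags entries containing a letter and one that strips everything except digits. A minimal sketch; `x_chr` and `x_num` are names introduced here, and it assumes each entry contains exactly one run of digits.

```r
x <- factor(c("220", "220a", "221", "221b", "B221"))
x_chr <- as.character(x)

# y flags entries that contain any letter; x_num keeps only the digits.
y <- as.integer(grepl("[A-Za-z]", x_chr))
x_num <- as.numeric(gsub("[^0-9]", "", x_chr))

print(data.frame(x = x_num, y = y))
```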

regexp bug in very recent r-devel

2007 May 22

completion is semi-broken in today's r-devel, and the reason seems to be some regular expression changes: > sessionInfo() R version 2.6.0 Under development (unstable) (2007-05-22 r41673) i686-pc-linux-gnu locale: [...] attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" [7]

search for string insider a string

2009 Mar 13

Hi, sorry if this is too stupid a question, but how do I do a string search in R? I have a dataframe A with A$test like:
test1
bcdtestblabla2.1bla
cdtestblablabla3.88blabla
and I want to search for a substring that starts with 'dtest' and ends with a number, and return the location of that substring and the number, so the end result would be:
NA NA
3 2.1
2 3.88
I find grep can
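One reading of the request, sketched in base R: take the position of the literal "dtest" and the first number that follows it. The pattern below is an assumption about what "ends with a number" means, and `pos`, `num` and `m` are names chosen here.

```r
test <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

# Position of the literal "dtest"; regexpr() gives -1 when it is absent.
pos <- as.integer(regexpr("dtest", test, fixed = TRUE))
pos[pos == -1L] <- NA

# First number following "dtest", captured as group 1.
m <- regmatches(test, regexec("dtest[^0-9]*([0-9]+\\.?[0-9]*)", test))
num <- as.numeric(vapply(
  m,
  function(g) if (length(g) >= 2) g[2] else NA_character_,
  character(1)
))

print(data.frame(pos = pos, num = num))
```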

parsing strings between [ ] in columns

2010 Feb 18

Dear all, I have a data.frame with a column like the x shown below:
myDF<-data.frame(cbind(x=c("[[1, 0, 0], [0, 1]]", "[[1, 1, 0], [0, 1]]","[[1, 0, 0], [1, 1]]", "[[0, 0, 1], [0, 1]]")))
> myDF
                    x
1 [[1, 0, 0], [0, 1]]
2 [[1, 1, 0], [0, 1]]
3 [[1, 0, 0], [1, 1]]
4 [[0, 0, 1], [0, 1]]
As you can see my x column is composed of

simple q: returning a logical vector of substring matches

2007 Jan 20

I'm a relative R novice, and sometimes the simple things trip me up. Suppose I have a <- c("apple", "pear") and I want a logical vector of whether each of these strings contains "ear" (in this case, F T). What is the idiom? Quizzically, Mark Lindeman
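In reasonably recent R versions, grepl() returns exactly this kind of logical vector. A minimal sketch; `has_ear` is a name made up here, and fixed = TRUE is used because "ear" is a literal string rather than a pattern.

```r
a <- c("apple", "pear")

# grepl() returns TRUE/FALSE per element instead of matching indices.
has_ear <- grepl("ear", a, fixed = TRUE)
print(has_ear)  # FALSE TRUE
```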

Splitting the string at the last sub-string

2005 Sep 15

Hi, I need to split a string into 2 strings, with the split point defined by the last occurrence of some substring. I came up with some convoluted code to do so:
str = "Chance favors the prepared mind"
sub = "e"
y = unlist(strsplit(str,sub))
z = cbind(paste(y[-length(y)], sub, sep="", collapse = ""), y[length(y)]);
y
z
z[1]
z[2]
Is there a simpler way
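A possibly simpler route is to locate every occurrence with gregexpr() and cut at the last one. This is a sketch only; the variables are renamed to `s` and `key` here so that the base function sub() is not masked, and `split_at` and `parts` are likewise names chosen for illustration.

```r
s <- "Chance favors the prepared mind"
key <- "e"

# All start positions of the literal substring; the last one is the split point.
hits <- gregexpr(key, s, fixed = TRUE)[[1]]
last <- hits[length(hits)]

if (last == -1) {
  parts <- c(s, "")  # substring not found: nothing to split off
} else {
  split_at <- last + nchar(key) - 1
  parts <- c(substr(s, 1, split_at), substr(s, split_at + 1, nchar(s)))
}
print(parts)
```

As in the original code, the separator stays attached to the end of the first piece.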

Using grep to determine value of last letter...

2009 Oct 19

I am currently being defeated by grep. I am attempting to determine the value of the last letter of a character string. An example of my data set is shown below. Regarding the codes, I would like to identify the value of the last character and then take the appropriate action, e.g. If the value is L then label UL rating XXX If the value is F then label UL rating YYY ... I assume it will be

extract all numbers from a string

2013 Jun 16

Hi all, I have been beating my head against this problem for a bit, but I can't figure it out. I have a series of strings of variable length, and each will have one or more numbers, of varying format. E.g., I might have: tmpstr = "The first number is: 32. Another one is: 32.1. Here's a number in scientific format, 0.3523e10, and another, 0.3523e-10, and a negative,
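One common pattern for this kind of task is gregexpr() plus regmatches() with a regex that allows an optional sign, fraction and exponent. The sketch below uses a short illustrative string, not the poster's full input (which is cut off above), and `num_rx` and `nums` are names chosen here.

```r
# Illustrative input, not the original poster's full string.
tmpstr <- "values: 32, 32.1, 0.3523e10, 0.3523e-10 and -4.5"

# Optional sign, digits, optional fraction, optional exponent.
num_rx <- "-?[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?"
nums <- as.numeric(regmatches(tmpstr, gregexpr(num_rx, tmpstr, perl = TRUE))[[1]])
print(nums)
```

The pattern deliberately requires digits after a decimal point, so a sentence-ending "32." is read as 32 rather than swallowing the full stop. Formats such as ".5" or "1,000" are not covered.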

regexpr with accents

2012 Aug 06

Hello, I have built a syntax to find out if a given substring is included in a larger string that works like this: d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9 and this works all right as long as "some text" contains only standard ASCII characters. However, it does not work when accents are included, as in the following: d1$V1[regexpr("some t?xt = 9",d1$V2)>0] <- 9 I have

String manipulation with regexpr, got to be a better way

2011 Sep 29

Help-Rs, I'm doing some string manipulation in a file where I converted a string date in mm/dd/yyyy format and returned the year (yyyy). I've used regexpr (hat tip to Gabor G for a very nice earlier post on this function) in steps (I've un-nested the code and provided it and an example of what I did below). My question is: is there a more efficient way to do this? Specifically is
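Two shorter alternatives that may fit here, sketched with hypothetical sample values: a single sub() call that drops everything up to the last "/", or parsing the string as a Date and formatting the year. The names `d`, `yr_regex` and `yr_date` are introduced for this example.

```r
d <- c("09/29/2011", "01/05/2008")  # hypothetical sample values

# Option 1: keep only what follows the last "/".
yr_regex <- sub(".*/", "", d)

# Option 2: parse as a Date, then format the year.
yr_date <- format(as.Date(d, format = "%m/%d/%Y"), "%Y")

print(yr_regex)
print(yr_date)
```

The Date route also validates the input: strings that do not parse come back as NA instead of passing through unchanged.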

parsing dir output for file sizes

2000 Jul 11

I've got an R wrapper around an old DOS program. In the R program I need to test whether the DOS program failed to produce certain output. This is indicated by certain text files created by the DOS program being empty. I can use system(command, intern=TRUE) to get the output of a DOS dir for a test file, but I'm having trouble parsing this to get the file size. Is there an R function

string problems ( grep and regepxr)

2004 Mar 24

Recently, working with strings and data, I have found a small problem. Windows XP R 1.8.1 Reading data from a "txt file" with readLines. Finding a specific line with the "grep" command, all OK. But here comes the problem... After finding the correct line(s) I need to find a substring inside each string. In this case "tabs" I think it is represented by "\t" in the
