thr3ads.net - similar to: "extracting a matched string using regexpr"

Displaying 20 results from an estimated 3000 matches similar to: "extracting a matched string using regexpr"

2008 Apr 15

why does regexpr not work with '.'

Dear R Helpers, I am running R 2.6.2 on a Windows XP machine. I am trying to use regexpr to locate full stops in strings, but without success. Here is an example:

    f="a,b.c at d:" #define an arbitrary test string
    regexpr(',',f) #find the occurrences of ',' in f - should be one at location 2
    # and this is what regexpr finds
    #[1] 2

2010 Jun 01

regexpr help (match.length=0)

R-help, Sorry if this is more of a regex question than an R question. However, help would be appreciated on my use of the regexpr function. In the first example below, I ask for all characters (a-z) in 'abc123'; regexpr returns a 3-character match beginning at the first character.

    > regexpr("[[:alpha:]]*", "abc123")
    [1] 1
    attr(,"match.length")
    [1] 3
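The zero-length match named in the thread title follows from `*` being allowed to match nothing. A minimal sketch of that behaviour; the second input string is an assumption chosen for illustration and does not come from the post:

```r
# "*" means "zero or more", so the pattern can succeed without consuming anything.
m1 <- regexpr("[[:alpha:]]*", "abc123")   # letters at the start of the string
m2 <- regexpr("[[:alpha:]]*", "123abc")   # hypothetical input: digits come first

attr(m1, "match.length")   # 3
attr(m2, "match.length")   # 0, an empty match at position 1

# Requiring at least one letter with "+" finds the actual run of letters.
m3 <- regexpr("[[:alpha:]]+", "123abc")
regmatches("123abc", m3)   # "abc"
```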

2007 Aug 02

Finding multiple characters in the same string

Hi, I have this problem where I need to find whether there are any numbers in a string. This is no problem if there's only one number per string: I would then simply use the regexpr() function together with the substring function to extract the number. But regexpr only picks one number per string, either from the beginning or the end, but not multiple. Can this be done? And how? For example My string <-
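One possible way to pull out every number rather than only the first is `gregexpr()` combined with `regmatches()`. This is only a sketch: the example string in the post is cut off, so the input `x` below is a hypothetical stand-in.

```r
# Hypothetical input; the original example string is truncated in the post.
x <- "a12b3c456"

# gregexpr() reports every match, where regexpr() reports only the first.
hits <- gregexpr("[0-9]+", x)

# regmatches() returns a list with one character vector per input string.
numbers <- regmatches(x, hits)[[1]]
numbers               # "12" "3" "456"
as.numeric(numbers)   # 12 3 456
```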

2009 Feb 25

regexp capturing group in R

Hello, Newbie question: how do you capture groups in a regexp in R? Let's say I have txt="blah blah start=20080101 end=20090224". I'd like to get the two dates start and end. In Perl, one would say: my ($start,$end) = ($txt =~ /start=(\d{8}).*end=(\d{8})/); I've tried: txt <- "blah blah start=20080101 end=20090224" m <-
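In R versions that provide `regexec()`, the Perl idiom quoted in the question can be approximated as below. This is a sketch rather than the answer given in the thread; the variable names `start_date` and `end_date` are introduced here for illustration.

```r
txt <- "blah blah start=20080101 end=20090224"

# regexec() records the position of the full match and of each
# parenthesised group.
m <- regexec("start=([0-9]{8}).*end=([0-9]{8})", txt)

# Element 1 is the full match; the captured groups follow.
parts <- regmatches(txt, m)[[1]]
start_date <- parts[2]   # "20080101"
end_date   <- parts[3]   # "20090224"
```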

2011 Sep 29

String manipulation with regexpr, got to be a better way

Help-Rs, I'm doing some string manipulation in a file where I converted a string date in mm/dd/yyyy format and returned the date yyyy. I've used regexpr (hat tip to Gabor G for a very nice earlier post on this function) in steps (I've un-nested the code and provided it and an example of what I did below. My question is: is there a more efficient way to do this. Specifically is

2004 Feb 06

a grep/regexpr problem

Hi, I'm trying to parse lines of the form: dan001.hin (0): fingerprint={256, 411, 426, 947, 973, 976} What I need is the sequence of number between {}. I'm using grep as match <- grep("{([0-9,\s]*)}",s,perl=T,value=T) where s is a character vector. But all I get is the whole string s. I tried using regexpr in an attempt to get just the sequence I wanted: match <-

2007 Jun 29

regexpr

Hi, I'd like to match each member of a list to a target string, e.g.

    mylist=c("MN","NY","FL")
    g=regexpr(mylist[1], "Those from MN:")
    if (g>0) { "On list" }

My question is: How to add an end-of-string symbol '$' to the to-match string? so that 'M' won't

2005 Aug 03

regexpr and portability issue

Dear all-- I am still forging my first arms with R and I am fighting with regexpr() as well as portability between unix and windoz. I need to extract barcodes from filenames (which are located between a double and single underscore) as well as the directory where the filename is residing. Here is the solution I came to: aFileName <-

2010 Sep 22

Crash report: regexpr("a{2-}", "")

Each of the following calls crashes ("core dumps") R (R --vanilla) on various versions and OSes:

    regexpr("a{2-}", "")
    sub("a{2-}", "")
    gsub("a{2-}", "")

EXAMPLES:

    > sessionInfo()
    R version 2.11.1 Patched (2010-09-16 r52949)
    Platform: i386-pc-mingw32 (32-bit)
    ...
    > regexpr("a{2-}", "")
    Assertion

2010 Sep 22

Crash report: regexpr("a{2-}", "")

2012 Aug 06

regexpr with accents

Hello, I have built a syntax to find out whether a given substring is included in a larger string. It works like this: d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9 and this works all right as long as "some text" contains only standard ASCII characters. However, it does not work when accents are included, as in the following: d1$V1[regexpr("some t?xt = 9",d1$V2)>0] <- 9 I have

2003 Aug 13

Regexpr with "."

I'm trying to use the regexpr function to locate the decimal in a character string. Regardless of the position of the decimal, the function returns 1. For example,

    > regexpr(".", "Female.Alabama")
    [1] 1
    attr(,"match.length")
    [1] 1

In trying to figure out what was going on here, I tried the below command: > gsub(".", ",",
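The result of 1 is expected, because an unescaped `.` in a regular expression matches any single character. A short sketch of two common ways to match a literal dot, using the string from the post:

```r
x <- "Female.Alabama"

# Unescaped: "." matches any character, so the first match is position 1.
any_char <- regexpr(".", x)

# Escaped: "\\." matches a literal dot only.
escaped <- regexpr("\\.", x)

# fixed = TRUE turns off regular-expression interpretation altogether.
literal <- regexpr(".", x, fixed = TRUE)

as.integer(c(any_char, escaped, literal))   # 1 7 7
```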

2009 Mar 13

search for string insider a string

Hi, sorry if this is too stupid a question, but how do I do a string search in R? I have a dataframe A with A$test like: test1 bcdtestblabla2.1bla cdtestblablabla3.88blabla and I want to search for a string that starts with 'dtest' and ends with a number, and return the location of that substring and the number, so the end result would be: NA NA 3 2.1 2 3.88 I find grep can

2007 Jul 26

substituting dots in the names of the columns (sub, gsub, regexpr)

Dear R users, I have the following two problems, related to the functions sub, grep, regexpr and similar. The header of the file(s) I have to import is like this. c("y (m)", "BD (g/cm3)", "PR (Mpa)", "Ks (m/s)", "SP g./g.", "P (m3/m3)", "theta1 (g/g)", "theta2 (g/g)", "AWC (g/g)") To get rid of spaces and

2009 May 11

Searching within a ch. string

Hi all, is there any function to find some words in a character-string? For example suppose the string is : "gdfsa-sdhchc-88", now I want to find whether this string contains "sdhch". Is there any R function to do that? Regards, -- View this message in context: http://www.nabble.com/Searching-within-a-ch.-string-tp23484010p23484010.html Sent from the R help mailing list
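A containment test like the one asked about can be sketched with `grepl()`, using the string and substring from the post. Passing `fixed = TRUE` is a choice made here so that the search term is treated as plain text, not as a regular expression.

```r
x <- "gdfsa-sdhchc-88"

# grepl() returns TRUE or FALSE for each element of x.
has_it <- grepl("sdhch", x, fixed = TRUE)
has_it   # TRUE

# regexpr() additionally reports where the match starts (-1 if absent).
pos <- regexpr("sdhch", x, fixed = TRUE)
as.integer(pos)   # 7
```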

2010 Jun 02

regexpr mystery can not remove trailing spaces

Dear all, I encountered a strange problem with regexpr replacement. I made this character object: str <- "02.06.10 12:40 " > str(str) chr "02.06.10 12:40 " I read in an object which seems to be quite similar: > str(as.character(becva$V1)[1]) chr "02.06.10 12:40 " However, I cannot remove trailing spaces from it: > sub(' +$',

2008 Nov 28

regexp help needed

Hello, I have a vector of dates and I would like to grep the year component from this vector (= all digits after the last punctuation character) dates <- c("28.7.08","28.7.2008","28/7/08", "28/7/2008", "28/07/2008", "28-07-2008", "28-07-08") the resulting vector should look like "08" "2008"
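One way to keep only what follows the last punctuation character is a greedy `sub()`. This is a sketch using the vector from the post, not necessarily the answer given in the thread.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008",
           "28/07/2008", "28-07-2008", "28-07-08")

# ".*" is greedy, so it consumes everything up to and including the LAST
# punctuation character; only the trailing digits remain.
years <- sub(".*[[:punct:]]", "", dates)
years   # "08" "2008" "08" "2008" "2008" "2008" "08"
```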

2008 Oct 01

regexpr syntax question

Greetings R list, I am stuck on a simple syntax problem. I want to list all files in a directory, excluding files of a certain type. I have tried pattern matching as follows: a <- list.files(data, full.name = TRUE, pattern != ".xml") # exclude all .xml files The warning returns that my syntax is incorrect. I have read the regexpr help files and searched old posts to no
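The `pattern` argument of `list.files()` only selects matching names, so one common workaround is to list everything and filter afterwards. In this sketch the `files` vector is a hypothetical stand-in for the result of `list.files()`, so the example runs without touching the disk.

```r
# Hypothetical file names standing in for list.files(data, full.names = TRUE).
files <- c("a.xml", "b.csv", "notes.txt", "c.XML")

# Negate the match to drop names ending in ".xml"; ignoring case is an
# assumption made here, not something requested in the post.
keep <- files[!grepl("\\.xml$", files, ignore.case = TRUE)]
keep   # "b.csv" "notes.txt"
```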

2006 Oct 27

Regexpr. analyzer

Hi! I want to index html files, but w/o the tags, so I was thinking either I remove them before I index it (expensive), or put up an RegExpAnalyzer. BTW, when using an analyzer, does that mean that everything which it declines (i.e. the RegExpAnalyzer doesn't match) won't be put into the index files (i.e. blows it up)? I came up with a simple test, which didn't

2008 Jul 08

Manipulate Data (with regular expressions)

Dear Everyone, I try to automatically manipulate the data of a variable (class = factor) like x 220 220a 221 221b B221 Into two variables (class = numeric) like x y 220 0 220 1 221 0 221 1 221 1 y has to carry the information about the class (number or string) of the former x-Variable. I could do it by hand like x[x == "220a"] <- 220
