Displaying 20 results from an estimated 20000 matches similar to: "Extracting matched expressions"
2008 Jun 14
2
strsplit, keeping delimiters
Hi all,
Does anyone have a version of strsplit that keeps the string that is
split by? E.g., from
x <- "A: 123 B: 456 C: 678"
I'd like to get
c("A:", "123 ", "B: ", "456 ", "C: ", "678")
but
strsplit(x, "[A-Z]+:")
gives me
c("", " 123 ", " 456 ", " 678")
Any ideas?
Thanks,
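One possible approach to the question above, offered as a sketch rather than the answer given in the thread: instead of splitting, match both the delimiters and the text between them with regmatches() and gregexpr() from base R. The names tokens and pieces are illustrative, and trimming the surrounding whitespace is an assumption about the desired output.

```r
x <- "A: 123 B: 456 C: 678"

# Match either a delimiter ("A:", "B:", ...) or a run of non-delimiter text
tokens <- regmatches(x, gregexpr("[A-Z]+:|[^A-Z:]+", x))[[1]]

# Assumption: surrounding whitespace is not wanted in the result
pieces <- trimws(tokens)
print(pieces)
```

Dropping the trimws() step keeps the spaces exactly as they appear in x.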
2010 May 05
1
extracting a matched string using regexpr
Given a text like
I want to be able to extract a matched regular expression from a piece of
text.
this apparently works, but is pretty ugly
# some html
test<-"</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
# a pattern to extract 5 digits
> pattern<-"[0-9]{5}"
#
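A minimal sketch of one way to pull out the matched text in base R, assuming a version of R that provides regmatches(). It reuses the test string and pattern from the post; the name first_match is illustrative and is not taken from the thread.

```r
# Sample HTML fragment and pattern, copied from the post above
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
pattern <- "[0-9]{5}"

# regexpr() locates the first match; regmatches() returns the matched text
first_match <- regmatches(test, regexpr(pattern, test))
print(first_match)
```

Swapping regexpr() for gregexpr() would return every match instead of only the first.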
2010 Jun 03
5
string handling
I have a data.frame as the following:
var1 var2
9G/G09 abd89C/T90
10A/T9 32C/C
90G/G A/A
. .
. .
. .
10T/C 00G/G90
What I want is to get the letters which are on the left and right of '/'.
for example, for "9G/G09", I only want "G", "G", and for "abd89C/T90", I
only want "C" and
2010 Mar 31
3
regular expression help to extract specific strings from text
Dear all,
Let's say I have the following:
> x <- c("Eve: Going to try something new today...", "Adam: Hey @Eve, how are you finding R? #rstats", "Eve: @Adam, It's awesome, so much better at statistics that #Excel ever was! @Cain & @Able disagree though :(", "Adam: @Eve I'm sure they'll sort it out :)", "blahblah")
> x
[1]
2007 Dec 11
5
book on regular expressions
Hello,
Could someone recommend a good book on regular expressions with focus on
applications/use as it might relate to R. I remember there was a mention of
such a reference book recently, but I could not locate that message on the
archive.
Thanks.
-Christos
Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
2009 Sep 16
3
How to extract a specific substring from a string (regular expressions)? See details inside
Hi all,
I have thousands of strings like these ones:
"1159_1; YP_177963; PPE FAMILY PROTEIN"
"1100_13; SECRETED L-ALANINE DEHYDROGENASE ALD CAA15575"
"1141_24; gi;2894249;emb;CAA17111.1; PROBABLE ISOCITRATE DEHYDROGENASE"
and various others..
I'm interested in extracting the code for the protein (in this example: YP_177963, CAA15575, CAA17111).
I
2009 Nov 13
3
Escaping regular expressions
Hi all,
Is there a method for escaping strings to be used in regular expressions?
i.e. if I have a user-supplied string that I'd like to use as a fixed
component is there a method that will turn (e.g.) ".$^" into
"\\.\\$\\^" ?
Thanks,
Hadley
--
http://had.co.nz/
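A sketch of one way to do this, not necessarily what the thread settled on: put a backslash in front of every regex metacharacter with gsub(). The helper name escape_regex is hypothetical, and the character class below is an assumption about which characters need escaping. When the whole pattern is a literal string, fixed = TRUE avoids escaping altogether.

```r
# Hypothetical helper: prefix each regex metacharacter with a backslash.
# perl = TRUE is used so the character class follows PCRE rules.
escape_regex <- function(s) {
  gsub("([.\\\\|()\\[\\]{}^$*+?])", "\\\\\\1", s, perl = TRUE)
}

escaped <- escape_regex(".$^")
print(escaped)  # shown with R's own string escaping

# The escaped string now matches only the literal text
print(grepl(escaped, "a.$^b"))

# Alternative when no regex features are needed at all
print(grepl(".$^", "a.$^b", fixed = TRUE))
```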
2010 Jun 28
2
Identify and extract a whole word of variable length using regular expressions
Hi everybody,
I'm quite weak with regular expressions, and I need some help...
I have strings of the type
>a
[1,] "ppe46 Rv3018c MT3098/MT3101 MTV012.32c"
[2,] "ppe16 Rv1135c MT1168"
[3,] "ppe21 Rv1548c MT1599 MTCY48.17"
[4,] "ppe12 Rv0755c MT0779"
[5,] "PE_PGRS51 Rv3367"
2010 May 18
2
Function that is giving me a headache - any help appreciated (automatic read)
Note: the whole function is below; I am sure I am doing something silly.
when I use it like USGS(input="precipitation") it is choking on the
precip.1 <- subset(DF, precipitation!="NA")
b <- ddply(precip.1$precipitation, .(precip.1$gauge_name), cumsum)
DF.precip <- precip.1
DF.precip$precipitation <- b$.data
part, but runs fine outside of the function:
days=7
2009 Oct 26
1
regular expressions
Dear list,
I have the following text to parse (originating from readLines as some
lines have unequal size),
st = c("START text1 1 text2 2.3", "whatever intermediate text", "START text1 23.4 text2 3.1415")
from which I'd like to extract the lines starting with "START", and
group the subsequent fields in a data.frame in this format:
text1 text2
2009 Jul 08
5
R regular expression to extract words with the query string.
Hi,
Is there a way in R to get the string which matches the expression, where
the expression is a substring of the parent string?
Let's say I have $i <- "transcript:ENST0000112334 pid:ENSP000012345"
What I need is the string "pid:ENSP000012345" from $i using the query
"ENSP".
Appreciate your comments.
Praveen Surendran
School of Medicine and
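A minimal sketch for the question above, assuming the fields are separated by single spaces. The post writes the variable as $i; in R the name would simply be i. The names query, tokens and hit are illustrative.

```r
i <- "transcript:ENST0000112334 pid:ENSP000012345"
query <- "ENSP"

# Split into space-separated fields, then keep the fields containing the query
tokens <- strsplit(i, " ", fixed = TRUE)[[1]]
hit <- tokens[grepl(query, tokens, fixed = TRUE)]
print(hit)
```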
2008 Nov 02
5
R newbie: how to replace string/regular expression
Hello;
I am an R newbie and would like to know the correct and efficient method for
doing string replacement.
I have a large data set, where I want to replace character "M", "b",
and "K" (currency in Million, Billion and K) to millions. That is
209.7B with (209.7 * 10e6) and 100.00K with (100.00 *1/100)
and etc..
d <- c("120.0M", "11.01m",
2008 Aug 12
2
perl expression question
I have a string such as
fileName<-"Agg.20.20.20-all-01".
All I want to do is pull the "20.20.20" and the "all" as strings.
Obviously, they aren't always those values.
The "20.20.20" can be "30.30.30" but it's always after the . which is
next to the second g in Agg and it's always the same length. The all
might not always be
2008 May 13
3
Regular Expressions
Hi R,
Again stuck with regular expressions...
Suppose,
S=c("World_is_beautiful", "one_two_three_four","My_book")
I need to extract the last but one element of the strings. So, my output should look like:
Ans=c("is","three","My")
gsub() can do this...but wondering how do I give the regular expression....
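Two possible ways to get the last-but-one element, sketched under the assumption that every string contains at least one underscore. The first avoids regular expressions; the second uses sub() with a Perl-style pattern. The names parts, ans_split and ans_regex are illustrative.

```r
S <- c("World_is_beautiful", "one_two_three_four", "My_book")

# Option 1: split on "_" and take the second-to-last piece of each string
parts <- strsplit(S, "_", fixed = TRUE)
ans_split <- vapply(parts, function(p) p[length(p) - 1], character(1))

# Option 2: capture the field that is followed by exactly one more field
ans_regex <- sub("^(?:.*_)?([^_]+)_[^_]+$", "\\1", S, perl = TRUE)

print(ans_split)
print(ans_regex)
```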
2008 Aug 06
2
matching problem
I have a matching problem that I can't solve.
mystring = "xxx{XX}yy{YYY}zzz{Z}" where "x","X","y","Y","z","Z" basically can
be anything, letters, digits etc. I'm only interested in the content within
each "{}".
I am close but not really there yet.
library(gsubfn)
strapply(mystring,"\\{[^\\}]+",, perl=F)
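A base R sketch for the same task, given as an alternative to the gsubfn attempt above rather than a fix for it: match each brace-delimited group, then strip the braces. It assumes the groups are not nested; the names with_braces and contents are illustrative.

```r
mystring <- "xxx{XX}yy{YYY}zzz{Z}"

# Find every "{...}" group, braces included
with_braces <- regmatches(mystring, gregexpr("\\{[^}]*\\}", mystring))[[1]]

# Drop the first and last character of each match to remove the braces
contents <- substr(with_braces, 2, nchar(with_braces) - 1)
print(contents)
```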
2010 Mar 15
2
tcltk and R
I have had some comments on sqldf regarding its dependence on tcltk
such as the second last sentence on this blog post:
http://translate.google.com/translate?hl=en&sl=zh-CN&u=http://www.wentrue.net/blog/%3Fp%3D453&prev=http://blogsearch.google.com/blogsearch%3Fhl%3Den%26ie%3DUTF-8%26q%3Dsqldf%26lr%3D%26sa%3DN%26start%3D10
sqldf does not directly use tcltk but it does use strapply in
2011 Oct 18
9
readRDS and saveRDS
Hi all,
Is there any chance that readRDS and saveRDS might one day become
read.rds and write.rds? That would make them more consistent with the
other reading and writing functions.
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
2010 Oct 13
5
Regular expression to find value between brackets
Hi,
this should be an easy one, but I can't figure it out.
I have a vector of tests, with their units between brackets (if they have
units).
eg tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")
Now I would like to have a function where I use a test as input, and which
returns the units
like:
f <- function (x) sub("\\)",
2010 Jul 02
3
Good Package(s) for String and URL processing?
Are there packages that allow improved String and URL processing?
E.g. extract parts of a URLs such as sub-domains, top-level domain,
protocols (e.g. https, http, ftp), file type based on endings, check
if a URL is valid or not, etc...
I am currently only using split and paste. Are there better and more
efficient ways to handle strings e.g. finding sub-strings or to do
pattern matching?
What
2008 Jul 04
4
Read in a file - produce independent vectors
Is there a way of reading in a file in a way that each line becomes a vector:
for example:
meals.txt
breakfast bacon eggs sausage
lunch sandwich apple marsbar crisps
dinner chicken rice custard pie
I want to read in this file and end up with 3 different vectors, one called
breakfast which contains "bacon", "eggs", "sausage". One called