search for: strapplyc

Displaying 7 results from an estimated 7 matches for "strapplyc".

Did you mean: strapply
2012 Jul 06
2
Maximum number of patterns and speed in grep
Hi, I am using R's grep function to find patterns in vectors of strings. The number of patterns I would like to match is 7,700 (of different sizes). I noticed that I get an error message when I do the following: data <- array() for (j in 1:length(x)) { array[j] <- length(grep(paste(patterns[1:7700], collapse = "|"), x[j], value = T)) } When I break this up into 4 chunks of
2012 May 14
3
Scraping a web page.
Folks, I want to scrape a series of web-page sources for strings like the following: "/en/Ships/A-8605507.html" "/en/Ships/Aalborg-8122830.html" which appear in an href inside an <a> tag inside a <div> tag inside a table. In fact all I want is the (exactly) 7-digit number before ".html". The good news is that as far as I can tell the the <a>
2013 Jan 14
4
Grabbing Specific Words from Content (basic text mining)
Hi all, Suppose I have a data frame with mixed content (name age and address). a<-"Name: John Smith Age: 35 Address: 32, street, sub, something" b<-data.frame(a) 1. The question is I want to extract the name age and address separately from this data frame (containing potentially more people). 2. Also just incase I have to deal with it how would the syntax change if I had
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to
2023 Apr 12
1
Split String in regex while Keeping Delimiter
On Wed, 12 Apr 2023 08:29:50 +0000 Emily Bakker <emilybakker at outlook.com> wrote: > Some example data: > ?leucocyten + gramnegatieve staven +++ grampositieve staven ++? > ?leucocyten ? grampositieve coccen +? > ? > I want to split the strings such that I get the following result: > c(?leucocyten +?, ??gramnegatieve staven +++?, > ??grampositieve staven ++?) >
2012 Jan 07
3
Getting a list of unique gene names from a list with semi-colons
Hello, I have one column in my dataframe that has gene names of interest. Unfortunately, due to the fact that some probes lie between two genes or two transcripts of a gene, it looks something like this - FAM81A LOC283050;LOC283050;LOC283050;ZMIZ1 PINK1;PINK1 MRPL12;MRPL12 C1orf114 MMS19;UBTD1 I would like to know how to get a list with all the names with no semi-colons and removing the
2013 Jun 16
2
extract all numbers from a string
Hi all, I have been beating my head against this problem for a bit, but I can't figure it out. I have a series of strings of variable length, and each will have one or more numbers, of varying format. E.g., I might have: tmpstr = "The first number is: 32. Another one is: 32.1. Here's a number in scientific format, 0.3523e10, and another, 0.3523e-10, and a negative,