similar to: How to use access results of gregexpr in data frames

Displaying 19 results from an estimated 2000 matches similar to: "How to use access results of gregexpr in data frames"

2011 Jun 09
3
How to subset based on column name that is a number ?
Hi, I have a data frame with column names "1", "2", "3", ... and I'd like to extract a subset based on the values in the first column. None of the methods I tried worked (below). x <- subset(dframe, 1 == "My Text") x <- subset(dframe, "1" == "My Text") x <- subset(dframe, names(dframe)[1] == "My Text") Q
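A sketch of the usual fix (dframe and the text value are taken from the post): subset() evaluates its condition inside the data frame, so a non-syntactic name like "1" must be quoted with backticks; plain bracket indexing avoids the issue entirely.

    dframe <- data.frame(`1` = c("My Text", "Other"), check.names = FALSE)
    subset(dframe, `1` == "My Text")       # backticks make "1" a usable name
    dframe[dframe[["1"]] == "My Text", ]   # equivalent bracket form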
2006 Nov 07
1
Gregexpr - extract results with lapply
Gregexpr - extract results with lapply Hello, I need to extract sequences of three upper-case letters from a string. In other words, in this string: str <- c("ABC", "this WOUld be gOOD") The result I'm looking for is ABC WOU OOD. With gregexpr, I can get the position and length of the sequences gregexpr('[A-Z]{3}',str,perl=TRUE) [[1]] [1] 1
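The usual completion of that approach (str and the pattern come from the post): feed gregexpr's start positions and match lengths to substring(). In later R (>= 2.14.0), regmatches() does the same bookkeeping in one step.

    str <- c("ABC", "this WOUld be gOOD")
    m <- gregexpr("[A-Z]{3}", str, perl = TRUE)
    mapply(function(s, st) substring(s, st, st + attr(st, "match.length") - 1L),
           str, m)              # list: "ABC", then "WOU" "OOD"
    regmatches(str, m)          # same result in one call (R >= 2.14.0)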
2011 Aug 17
2
question regarding gregexpr and read.table
Hi, I have a silly question regarding the usage of two commands: read.table and gregexpr. For read.table, if I read a matrix and set header = T, I found that all the dashes ("-") become dots (".") A = read.table("Matrix.txt", sep = "\t", header = F) A[1,1] # "A-B-C-D". A = read.table("Matrix.txt", sep = "\t", header = T)
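The dashes are unchanged in the file itself; with header = TRUE, read.table() passes the column names through make.names(), which turns "-" into ".". A sketch of the standard fix (file name as in the post):

    A <- read.table("Matrix.txt", sep = "\t", header = TRUE, check.names = FALSE)
    names(A)   # names keep their dashes, e.g. "A-B-C-D"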
2009 Feb 25
1
Using gregexpr with multiple search elements
Dear list, I am trying to use gregexpr to see if entries in a dataframe have either of two possible values for a string. Here's an example: text <- c("fat", "rat", "cat", "dog", "log", "fish") If I just wanted to find whether any one of the elements in text matches the pattern "at" I would do gregexpr("\\at", text)
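A sketch of the usual answer: combine the alternatives with "|" in one pattern; grepl() is enough if only presence or absence matters.

    text <- c("fat", "rat", "cat", "dog", "log", "fish")
    gregexpr("at|og", text)   # match positions per element; -1 means no match
    grepl("at|og", text)      # TRUE TRUE TRUE TRUE TRUE FALSE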
2019 Feb 19
1
patch for gregexpr(perl=TRUE)
Hi all, Several people have noticed that gregexpr is very slow for large subject strings when perl=TRUE is specified. - https://stackoverflow.com/questions/31216299/r-faster-gregexpr-for-very-large-strings - http://r.789695.n4.nabble.com/strsplit-perl-TRUE-gregexpr-perl-TRUE-very-slow-for-long-strings-td4727902.html - https://stat.ethz.ch/pipermail/r-help/2008-October/178451.html I figured out
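For context, a minimal timing sketch of the kind of case the patch targets (string size and pattern are illustrative, not from the thread):

    subject <- paste(rep("acgt", 250000L), collapse = "")  # 1,000,000 characters
    system.time(gregexpr("gt", subject, perl = TRUE))      # slow before the patch
    system.time(gregexpr("gt", subject, perl = FALSE))     # baseline for comparison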
2007 Oct 10
4
gregexpr (PR#9965)
Full_Name: Peter Dolan Version: 2.5.1 OS: Windows Submission from: (NULL) (128.193.227.43) gregexpr does not find all matching substrings if the substrings overlap: > gregexpr("abab","ababab") [[1]] [1] 1 attr(,"match.length") [1] 4 It does work correctly in Version 2.3.1 under Linux.
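A commonly suggested workaround (not part of the report itself): match a zero-width lookahead with perl = TRUE, so the search can resume inside an earlier match.

    gregexpr("(?=abab)", "ababab", perl = TRUE)
    # reports positions 1 and 3; match.length is 0 because the
    # lookahead consumes no characters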
2008 Jan 31
1
segfault in gregexpr()
Hi, Tried with R 2.6 and R 2.7: > gregexpr("", "abc", fixed=TRUE) *** caught segfault *** address 0x1c09000, cause 'memory not mapped' Traceback: 1: gregexpr("", "abc", fixed = TRUE) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace
2006 Oct 07
2
gregexpr in R 2.3.0 != gregexpr in R 2.4.0
Hi all I have a question regarding differences in the way gregexpr works in R 2.3.0 and R 2.4.0. In R 2.3.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T) [[1]] [1] 1 3 5 7 9 attr(,"match.length") [1] 5 5 5 5 5 ... while in R 2.4.0, this is what happens: > gregexpr(" [a-z] [a-z] ", " a b c d e f ", perl=T)
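From 2.4.0 on, the engine reports only non-overlapping matches, so some of the five hits above disappear. A sketch of how to recover the 2.3.0-style overlapping matches in later versions, using a zero-width lookahead plus substring() (the lookahead matches themselves have zero length):

    s <- " a b c d e f "
    starts <- gregexpr("(?= [a-z] [a-z] )", s, perl = TRUE)[[1]]
    substring(s, starts, starts + 4L)   # each match is 5 characters wide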
2008 Dec 12
4
gregexpr - match overlap mishandled (PR#13391)
Full_Name: Reid Thompson Version: 2.8.0 RC (2008-10-12 r46696) OS: darwin9.5.0 Submission from: (NULL) (129.98.107.177) The gregexpr() function does NOT return a complete list of global matches as it should. This occurs when a pattern matches two overlapping portions of a string; only the first match is returned. The following function call demonstrates this error (although this is not how I
2006 Feb 01
1
Word boundaries and gregexpr in R 2.2.1
Hi, I have a question concerning how to match word boundaries which I bet has a very simple answer, but I haven't found it by trial and error or by searching the help archives for the terms in the subject line. The problem is this: I have a vector of two character strings. text<-c("This is a first example sentence.", "And this is a second example sentence.") If I
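For reference, a sketch of the standard idiom: "\\b" is the word-boundary anchor in R's regular expressions (the example strings are the ones from the post):

    text <- c("This is a first example sentence.",
              "And this is a second example sentence.")
    gregexpr("\\bis\\b", text)   # whole-word "is" only, not the "is" in "This"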
2006 Feb 01
1
Word boundaries and gregexpr in R 2.2.1 (PR#8547)
Full_Name: Stefan Th. Gries Version: 2.2.1 OS: Windows XP (Home and Professional) Submission from: (NULL) (68.6.34.104) The problem is this: I have a vector of two character strings. > text<-c("This is a first example sentence.", "And this is a second example sentence.") If I now look for word boundaries with regexpr, this is what I get: >
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to
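Two standard ways to get just the captured text for this toy pattern (a real pattern would substitute its own groups): a PCRE lookbehind keeps the prefix out of the match entirely, while regexec() returns capture groups but only for the first match per string.

    temp <- "abcd1234abcd1234"
    regmatches(temp, gregexpr("(?<=abcd)1234", temp, perl = TRUE))[[1]]
    # [1] "1234" "1234"
    regmatches(temp, regexec("abcd(1234)", temp))[[1]]
    # [1] "abcd1234" "1234"   (full match, then the capture; first match only)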
1999 Oct 19
2
Summary bug?
Hi, It seems that there's a bug in summary, in the max. output... but max() alone works fine. > hw04.dframe$area ... [41] 1790 1380 1296 2745 798 2306 438649 1481 1559 2450 ... > summary(hw04.dframe) area Min. : 798 1st Qu.: 1349 Median : 1690 Mean : 6962 3rd Qu.: 2306 Max. :438600 ### should read 438649 or, to the point,
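This is display rounding rather than a bug: summary() formats its output to about four significant digits by default, while max() prints the stored value. A sketch with made-up numbers in the same range:

    x <- c(798, 1349, 1690, 2306, 438649)
    summary(x)               # Max. prints as 438600 (4 significant digits)
    summary(x, digits = 7)   # Max. prints as 438649
    max(x)                   # 438649; the stored value was never rounded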
2008 Oct 31
1
gregexpr slow and increases exponentially with string length --> how to speed it up?
Dear All, I have a long string and need to search it for regular expression matches. However, the search becomes horribly slow as the string length increases. Below is an example: when "i" increases by 5, the time spent increases by more! (my string is 11,000,000 letters long!) I also noticed that - the search time increases dramatically with the number of matches found. - the perl=T option
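Two switches that often help (a sketch; whether they apply depends on the pattern): useBytes = TRUE skips per-call encoding translation, and fixed = TRUE bypasses the regex engine for literal patterns.

    s <- paste(sample(letters, 1e6, replace = TRUE), collapse = "")
    system.time(gregexpr("abc", s, perl = TRUE))
    system.time(gregexpr("abc", s, perl = TRUE, useBytes = TRUE))
    system.time(gregexpr("abc", s, fixed = TRUE))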
2011 Aug 16
2
How to use 'switch' with strings containing spaces?
Hi, Does anyone know if the alternatives in the 'switch' function can be specified as strings containing spaces?  Neither of the two approaches below works. switch(expr, "Choice 1"="My first choice", "Choice 2"="My 2nd choice", "Choice 3"="My 3rd choice") x <- c("Choice 1", "Choice 2", "Choice
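For the record, switch() does accept quoted alternative names containing spaces; what fails is passing a vector, because EXPR must have length one. A sketch of both points:

    f <- function(e) switch(e,
                            "Choice 1" = "My first choice",
                            "Choice 2" = "My 2nd choice",
                            "Choice 3" = "My 3rd choice")
    f("Choice 2")                         # works: names with spaces are fine
    sapply(c("Choice 1", "Choice 3"), f)  # vectors need an explicit loop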
2008 Nov 25
1
Efficient passing through big data.frame and modifying select
> -----Original Message----- > From: William Dunlap > Sent: Tuesday, November 25, 2008 9:16 AM > To: 'johannes_graumann at web.de' > Subject: Re: [R] Efficient passing through big data.frame and > modifying select fields > > > Johannes Graumann johannes_graumann at web.de > > Tue Nov 25 15:16:01 CET 2008 > > > > Hi all, > > > >
2008 Dec 23
1
quotation problem/dataframe names as function input argument.
Dear R friends: Can someone help me with the following problem? Many thanks in advance. # Problem Description: # I want to write functions which take a (character) vector of dataframe names as input argument. # For example, I want to extract the number of observations from a number of dataframes. # I tried the following: nobs.fun <- function (dframe.vec) { nobs.vec <-
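A sketch of the usual completion of that function: look each name up with get() and take nrow() (the data frames here are hypothetical):

    nobs.fun <- function(dframe.vec) {
      sapply(dframe.vec, function(nm) nrow(get(nm)))
    }
    df1 <- data.frame(a = 1:3); df2 <- data.frame(a = 1:5)
    nobs.fun(c("df1", "df2"))
    # df1 df2
    #   3   5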
2012 Nov 21
3
Problems understanding use of regular expression (in gsub) for manipulating currency
Hello, After reading help file, various threads on this board, and other online tutorials, I've attempted to use gsub (using Perl-like syntax) to change a currency string into something that can be converted to numeric type using only one regular expression.  Can anybody point out my error?  Note that >  x <- "\"$ 1,200,300,400.50\"" Tried the following in an
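A single regular expression does suffice; a sketch (not necessarily the thread's eventual answer) that deletes every character except digits and the decimal point:

    x <- "\"$ 1,200,300,400.50\""
    cleaned <- gsub("[^0-9.]", "", x)   # "1200300400.50"
    as.numeric(cleaned)                 # 1200300400.5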
2005 Nov 09
2
error in NORM lib
Dear all, I am experiencing very strange behavior when imputing NAs with the NORM library. I use R 2.2.0, win32. The code is below, and the same dataset was also tried with MICE and aregImpute() from Hmisc _without_ any problem. The problem is as follows: (1) using the whole dataset results in very strange imputations - values far beyond the maximum of the respective column, >