Displaying 20 results from an estimated 5000 matches similar to: "Spliting columns, strings or reg exp returning substrings"
2009 Jul 28
2
aggregating strings
I am currently summarising a data set by collapsing data based on common identifiers in a column. I am using the 'aggregate' function to summarise numeric columns, i.e. "aggregate(dat[,3], list(dat$gene), mean)". I also wish to summarise text columns e.g. by concatenating values in a comma separated list, but the aggregate function can only return scalar values and so something
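A minimal sketch of one way to collapse a text column per group, assuming the goal is a comma-separated string for each identifier. `paste()` with `collapse` returns a single string, so it fits what `aggregate` expects. The data frame `dat` and the column names `score` and `desc` are hypothetical stand-ins, not taken from the post.

```r
# Hypothetical data: 'gene' is the grouping key, 'desc' a text column.
dat <- data.frame(
  gene  = c("g1", "g1", "g2"),
  score = c(1, 3, 5),
  desc  = c("kinase", "membrane", "nuclear"),
  stringsAsFactors = FALSE
)

# Numeric column: mean per gene, in the style quoted in the post.
num <- aggregate(dat$score, list(gene = dat$gene), mean)

# Text column: paste() with 'collapse' yields one string per group.
txt <- aggregate(dat$desc, list(gene = dat$gene),
                 function(v) paste(v, collapse = ","))
```

Both results carry the grouping column `gene` and a value column named `x`, so they can be combined with `merge(num, txt, by = "gene")` if needed.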
2009 Jan 20
2
Merging tables
I am relatively new to R and am trying to do some basic data manipulation. Basically I have a table (csv - table 1) of data for a set of samples (rows), and a second table (table 2) of information about a subset of samples of particular interest. I want to pull out the data from table 1 for the samples in table 2, either by:
* Merging the two tables based on a common identifier (SampleID - may
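The merge described above can be sketched as follows, assuming both tables share a column named `SampleID`. The tables and the columns `value` and `group` are made up for illustration. The `%in%` filter is an extra alternative and is not necessarily the second option the truncated post goes on to describe.

```r
# Hypothetical stand-ins for the two tables in the post.
table1 <- data.frame(SampleID = c("s1", "s2", "s3", "s4"),
                     value    = c(10, 20, 30, 40),
                     stringsAsFactors = FALSE)
table2 <- data.frame(SampleID = c("s2", "s4"),
                     group    = c("case", "control"),
                     stringsAsFactors = FALSE)

# Inner join on the common identifier: keeps only samples present in both.
merged <- merge(table1, table2, by = "SampleID")

# Alternative: keep table1 rows whose SampleID appears in table2.
subset_only <- table1[table1$SampleID %in% table2$SampleID, ]
```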
2006 Apr 17
1
strsplit does not return correct value when spliting "" (PR#8777)
Full_Name: Charles Dupont
Version: 2.2.0
OS: linux
Submission from: (NULL) (160.129.129.136)
When
strsplit("", " ")
returns character(0)
whereas
strsplit("a", " ")
returns "a".
These return values are not consistent with each other.
Charles Dupont
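The behaviour reported above can be reproduced with the sketch below. The helper `split_keep_empty` is hypothetical: it is one possible workaround for callers who want an empty input to yield `""` rather than a zero-length vector, and it is not part of base R.

```r
# strsplit() returns a list; take the first element to see the pieces.
empty  <- strsplit("", " ")[[1]]   # zero-length character vector
single <- strsplit("a", " ")[[1]]  # the single piece "a"

# Hypothetical helper: replace zero-length results with "".
split_keep_empty <- function(x, split) {
  parts <- strsplit(x, split)
  parts[lengths(parts) == 0] <- list("")
  parts
}
```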
2019 Feb 20
2
Bug: time complexity of substring is quadratic as string size and number of substrings increases
Hi all, (and especially hi to Tomas Kalibera who accepted my patch sent
yesterday)
I believe that I have found another bug, this time in the substring
function. The use case that I am concerned with is when there is a single
(character scalar) text/subject, and many substrings to extract. For example
substring("AAAA", 1:4, 1:4)
or more generally,
N=1000
2019 Feb 22
1
Bug: time complexity of substring is quadratic as string size and number of substrings increases
On 2/20/19 7:55 PM, Toby Hocking wrote:
> Update: I have observed that stringi::stri_sub is linear time complexity,
> and it computes the same thing as base::substring. figure
> https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png
> source:
> https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R
>
> To me this is a
2010 Jul 07
3
use sliding window to count substrings found in large string
Hello together,
I'm looking for advice on how to do some tests on strings.
What I want to do is the following:
(just an example, real strings/sequence are about 200-400 characters long)
given set of Strings:
String1 abcdefgh
String2 bcdefgop
use a sliding window of size x to create a vector of all subsequences
of size x found in the set (order matters!).
Now create, for every string
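A rough sketch of the sliding-window step using vectorised `substring()`. The post is cut off before it says what should be created for every string, so the per-string count table below is only an assumed next step. The names `windows`, `all_windows` and `counts` are illustrative.

```r
# All substrings of width x, in order of position.
windows <- function(s, x) {
  n <- nchar(s)
  if (n < x) return(character(0))
  substring(s, 1:(n - x + 1), x:n)
}

strings <- c(String1 = "abcdefgh", String2 = "bcdefgop")
x <- 3

# Every distinct window found anywhere in the set, in first-seen order.
all_windows <- unique(unlist(lapply(strings, windows, x = x)))

# Assumed next step: count each window within each string.
counts <- sapply(strings, function(s) {
  tabulate(match(windows(s, x), all_windows), nbins = length(all_windows))
})
rownames(counts) <- all_windows
```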
2011 Apr 11
1
Getting many substrings but only loading the original string one time.
Hi All,
I'm looking for a way to get many substrings from a longer string and
then stitch them together. But, since the longer string is really, really
long (like 250 MB long), I don't want to do this in a loop and load and
re-load the longer string many times. Does anybody have an idea?
Maybe I could pass in two vectors (the first would have the starting
coordinates, and the second
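The idea of passing start and end vectors can be sketched with base `substring()`, which is vectorised over both positions, so the long string is referenced in a single call. The values below are small stand-ins for illustration.

```r
# Small stand-in for the very long string described in the post.
long_string <- "abcdefghijklmnopqrstuvwxyz"

# Hypothetical start and end coordinates of the pieces to extract.
starts <- c(1, 10, 20)
ends   <- c(3, 12, 22)

# One vectorised call extracts every piece; paste() stitches them together.
pieces   <- substring(long_string, starts, ends)
stitched <- paste(pieces, collapse = "")
```

Other threads in this listing report that `substring()` can become slow with very many substrings and name `stringi::stri_sub` as a faster alternative, so it may be worth timing both on real data.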
2007 Dec 13
6
spliting strings ...
Hi everyone,
I have a vector of strings, each string made up of a different number of words. I want to get a new vector that holds only the first word of each string in the first vector. I came up with this:
str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
str1 <- rep(1, length(str))
for (i in 1:length(str)) {
str1[i] <- strsplit(str, "
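For comparison, two loop-free sketches that extract the first word, assuming words are separated by single spaces. These are alternatives to the truncated loop above, not a reconstruction of it.

```r
str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')

# Option 1: delete everything from the first space onward.
first_word <- sub(" .*", "", str)

# Option 2: split on spaces and take the first piece of each element.
first_word2 <- vapply(strsplit(str, " ", fixed = TRUE), `[`, character(1), 1)
```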
2019 Feb 20
0
Bug: time complexity of substring is quadratic as string size and number of substrings increases
Update: I have observed that stringi::stri_sub is linear time complexity,
and it computes the same thing as base::substring. figure
https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png
source:
https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R
To me this is a clear indication of a bug in substring, but again it would
be nice to have
2005 Oct 28
3
splitting a character field in R
Dear R users,
I have a dataframe with one character field, and I would like to create two
new fields (columns) in my dataset by splitting the existing character
field into two at an existing substring.
... something that in SAS I could solve, e.g., by combining substr (which I am
aware exists in R) and "index" to determine the position of the pattern
within the string.
e.g. if my
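A minimal sketch of the substr-plus-index approach in R, where `regexpr()` plays the role of SAS `index`. The data frame, the column name `code` and the separator `"-"` are assumptions, since the post's own example is cut off.

```r
# Hypothetical data: one character field with "-" as the separator.
df <- data.frame(code = c("abc-123", "de-4567"), stringsAsFactors = FALSE)

# Position of the first occurrence of the pattern in each string.
pos <- as.integer(regexpr("-", df$code, fixed = TRUE))

# Everything before and after the pattern becomes a new column.
df$left  <- substr(df$code, 1, pos - 1)
df$right <- substr(df$code, pos + 1, nchar(df$code))
```

`regexpr()` returns -1 where the pattern is absent, so rows without the separator would need separate handling.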
2012 Jan 10
2
strange Sys.Date() side effect
Any ideas what is the problem with this code?
> N <- 2; c(Sys.Date(), sprintf('N = %d', N))
[1] "2012-01-10" NA
Warning message:
In as.POSIXlt.Date(x) : NAs introduced by coercion
Best regards,
Ryszard
Ryszard Czerminski
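A likely explanation is that `c()` dispatches on the class of its first argument, so the Date method is used and tries to treat the string as a date. A sketch of one way to avoid this is to convert the date to character before combining:

```r
N <- 2
today <- Sys.Date()

# Format the date first so that c() combines two character values.
out <- c(format(today), sprintf("N = %d", N))
```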
2006 Apr 17
0
(PR#8777) strsplit does [not] return correct value when spliting ""
Prof Brian Ripley wrote:
> On Mon, 17 Apr 2006, Charles Dupont wrote:
[...]
> > The man page states in the value section that strsplit returns:
> > A list of length 'length(x)' the 'i'-th element of which contains
> > the vector of splits of 'x[i]'.
> >
> > It mentions no change in behavior if the value of x[i] = "".
>
2006 Jul 28
1
spliting
Dear mailing list,
I have a big data frame in which each element holds two letters. I
want to split those letters so that each element holds one
letter and the number of columns is doubled. Can someone help
with the code?
Here is an example of what I want to split.
Input (three column)
GG AG AG
CC CC CC
CC CC CC
AG
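One possible sketch, assuming every cell holds exactly two characters. The column names `V1` to `V3` and the `_1`/`_2` suffixes are illustrative, and only the three complete input rows shown above are used.

```r
# The three complete input rows from the example.
geno <- data.frame(V1 = c("GG", "CC", "CC"),
                   V2 = c("AG", "CC", "CC"),
                   V3 = c("AG", "CC", "CC"),
                   stringsAsFactors = FALSE)

# Split each column into its first and second character, then bind side by side.
alleles <- do.call(cbind, lapply(geno, function(col) {
  cbind(substr(col, 1, 1), substr(col, 2, 2))
}))
colnames(alleles) <- paste0(rep(names(geno), each = 2), c("_1", "_2"))
```

The result is a character matrix with twice as many columns as the input.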
2008 Jul 17
2
spliting a string
Hi
String<-"130.5"
Df<-Strsplit(".",":",String)
Then Df gets
"" "" "" ""
But I want Df to contain
Df
"130"
"5"
If anybody knows how to do it, please tell me.
Thanks
K.Ravichandra
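The empty strings arise because the `split` argument of `strsplit()` is treated as a regular expression, in which `.` matches any character. A short sketch of two ways to split on a literal dot:

```r
String <- "130.5"

# Option 1: treat the split pattern as a fixed string, not a regex.
Df <- strsplit(String, ".", fixed = TRUE)[[1]]

# Option 2: escape the dot in the regular expression.
Df2 <- strsplit(String, "\\.")[[1]]
```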
2010 Nov 01
1
spliting first 10 words in a string
Hi all,
I have a column with text that has quite a few words in it. I would like to split these words into separate columns, but only the first ten words in each string. Is that possible in R?
Thank you, m
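A sketch of one approach, assuming words are separated by one or more spaces. `text_col` is a hypothetical stand-in for the column. Strings with fewer than ten words are padded with `NA`.

```r
text_col <- c("one two three four five six seven eight nine ten eleven twelve",
              "short text")

# Split each string into words.
words <- strsplit(text_col, " +")

# Keep the first ten words; indexing past the end yields NA.
first10 <- t(sapply(words, function(w) w[1:10]))
```

`first10` is a character matrix with one row per string and ten columns. It can be attached to a data frame with `cbind()`.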
2007 Nov 28
1
how to find and use specific column after spliting dataframe
Dear all:
I am a new R user and I have 2 questions
about it.
1) I need to find a specific sub-dataframe,
and then use a specific column in a calculation.
For example, after splitting the dataframe, I find the specific
sub-dataframe, such as 'A.split [1]'.
But I don't know how to find the 'time' and 'concentration' columns of 'A.split
[1]'.
2) The equation used to sub-dataframe is
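For the first question, a minimal sketch assuming the data frame was split with `split()`. The data and the grouping column `subject` are hypothetical. The main point is that `[[1]]` returns the sub-dataframe itself, while `[1]` returns a list of length one.

```r
# Hypothetical data with the two columns named in the post.
A <- data.frame(subject       = c(1, 1, 2, 2),
                time          = c(0, 1, 0, 1),
                concentration = c(5, 3, 6, 2))

A.split <- split(A, A$subject)

# Double brackets extract the data frame; its columns are then available.
first      <- A.split[[1]]
first_time <- first$time
first_conc <- first$concentration

# The same calculation can be applied to every sub-dataframe.
max_conc <- sapply(A.split, function(d) max(d$concentration))
```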
2009 Dec 17
2
Problem with spliting a dataframe values
Hi all,
This is Kiran.
I am having a problem splitting a dataframe.
I have a string like: "a,b,c|1,2,3|4,5,6|7,8,8"
First I have to split it on "|".
I did it with the command
unlist(strsplit("a,b,c|1,2,3|4,5,6|7,8,8", "\\|"))
After getting that set I made it into a dataframe, and it comes out like
a,b,c
1,2,3
4,5,6
7,8,8
now i have to
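The post breaks off before the next step. As a guess at the intent, the sketch below splits on `|` first and then on commas, treating the first group as column names. That treatment is an assumption. Note that `|` is the alternation operator in a regular expression, so it needs `fixed = TRUE` or escaping.

```r
s <- "a,b,c|1,2,3|4,5,6|7,8,8"

# Split on the literal pipe character.
rows <- strsplit(s, "|", fixed = TRUE)[[1]]

# Assumption: the first group holds the column names, the rest are data rows.
df <- read.csv(text = paste(rows[-1], collapse = "\n"),
               header = FALSE,
               col.names = strsplit(rows[1], ",", fixed = TRUE)[[1]])
```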
2005 Oct 20
5
spliting an integer
Hi there,
From the vector X of integers,
X = c(11999, 122000, 81997)
I would like to make these two vectors:
Z= c(1999, 2000, 1997)
Y =c(1 , 12 , 8)
That is, each entry of vector Z receives the four last digits of each entry of X, and Y receives "the rest".
Any suggestions?
Thanks in advance,
Dimitri
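Since the last four digits are wanted, one sketch uses integer arithmetic rather than string handling: the remainder and the integer quotient on division by 10000.

```r
X <- c(11999, 122000, 81997)

# Last four digits: remainder after dividing by 10000.
Z <- X %% 10000

# "The rest": integer quotient.
Y <- X %/% 10000
```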
2011 Nov 23
2
bizarre seq() behavior?
Is there any rational explanation for the bizarre seq() behavior below?
> seq(2,8.1, lenght.out=3)
[1] 2 3 4 5 6 7 8
> help(seq)
> seq(2,8,length.out=3)
[1] 2 5 8
> seq(2,8.1,length.out=3)
[1] 2.00 5.05 8.10
Except maybe that it is early in the morning :)
Best regards,
Ryszard
Ryszard Czerminski
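The output in the first call is consistent with the misspelled `lenght.out` not matching any named argument of `seq()` and being dropped, so the call behaves like `seq(2, 8.1)` with the default step of 1. Newer R versions may warn about the extra argument. A short sketch of the two behaviours, with correct spelling:

```r
# Default step of 1: stops at the last value not exceeding 8.1.
b <- seq(2, 8.1)

# Correctly spelled length.out: three equally spaced values.
a <- seq(2, 8.1, length.out = 3)
```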
2012 Jan 12
3
strsplit() does not split on "."?
Any ideas what is wrong?
> strsplit("a.b", ".") # generates empty strings with split="."
[[1]]
[1] "" "" ""
> strsplit("a b", " ") # seems to work fine with split=" ", and other characters...
[[1]]
[1] "a" "b"
>
> R.Version()
$platform
[1]