charles.dupont at vanderbilt.edu
2006-Apr-17 14:00 UTC
[Rd] strsplit does not return correct value when splitting "" (PR#8777)
Full_Name: Charles Dupont
Version: 2.2.0
OS: linux
Submission from: (NULL) (160.129.129.136)

strsplit("", " ") returns character(0), whereas strsplit("a", " ") returns "a".

These return values are not consistent with each other.

Charles Dupont
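The two calls in the report can be compared side by side. This is only a minimal sketch; the variable names are illustrative and not part of the original report. Note that strsplit returns a list, so the reported values are the first elements of that list.

```r
# Reproduce the two calls from the report (variable names are illustrative).
empty_result   <- strsplit("", " ")
nomatch_result <- strsplit("a", " ")

# strsplit always returns a list with one element per input string.
length(empty_result)          # 1
length(nomatch_result)        # 1

# The difference is inside the list: no pieces versus one piece.
length(empty_result[[1]])     # 0, the element is character(0)
length(nomatch_result[[1]])   # 1, the element is "a"
```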
Charles Dupont
2006-Apr-17 17:27 UTC
[Rd] (PR#8777) strsplit does [not] return correct value when splitting ""
Now using R 2.3.0.

I have a string that can be "". I want to find the maximum screen width of all the lines in the string, so I run the commands

> x <- c("hello", "bob is\ngreat", "foo", "", "bar")
> substrings <- strsplit(x, "\n")
> sapply(substrings, FUN=function(x) max(nchar(x, type="width")))

which return

[1]    5    6    3 -Inf    3

The -Inf appears because strsplit gives a zero-length vector for "", and max of a zero-length vector is -Inf.

For a string that is not ""

> strsplit("Hello\nBob", "\n")

it returns

[[1]]
[1] "Hello" "Bob"

For a string that is ""

> strsplit("", "\n")

it returns

[[1]]
character(0)

I would expect

[[1]]
[1] ""

because "" is a character vector of length 1 containing a string of length 0, not a character vector of length 0. For any other string, if the split string is not matched in argument x, strsplit returns the original string x.

The man page states in the Value section that strsplit returns:

     A list of length 'length(x)' the 'i'-th element of which contains
     the vector of splits of 'x[i]'.

It mentions no change in behavior when x[i] is "".

Prof Brian Ripley wrote:
> Please use a current version of R: we are at 2.3.0RC (and we do ask you
> not to report on obsolete versions).
>
> What rule are you using, and where did you find it in the R documentation?
>
> In fact
>
>> strsplit("", " ")
>
> [[1]]
> character(0)
>
> which is not as you stated. This is a feature, as it is distinct from
>
>> strsplit(" ", " ")
>
> [[1]]
> [1] ""
>
> Consider also
>
>> strsplit("", "")
>
> [[1]]
> character(0)
>
>> strsplit("a", "")
>
> [[1]]
> [1] "a"
>
>> strsplit("ab", "")
>
> [[1]]
> [1] "a" "b"
>
>
> On Mon, 17 Apr 2006, charles.dupont at vanderbilt.edu wrote:
>
>> Full_Name: Charles Dupont
>> Version: 2.2.0
>> OS: linux
>> Submission from: (NULL) (160.129.129.136)
>>
>>
>> when
>>
>> strsplit("", " ")
>>
>> returns character(0)
>>
>> where as
>>
>> strsplit("a", " ")
>>
>> returns "a".
>>
>> these return values are not consistent with each other.
>>
>> Charles Dupont
>

-- 
Charles Dupont
Computer System Analyst
School of Medicine
Department of Biostatistics
Vanderbilt University
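Given that strsplit returns character(0) for "", one possible workaround for the width computation above is to replace zero-length results with a single empty string before taking the maximum. This is only a sketch, not something proposed in the thread: the fix-up step and the name widths are assumptions, and lengths() requires a newer R (3.2.0 or later) than the versions discussed here.

```r
x <- c("hello", "bob is\ngreat", "foo", "", "bar")
substrings <- strsplit(x, "\n")

# Assumed fix-up (not from the thread): treat an input that produced
# no pieces as one empty line, so that max() has something to work on.
substrings[lengths(substrings) == 0L] <- list("")

# Widest line per input string; nchar("", type = "width") is 0.
widths <- vapply(substrings,
                 function(s) max(nchar(s, type = "width")),
                 integer(1))
widths
```

With this fix-up the fourth element comes out as 0 instead of -Inf, and the other elements are unchanged.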