Hi, first time poster here. During my time using R, I have always found string concatenation to be (what I feel is) unnecessarily complicated by requiring the use of the paste() or similar commands.

When searching for how to concatenate strings in R, several top search results show answers that say to write your own function or override the '+' operator.

Sample code like the following from this
<http://stackoverflow.com/questions/4730551/making-a-string-concatenation-operator-in-r>
page

    "+" = function(x,y) {
      if(is.character(x) & is.character(y)) {
        return(paste(x , y, sep=""))
      } else {
        .Primitive("+")(x,y)
      }}

An old (2005) post
<https://stat.ethz.ch/pipermail/r-help/2005-February/066709.html> on r-help mentioned possible performance reasons as to why this type of string concatenation is not supported out of the box but did not go into detail. Can someone explain why such a basic task as this must be handled by paste() instead of just using the '+' operator directly? Would performance degrade much today if the '+' form of string concatenation were added into R by default?

Josh Bradley
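For readers who want the shorthand without masking the base `+`, a custom infix operator is one less invasive option. The following is only a minimal sketch: the operator name `%+%` is an assumption chosen for illustration (it is not part of base R), and it deliberately refuses non-character input rather than falling back to arithmetic.

```r
# Hypothetical infix operator for string concatenation.
# The name `%+%` is an illustrative assumption, not a base R function.
`%+%` <- function(x, y) {
  # Refuse anything that is not a character vector on both sides.
  stopifnot(is.character(x), is.character(y))
  paste0(x, y)
}

greeting <- "Hello, " %+% "world"
print(greeting)

# Base arithmetic is untouched because `+` itself was never redefined.
total <- 1 + 2
print(total)
```

Because `+` is left alone, code in other packages and scripts that relies on the usual arithmetic behaviour is unaffected.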
On Tue, Jun 16, 2015 at 6:32 PM, Joshua Bradley <jgbradley1 at gmail.com> wrote:
[...]
> An old (2005) post
> <https://stat.ethz.ch/pipermail/r-help/2005-February/066709.html> on r-help
> mentioned possible performance reasons as to why this type of string
> concatenation is not supported out of the box but did not go into detail.
> Can someone explain why such a basic task as this must be handled by
> paste() instead of just using the '+' operator directly?

Well, R-core's reason was in that email thread, quoting:

"The issue is that only coercion between numeric (broad sense, including complex) types is supported for the arithmetical operators, presumably to avoid the ambiguity of things like

    x <- 123.45
    y <- as.character(1)
    x + y

Should that be 124.45 or "123.451"? One of the difficulties of any dispatch on two arguments is how to do the best matching on two classes, especially with symmetric operators like "+". Internally R favours simple fast rules."

Personally, I am not really convinced by this, because what currently happens is this:

    1 + "1"
    #> Error in 1 + "1" : non-numeric argument to binary operator
    "1" + 1
    #> Error in "1" + 1 : non-numeric argument to binary operator

which is perfectly fine behavior, and it could stay the same with a '+' string concatenation operator, i.e.:

- if both arguments are characters, call paste(),
- otherwise go on and do whatever is being done right now.

In other words, coercion to string is not important in the '+' operator.

> Would performance
> degrade much today if the '+' form of string concatenation were added into
> R by default?

Personally, I highly doubt it, but I don't have a benchmark to back this up.

Gabor

[...]
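The two-branch rule described in this reply can be sketched as an ordinary function. This is an illustration under stated assumptions, not how R-core would implement it: the name `plus` is hypothetical, and the check is written at the R level, whereas a real change would live inside the primitive, so this sketch says nothing about performance.

```r
# Hypothetical function illustrating the proposed rule; `plus` is an assumed name.
plus <- function(x, y) {
  if (is.character(x) && is.character(y)) {
    paste0(x, y)  # both operands are character: concatenate
  } else {
    x + y         # anything else: existing behaviour, including the existing error
  }
}

both_chars <- plus("a", "b")
both_nums  <- plus(1, 2)

# Mixed operands still fail exactly as they do today, because base `+` is called.
mixed <- tryCatch(plus(1, "1"), error = function(e) e)

print(both_chars)
print(both_nums)
print(inherits(mixed, "error"))
```

No coercion between numbers and strings happens in either branch, which is the point made above: the ambiguity quoted from the 2005 thread only arises if mixed operands are coerced.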
Hi Joshua,

On 06/16/2015 03:32 PM, Joshua Bradley wrote:
> Hi, first time poster here. During my time using R, I have always found
> string concatenation to be (what I feel is) unnecessarily complicated by
> requiring the use of the paste() or similar commands.
>
> When searching for how to concatenate strings in R, several top search
> results show answers that say to write your own function or override the
> '+' operator.
>
> Sample code like the following from this
> <http://stackoverflow.com/questions/4730551/making-a-string-concatenation-operator-in-r>
> page
>
> "+" = function(x,y) {
>   if(is.character(x) & is.character(y)) {
>     return(paste(x , y, sep=""))
>   } else {
>     .Primitive("+")(x,y)
>   }}

Note that paste0() is a more convenient and more efficient way to concatenate strings:

    paste0(x, y)  # no need to specify 'sep', no separator is inserted

Related to this, one thing that has always bothered me is the different/inconsistent recycling schemes used by different binary operations in R:

    > 1:3 + integer(0)
    integer(0)
    > c("a", "b", "c") >= character(0)
    logical(0)
    > paste0(c("a", "b", "c"), character(0))
    [1] "a" "b" "c"
    > mapply(paste0, c("a", "b", "c"), character(0))
    Error in mapply(paste0, c("a", "b", "c"), character(0)) :
      zero-length inputs cannot be mixed with those of non-zero length

If I was to override `+` to concatenate strings, I would make it stick to the recycling scheme used by arithmetic and comparison operators (which is the most sensible of all IMO).

H.

-- 
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
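A concatenation helper that follows the arithmetic recycling scheme, as suggested in this message, might look like the sketch below. The function name `concat_arith` is hypothetical, and only two aspects of the arithmetic rules are reproduced (zero-length operands and the length-mismatch warning). Attribute handling such as names is not addressed.

```r
# Hypothetical helper: string concatenation using arithmetic-style recycling.
concat_arith <- function(x, y) {
  stopifnot(is.character(x), is.character(y))
  nx <- length(x)
  ny <- length(y)
  # Arithmetic rule: a zero-length operand yields a zero-length result.
  if (nx == 0L || ny == 0L) {
    return(character(0))
  }
  # Arithmetic rule: warn when the longer length is not a multiple of the shorter.
  if (max(nx, ny) %% min(nx, ny) != 0L) {
    warning("longer object length is not a multiple of shorter object length")
  }
  paste0(x, y)  # paste0 recycles the shorter operand to the longer length
}

empty_case  <- concat_arith(c("a", "b", "c"), character(0))
scalar_case <- concat_arith(c("a", "b", "c"), "x")

print(length(empty_case))
print(scalar_case)
```

With this helper the zero-length case matches `1:3 + integer(0)` rather than `paste0()`. R 4.0.1 and later also provide a `recycle0` argument to `paste()` and `paste0()` that gives the zero-length behaviour without a wrapper.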
On Tue, Jun 16, 2015 at 8:24 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
[...]
> If I was to override `+` to concatenate strings, I would make it stick
> to the recycling scheme used by arithmetic and comparison operators
> (which is the most sensible of all IMO).

Yeah, I agree, paste's recycling rules are sometimes painful. This could be "fixed" with a nice new '+' concatenation operator, too. :)

Gabor

> H.
[...]
On Jun 16, 2015 3:44 PM, "Joshua Bradley" <jgbradley1 at gmail.com> wrote:
> Hi, first time poster here. During my time using R, I have always found
> string concatenation to be (what I feel is) unnecessarily complicated by
> requiring the use of the paste() or similar commands.

I don't follow. In what sense is paste complicated to use? Not in the sense of its actual behavior, since what you propose below has identical behavior. So is your objection simply the number of characters one must type?

I would argue that having a separate verb makes code much more readable, particularly at a quick glance. I know a character will come out of paste no matter what goes in. That is not without value from a code maintenance perspective. IMHO.

~G
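The maintenance point made here, that the result of paste is always a character vector whatever the inputs are, can be checked with a short sketch. The particular inputs below are arbitrary choices for illustration.

```r
# A mix of input types: integer, logical, NULL, NA, double, character.
inputs <- list(1:3, TRUE, NULL, NA, 2.5, "a")

# Prefix each input with "x" using paste0 and collect the results.
results <- lapply(inputs, function(v) paste0("x", v))

# Every result is a character vector, regardless of what went in.
always_character <- all(vapply(results, is.character, logical(1)))
print(always_character)
```

By contrast, a `+` that concatenates only when both operands are character returns a type that depends on its inputs, so a reader has to know the operand types to know what comes out.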
Bad choice of words, I'm afraid. What I'm ultimately pushing for is a feature request: to allow string concatenation with '+' by default. Sure, I can write my own string addition function (like the example I posted previously), but I use it so often that I end up putting it in every script I write. It is ultimately a matter of readability and syntactic sugar, I guess. As an example, I work in the bioinformatics domain and write R scripts for pipelines with calls to various programs that require a lot of parameters to be set/varied. Seeing "paste" everywhere detracts from reading the code (in my opinion).

This may not be a very strong argument, but to give a somewhat more objective reason, I claim it's more readable/intuitive because other big languages have also picked up this convention (C++, Java, JavaScript, Python, etc.).

Josh Bradley
Graduate Student
University of Maryland

On Tue, Jun 16, 2015 at 11:00 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
> On Jun 16, 2015 3:44 PM, "Joshua Bradley" <jgbradley1 at gmail.com> wrote:
> >
> > Hi, first time poster here. During my time using R, I have always found
> > string concatenation to be (what I feel is) unnecessarily complicated by
> > requiring the use of the paste() or similar commands.
>
> I don't follow. In what sense is paste complicated to use? Not in the
> sense of its actual behavior, since what you propose below has identical
> behavior. So is your objection simply the number of characters one must
> type?
>
> I would argue that having a separate verb makes code much more readable,
> particularly at a quick glance. I know a character will come out of paste
> no matter what goes in. That is not without value from a code maintenance
> perspective. IMHO.
>
> ~G