Joris Meys
2017-Jun-14 12:18 UTC
[Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Mark, that's actually a fair statement, although your extra operator doesn't cause construction at parse time. You still call paste0(), but just add an extra layer on top of it.

I also doubt that even in gigantic loops the benefit is going to be significant. Take the following example:

    library(compiler)  # provides cmpfun()

    atestfun <- function(x){
      y <- paste0("a very long",
                  "string for testing")
      grep(x, y)
    }
    atestfun2 <- function(x){
      y <- "a very long
            string for testing"
      grep(x,y)
    }
    cfun <- cmpfun(atestfun)
    cfun2 <- cmpfun(atestfun2)

    require(rbenchmark)
    benchmark(atestfun("a"),
              atestfun2("a"),
              cfun("a"),
              cfun2("a"),
              replications = 100000)

Which gives after 100,000 replications:

                test replications elapsed relative
    1  atestfun("a")       100000    0.83    1.339
    2 atestfun2("a")       100000    0.62    1.000
    3      cfun("a")       100000    0.81    1.306
    4     cfun2("a")       100000    0.62    1.000

The patch can in principle make similar code marginally faster, but I'm not convinced it is going to make any real difference except in some very specific and exotic cases. What's more, calling a function like the examples inside the loop is the only way I can come up with where this might be a problem. If you just construct the string inside the loop, there are two possibilities:

- the string does not need to change, and then you had better construct it outside of the loop
- the string does need to change, and then you need paste() or paste0() anyway

I'm not against incorporating the patch, as it would eliminate a few keystrokes. It's a neat idea, but I don't expect any other noticeable advantage from it.

my humble 2 cents
Cheers
Joris

On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo <mark.vanderloo at gmail.com> wrote:

> Having some line-breaking character for string literals would have benefits
> as string literals can then be constructed parse-time rather than run-time.
> I have run into this myself a few times as well. One way to at least
> emulate something like that is the following.
>
> `%+%` <- function(x,y) paste0(x,y)
>
> "hello" %+%
>   " pretty" %+%
>   " world"
>
> -Mark
>
> On Wed 14 Jun 2017 at 13:53, Andreas Kersting <r-devel at akersting.de> wrote:
>
> > On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch
> > <murdoch.duncan at gmail.com> wrote:
> >
> > > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
> > > > Hi,
> > > >
> > > > I would really like to have a way to split long string literals across
> > > > multiple lines in R.
> > >
> > > I don't understand why you require the string to be a literal. Why not
> > > construct the long string in an expression like
> > >
> > >   paste0("aaa",
> > >          "bbb")
> > >
> > > ? Surely the execution time of the paste0 call is negligible.
> > >
> > > Duncan Murdoch
> >
> > Actually "execution time" is precisely one of the reasons why I would like
> > to see this feature as - depending on the context (e.g. in a tight loop) -
> > the execution time of paste0 (or probably also glue, thanks Gabor) is not
> > necessarily insignificant.
> >
> > The other reason is style: I think it is cleaner if we can construct such
> > a long string literal without the need for a function call.
> >
> > Andreas
> >
> > > > Currently, if a string literal spans multiple lines, there is no way to
> > > > inhibit the introduction of newline characters:
> > > >
> > > > > "aaa
> > > > + bbb"
> > > > [1] "aaa\nbbb"
> > > >
> > > > If a line ends with a backslash, it is just ignored:
> > > >
> > > > > "aaa\
> > > > + bbb"
> > > > [1] "aaa\nbbb"
> > > >
> > > > We could use this fact to implement string splitting in a fairly
> > > > backward-compatible way, since currently such trailing backslashes
> > > > should hardly be used as they do not have any effect. The attached patch
> > > > makes the parser ignore a newline character directly following a
> > > > backslash:
> > > >
> > > > > "aaa\
> > > > + bbb"
> > > > [1] "aaabbb"
> > > >
> > > > I personally would also prefer if leading blanks (spaces and tabs) in
> > > > the second line are ignored to allow for proper indentation:
> > > >
> > > > > "aaa \
> > > > +    bbb"
> > > > [1] "aaa bbb"
> > > >
> > > > > "aaa\
> > > > +    \ bbb"
> > > > [1] "aaa bbb"
> > > >
> > > > This is also implemented by this patch.
> > > >
> > > > An alternative approach could be to have something like
> > > >
> > > > ("aaa "
> > > > "bbb")
> > > >
> > > > or
> > > >
> > > > ("aaa ",
> > > > "bbb")
> > > >
> > > > be interpreted as "aaa bbb".
> > > >
> > > > I don't know the ins and outs of the parser of R (hence: please very
> > > > carefully review the attached patch), but I guess this would be more
> > > > work to implement!?
> > > >
> > > > What do you think? Is there anybody else who is missing this feature in
> > > > the first place?
> > > >
> > > > Regards,
> > > > Andreas
> > > >
> > > > ______________________________________________
> > > > R-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel : +32 (0)9 264 61 79
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
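As a rough illustration of the points in this message, the sketch below uses base R only. It first checks the current behaviour of a string literal that spans two lines, then shows the two loop cases: a constant string built once before the loop, and a string that changes per iteration and so needs paste0() either way. The variable names (multi, msg, hits, labels) are made up for the example and are not from the thread.

```r
# Current behaviour: a literal spanning two lines keeps the newline.
multi <- "aaa
bbb"
print(identical(multi, "aaa\nbbb"))

# Case 1: the string never changes, so build it once outside the loop.
msg <- paste0("a very long ",
              "string for testing")
hits <- integer(0)
for (pattern in c("a", "z")) {
  # grep() returns the indices of matching elements; count them
  hits <- c(hits, length(grep(pattern, msg)))
}

# Case 2: the string depends on the loop variable, so a call such as
# paste0() is needed on every iteration regardless of any parser change.
labels <- character(3)
for (i in 1:3) {
  labels[i] <- paste0("item ", i)
}

print(hits)
print(labels)
```

In case 1 the cost of paste0() is paid once, so a parse-time literal would save nothing measurable; in case 2 the proposed syntax does not apply at all.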
Mark van der Loo
2017-Jun-14 12:23 UTC
[Rd] [WISH / PATCH] possibility to split string literals across multiple lines
I know it doesn't cause construction at parse time, and that was also not what I said. What I meant was that it makes the syntax at least look a little as if you have a line-breaking character within string literals.

On Wed 14 Jun 2017 at 14:18, Joris Meys <jorismeys at gmail.com> wrote:

> Mark, that's actually a fair statement, although your extra operator
> doesn't cause construction at parse time. You still call paste0(), but just
> add an extra layer on top of it.
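The distinction discussed here can be checked directly in base R. The sketch below is only an illustration: it quotes both forms without evaluating them and inspects what the parser produced. The names parsed_op and parsed_lit are hypothetical; `%+%` is the operator from earlier in the thread.

```r
# The emulation operator from the thread
`%+%` <- function(x, y) paste0(x, y)

# quote() returns the parsed expression without evaluating it.
parsed_op  <- quote("hello" %+%
                      " pretty" %+%
                      " world")
parsed_lit <- quote("hello pretty world")

# The operator form is an unevaluated call: the string only exists
# after evaluation, i.e. at run time.
print(is.call(parsed_op))

# The plain literal is already a character constant after parsing.
print(is.character(parsed_lit))

# Evaluating the call yields the same string as the literal.
print(identical(eval(parsed_op), parsed_lit))
```

So the operator gives the look of a line-continued literal, while the work of joining the pieces still happens each time the expression is evaluated.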
Joris Meys
2017-Jun-14 12:35 UTC
[Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Hi Mark,

I got you. I just pointed out the obvious to illustrate why your emulation didn't eliminate the need for the real thing. I didn't mean to imply you weren't aware of this, even though it may have seemed so. Sometimes I'm not 100% aware of the subtleties of the English language. This seems to be one of those cases.

Kind regards
Joris

On Wed, Jun 14, 2017 at 2:23 PM, Mark van der Loo <mark.vanderloo at gmail.com> wrote:

> I know it doesn't cause construction at parse time, and it was also not
> what I said. What I meant was that it makes the syntax at least look a
> little as if you have a line-breaking character within string literals.