Proposed patch (I think .txt files work OK as attachments to the list?)

On 2019-04-04 2:21 a.m., Martin Maechler wrote:
>>>>>> Ben Bolker
>>>>>>     on Fri, 29 Mar 2019 12:34:50 -0400 writes:
>
>     > I suspect that the issue is addressed (obliquely) in the examples,
>     > which show that variables with spaces in them (or otherwise
>     > 'non-syntactic', i.e. not satisfying the constraints of legal R symbols)
>     > can be handled by protecting them with backticks (``)
>
>     > ## using non-syntactic names:
>     > reformulate(c("`P/E`", "`% Growth`"), response = as.name("+-"))
>
>     > It seems to me there could be room for a *documentation* patch (stating
>     > explicitly that if termlabels has length > 1 its elements are
>     > concatenated with "+", and explicitly stating that non-syntactic names
>     > must be protected with back-ticks). (There is a little bit of obscurity
>     > in the fact that the elements of termlabels don't have to be
>     > syntactically valid names: many will be included in formulas if they can
>     > be interpreted as *parseable* expressions, e.g. reformulate("x<2"))
>
>     > I would be happy to give it a shot if the consensus is that it would
>     > be worthwhile.
>
> I think it would be worthwhile to add to the docs a bit.
>
> [With currently just your and my vote, we have a 100% consensus ;-)]
>
> Martin
>
>     > One workaround to the OP's problem is below (may be worth including
>     > as an example in docs)
>
>     >> z <- c("a variable","another variable")
>     >> reformulate(z)
>     > Error in parse(text = termtext, keep.source = FALSE) :
>     > <text>:1:6: unexpected symbol
>     > 1: ~ a variable
>     >      ^
>     >> reformulate(sprintf("`%s`",z))
>     > ~`a variable` + `another variable`
>
>     > On 2019-03-29 11:54 a.m., J C Nash wrote:
>     >> The main thing is to post the "small reproducible example".
>     >>
>     >> My (rather long term experience) can be written
>     >>
>     >>    if (exists("reproducible example") ) {
>     >>       DeveloperFixHappens()
>     >>    } else {
>     >>       NULL
>     >>    }
>     >>
>     >> JN
>     >>
>     >> On 2019-03-29 11:38 a.m., Saren Tasciyan wrote:
>     >>> Well, first I can't sign in bugzilla myself, that is why I wrote here first. Also, I don't know if I have the time at
>     >>> the moment to provide tests, multiple examples or more. If that is not ok or welcomed, that is fine, I can come back,
>     >>> whenever I have more time to properly report the bug.
>     >>>
>     >>> I didn't find the existing bug report, sorry for that.
>     >>>
>     >>> Yes, it is related. My problem was that I have column names with spaces and current solution doesn't solve it. I have a
>     >>> solution, which works for me and maybe also for others.
>     >>>
>     >>> Either someone can register me to bugzilla or I can post it here, which could give some direction to developers. I
>     >>> don't mind whichever is preferred here.
>     >>>
>     >>> Best,
>     >>>
>     >>> Saren
>     >>>
>     >>>
>     >>> On 29.03.19 09:29, Martin Maechler wrote:
>     >>>>>>>>> Saren Tasciyan
>     >>>>>>>>>     on Thu, 28 Mar 2019 17:02:10 +0100 writes:
>     >>>>      > Hi,
>     >>>>      > I have found a bug in reformulate function and have a solution for it. I
>     >>>>      > was wondering, where I can submit it?
>     >>>>
>     >>>>      > Best,
>     >>>>      > Saren
>     >>>>
>     >>>>
>     >>>> Well, you could have given a small reproducible example
>     >>>> depicting the bug, notably when posting here:
>     >>>> Just a prose text with no R code or other technical content is
>     >>>> almost always not really appropriate for the R-devel mailing list.
>     >>>>
>     >>>> Further, in such a case you should google a bit and hopefully
>     >>>> have found
>     >>>>         https://www.r-project.org/bugs.html
>     >>>>
>     >>>> which also mentions reproducibility (and many more useful things).
>     >>>>
>     >>>> Then it also tells you about R's bug repository, also called
>     >>>> "R's bugzilla" at https://bugs.r-project.org/
>     >>>>
>     >>>> and if you are diligent (but here, I'd say bugzilla is
>     >>>> (configured?) far from ideal), you'd also find bug PR#17359
>     >>>>
>     >>>>     https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359
>     >>>>
>     >>>> which was reported already in Nov 2017 .. and only fixed
>     >>>> yesterday (in the "cleanup old bugs" process that happens
>     >>>> often before the big new spring release of R).
>     >>>>
>     >>>> So is your bug the same as that one?
>     >>>>
>     >>>> Martin
>     >>>>
>     >>>>      > --
>     >>>>      > Saren Tasciyan
>     >>>>      > /PhD Student / Sixt Group/
>     >>>>      > Institute of Science and Technology Austria
>     >>>>      > Am Campus 1
>     >>>>      > 3400 Klosterneuburg, Austria
>     >>>>
>     >>>>      > ______________________________________________
>     >>>>      > R-devel at r-project.org mailing list
>     >>>>      > https://stat.ethz.ch/mailman/listinfo/r-devel

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: reformulate_diff.txt
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20190404/34393c52/attachment.txt>
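For reference, the backtick workaround quoted in this message can be run as a small self-contained script. This is only an illustrative sketch: the vector z comes from the example above, while the tryCatch() wrapper, the second formula and the names msg, f and g are additions made here so that the script runs to the end instead of stopping at the error.

```r
# Column names containing spaces are not syntactic R names.
z <- c("a variable", "another variable")

# Unprotected labels cannot be parsed, so reformulate() signals an error;
# tryCatch() captures the message instead of aborting the script.
msg <- tryCatch(reformulate(z), error = function(e) conditionMessage(e))

# Protecting each label with backticks gives a valid one-sided formula.
f <- reformulate(sprintf("`%s`", z))
print(f)

# Labels only have to be parseable expressions, and several labels
# are joined with "+" (here with a response on the left-hand side).
g <- reformulate(c("x", "log(y)"), response = "w")
print(g)
```

Note that because the labels are simply pasted together with "+" and then parsed, operator precedence applies to the combined text, so a label such as "x<2" is best wrapped in parentheses or I() when it is combined with other labels.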
Martin Maechler
2019-Apr-05 07:38 UTC
[Rd] Bug in the "reformulate" function in stats package
>>>>> Ben Bolker
>>>>>     on Thu, 4 Apr 2019 12:46:37 -0400 writes:

    > Proposed patch

Thank you Ben!

[the rest is technical nit-picking .. but hopefully interesting
 to the smart R-devel reader base:]

There was a very subtle thinko in your patch which is not easily
diagnosed from R's parse_Rd():

  Error in parse_Rd("/u/maechler/R/D/r-devel/R/src/library/stats/man/delete.response.Rd", :
    Unexpected end of input (in " quoted string opened at delete.response.Rd:78:63)
  In addition: Warning message:
  In parse_Rd("/u/maechler/R/D/r-devel/R/src/library/stats/man/delete.response.Rd", :
    newline within quoted string at delete.response.Rd:74

and even I needed more than a minute to find out that the
culprit was that

     reformulate(sprintf("`%s`", x))

is not ok in *.Rd and must be

     reformulate(sprintf("`\%s`", x))

---------

    > (I think .txt files work OK as attachments to the list?)

yes, typically -- what really counts is if your e-mail program
marks them with MIME-type 'text/plain'
and most e-mail programs are very "silly" / "safe" nowadays and
don't expect to have smart users and hence mark (and sometimes
encode) everything unknown as non-text.

Using a very old flexible e-mail interface such as Emacs VM allows
you to specify the MIME-type in addition to the file *and* it
also proposes smart defaults, I think by using something like
unix 'file' to determine that your 'foo.diff' file is plain text.
{{ .. and we all know that Windows is sillily using file extensions
   to determine file type and only knows Windows-extensions plus
   those added explicitly by software installed; so nowadays *.rda
   is marked as an Rstudio file ... [argh].
}}

Martin
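The parse_Rd() failure described in this message comes from the fact that "%" starts a comment in Rd files, so everything after an unescaped "%" on that line, including the closing quote, is dropped. A minimal sketch of the escaped form follows; the Rd contents and the names rd_lines, rd_file and rd are made up for illustration and are not taken from delete.response.Rd.

```r
# A tiny, made-up Rd file whose example uses the escaped form "\%s".
# In the R strings below, "\\" stands for a single backslash in the file.
rd_lines <- c(
  "\\name{demo}",
  "\\title{Demo of escaping the percent sign}",
  "\\examples{",
  "x <- c(\"a variable\", \"another variable\")",
  "reformulate(sprintf(\"`\\%s`\", x))",
  "}"
)

rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# With "\%" the string is closed on the same line, so parsing succeeds.
rd <- tools::parse_Rd(rd_file)
print(class(rd))
```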
Saren Tasciyan
2019-Apr-18 11:30 UTC
[Rd] Bug in the "reformulate" function in stats package
Hi,

Sorry for writing this late, I was very busy. I started this discussion
here. I wish I could write to bugs.r-project.org, but I don't have an
account, so I will write here instead.

Meanwhile, I solved my problem with a simpler fix (please see the attached
file). This requires that term labels are not "ticked". I think this is
better, since it is easier to have column names unticked. The new
development function is IMO unnecessarily complicated. It requires strings
to be ticked or wrapped in as.name(). It is more intuitive to have a vector
of column names.

Best,

Saren

--
Saren Tasciyan
/PhD Student / Sixt Group/
Institute of Science and Technology Austria
Am Campus 1
3400 Klosterneuburg, Austria
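The idea of accepting plain, unticked column names can be sketched as a thin wrapper around reformulate(). This is not the attached fix, whose contents are not shown here; reformulate_names is a hypothetical helper, written only for illustration, that backtick-quotes just those labels that need it.

```r
# Hypothetical helper (not the attached fix): take plain column names
# and quote the non-syntactic ones before calling reformulate().
reformulate_names <- function(colnames, response = NULL, intercept = TRUE) {
  # deparse(as.name(.), backtick = TRUE) adds backticks only where needed.
  quoted <- vapply(colnames,
                   function(nm) deparse(as.name(nm), backtick = TRUE),
                   character(1), USE.NAMES = FALSE)
  # A character response is also treated as a plain column name.
  if (is.character(response))
    response <- as.name(response)
  reformulate(quoted, response = response, intercept = intercept)
}

f <- reformulate_names(c("a variable", "x"), response = "my response")
print(f)
```

The trade-off is that every label is then treated as a column name, so expression labels such as "log(x)" or "a:b", which reformulate() itself accepts, would be quoted as names rather than parsed.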