I recently had a long argument wrt the survival package, namely that the following code didn't do what they expected, and so they reported it as a bug ? survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst), data=lung) a. The Google R style guide? recommends that one put :: everywhere b. This breaks the recognition of cluster as a "special" in the terms function. I've been stubborn and said that their misunderstanding of how formulas work is not my problem.?? But I'm sure that the issue will come up again, and multiple other packages will break. A big problem is that the code runs, it just gives the wrong answer. Suggestions? Terry T. [[alternative HTML version deleted]]
I mean if the person filing the bug regards style as more important than the truth of how R treats formulas then they?re literally talking in another language. I strongly recommend you do nothing or at most make a note in the documentation addressing this. Your time is too valuable. On Tue, 25 Feb 2020 at 12:56 am, Therneau, Terry M., Ph.D. via R-devel < r-devel at r-project.org> wrote:> I recently had a long argument wrt the survival package, namely that the > following code > didn't do what they expected, and so they reported it as a bug > > survival::coxph( survival::Surv(time, status) ~ age + sex + > survival::strata(inst), > data=lung) > > a. The Google R style guide recommends that one put :: everywhere > b. This breaks the recognition of cluster as a "special" in the terms > function. > > I've been stubborn and said that their misunderstanding of how formulas > work is not my > problem. But I'm sure that the issue will come up again, and multiple > other packages > will break. > > A big problem is that the code runs, it just gives the wrong answer. > > Suggestions? > > Terry T. > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]
In the long run, coming up with a way to parse specials in formulas that is both clean and robust is a good idea - annoying users are a little bit like CRAN maintainers in this respect. I think I would probably do this by testing identical(eval(extracted_head), survival::Surv) - but this has lots of potential annoyances (what if extracted_head is a symbol that can't be found in any attached environment? Do we have to start with if (length(find(deparse(extracted_head))>0) ? In the short run, a clear note in the documentation seems entirely sufficient. On Mon, Feb 24, 2020 at 12:01 PM Hugh Parsonage <hugh.parsonage at gmail.com> wrote:> > I mean if the person filing the bug regards style as more important than > the truth of how R treats formulas then they?re literally talking in > another language. > > I strongly recommend you do nothing or at most make a note in the > documentation addressing this. Your time is too valuable. > > On Tue, 25 Feb 2020 at 12:56 am, Therneau, Terry M., Ph.D. via R-devel < > r-devel at r-project.org> wrote: > > > I recently had a long argument wrt the survival package, namely that the > > following code > > didn't do what they expected, and so they reported it as a bug > > > > survival::coxph( survival::Surv(time, status) ~ age + sex + > > survival::strata(inst), > > data=lung) > > > > a. The Google R style guide recommends that one put :: everywhere > > b. This breaks the recognition of cluster as a "special" in the terms > > function. > > > > I've been stubborn and said that their misunderstanding of how formulas > > work is not my > > problem. But I'm sure that the issue will come up again, and multiple > > other packages > > will break. > > > > A big problem is that the code runs, it just gives the wrong answer. > > > > Suggestions? > > > > Terry T. > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:> I recently had a long argument wrt the survival package, namely that the following code > didn't do what they expected, and so they reported it as a bug > > ? survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst), > data=lung) > > a. The Google R style guide? recommends that one put :: everywhere > b. This breaks the recognition of cluster as a "special" in the terms function. > > I've been stubborn and said that their misunderstanding of how formulas work is not my > problem.?? But I'm sure that the issue will come up again, and multiple other packages > will break. > > A big problem is that the code runs, it just gives the wrong answer. > > Suggestions?I don't know how widely used survival::strata is versus the special strata (or cluster, or other specials). If you were just introducing this now, I'd try to make sure that only one of those worked: don't have any functions matching the names of specials, or have functions that generate an error if you call them. I did that in the much less widely used "tables" package, e.g. Heading() has special interpretation, and the Heading function is defined as Heading <- function(name = NULL, override = TRUE, character.only = FALSE, nearData = TRUE) stop("This is a pseudo-function, not meant to be called.") However, survival has far more users than tables does, so changing the name of your special functions or the corresponding regular functions could be a huge headache. Perhaps there's a way to set a flag before evaluating the function in the formula, and generate a warning if survival::strata is called when it looks like the special function is intended. Duncan Murdoch
Notice that the stats package contains the same issue: For some reason it defines an offset() function (for no particular reason, afaics) which just returns its argument. So> x <- rnorm(10) > y <- z <- 1:10 > lm(x~y+offset(z))Call: lm(formula = x ~ y + offset(z)) Coefficients: (Intercept) y 0.8253 -1.0840> lm(x~y+stats::offset(z))Call: lm(formula = x ~ y + stats::offset(z)) Coefficients: (Intercept) y stats::offset(z) 0.82531 -0.08397 NA So I'm inclined to say that formulas are formulas and functions using formulas interpret functions and operators at their own convenience. You also deserve what you get from> lm(x~base::`+`(y,z))Call: lm(formula = x ~ base::`+`(y, z)) Coefficients: (Intercept) base::`+`(y, z) 0.82531 -0.04198 -pd> On 24 Feb 2020, at 19:21 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote: >> I recently had a long argument wrt the survival package, namely that the following code >> didn't do what they expected, and so they reported it as a bug >> survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst), >> data=lung) >> a. The Google R style guide recommends that one put :: everywhere >> b. This breaks the recognition of cluster as a "special" in the terms function. >> I've been stubborn and said that their misunderstanding of how formulas work is not my >> problem. But I'm sure that the issue will come up again, and multiple other packages >> will break. >> A big problem is that the code runs, it just gives the wrong answer. >> Suggestions? > > I don't know how widely used survival::strata is versus the special strata (or cluster, or other specials). If you were just introducing this now, I'd try to make sure that only one of those worked: don't have any functions matching the names of specials, or have functions that generate an error if you call them. I did that in the much less widely used "tables" package, e.g. Heading() has special interpretation, and the Heading function is defined as > > Heading <- function(name = NULL, override = TRUE, > character.only = FALSE, > nearData = TRUE) > stop("This is a pseudo-function, not meant to be called.") > > However, survival has far more users than tables does, so changing the name of your special functions or the corresponding regular functions could be a huge headache. > > Perhaps there's a way to set a flag before evaluating the function in the formula, and generate a warning if survival::strata is called when it looks like the special function is intended. > > Duncan Murdoch > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Terry, speaking as a package author I would say that the package is the primary unit of organisation of R functionality, and package considerations should trump R style considerations. Packages should be self-contained as far as possible. Having said that, many of my own packages use---shall we say---distinct idiom which is easy to misunderstand. My suggestion would be to document the misunderstanding. Add the survival::coxph() expression you quote above to coxph.Rd, maybe under a \warning{} section, explaining both a reasonable but wrong, and the correct way, to parse such constructions. Best wishes Robin On Tue, Feb 25, 2020 at 2:56 AM Therneau, Terry M., Ph.D. via R-devel < r-devel at r-project.org> wrote:> I recently had a long argument wrt the survival package, namely that the > following code > didn't do what they expected, and so they reported it as a bug > > survival::coxph( survival::Surv(time, status) ~ age + sex + > survival::strata(inst), > data=lung) > > a. The Google R style guide recommends that one put :: everywhere > b. This breaks the recognition of cluster as a "special" in the terms > function. > > I've been stubborn and said that their misunderstanding of how formulas > work is not my > problem. But I'm sure that the issue will come up again, and multiple > other packages > will break. > > A big problem is that the code runs, it just gives the wrong answer. > > Suggestions? > > Terry T. > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]
Thanks to all who commented.?? In some defense of the person who reported the "bug", it appeared to be a company policy from above, influenced by the fact that they had been often burned by not using :: when multiple packages use the same symbol. Some further technical detail:? coxph has 3 specials: strata, cluster, and tt.? For the last of these I took a different route, and did not make it an exported symbol in the survival package. (I try to be smarter over time).? Instead, coxph first looks at the formula.? If that formula contains tt() then it creates an envionment that contains the function? "tt <- funciton(x) x", whose parent is the environment of the formula, changes the formula's environment to this, and then proceeds as usual.??? As a curiousity, I'm going to change cluster to the same form, and them run my script that executes R CMD? check on every package that has surivval as depend/import/suggest and see how many break.?? I'm guessing I'll find several that used cluster for their own ends. The strata function will be worse. Duncan -- to use your approach, I'd instead hack the formula before calling model.frame, so that it does not try to call cluster? Another approach I tried was not exporting cluster(), but that fails when model.frame tries to call it. Terry On 2/24/20 12:21 PM, Duncan Murdoch wrote:> On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote: >> I recently had a long argument wrt the survival package, namely that the following code >> didn't do what they expected, and so they reported it as a bug >> >> ? ? survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst), >> data=lung) >> >> a. The Google R style guide? recommends that one put :: everywhere >> b. This breaks the recognition of cluster as a "special" in the terms function. >> >> I've been stubborn and said that their misunderstanding of how formulas work is not my >> problem.?? But I'm sure that the issue will come up again, and multiple other packages >> will break. >> >> A big problem is that the code runs, it just gives the wrong answer. >> >> Suggestions? > > I don't know how widely used survival::strata is versus the special strata (or cluster, > or other specials).? If you were just introducing this now, I'd try to make sure that > only one of those worked:? don't have any functions matching the names of specials, or > have functions that generate an error if you call them.? I did that in the much less > widely used "tables" package, e.g. Heading() has special interpretation, and the Heading > function is defined as > > Heading <- function(name = NULL, override = TRUE, > ??????????????????? character.only = FALSE, > ??????????? nearData = TRUE) > ??? stop("This is a pseudo-function, not meant to be called.") > > However, survival has far more users than tables does, so changing the name of your > special functions or the corresponding regular functions could be a huge headache. > > Perhaps there's a way to set a flag before evaluating the function in the formula, and > generate a warning if survival::strata is called when it looks like the special function > is intended. > > Duncan Murdoch[[alternative HTML version deleted]]