Gabriel Becker
2023-Mar-02 22:37 UTC
[Rd] transform.data.frame() ignores unnamed arguments when no named argument is provided
On Thu, Mar 2, 2023 at 2:02 PM Antoine Fabri <antoine.fabri at gmail.com> wrote:

> Thanks and good point about unspecified behavior. The way it behaves now
> (when it doesn't ignore) is more consistent with data.frame() though so I
> prefer that to a "warn and ignore" behaviour:
>
> data.frame(a = 1, b = 2, 3)
> #>   a b X3
> #> 1 1 2  3
>
> data.frame(a = 1, 2, 3)
> #>   a X2 X3
> #> 1 1  2  3
>
> (and in general warnings make for unpleasant debugging so I prefer when we
> don't add new ones if avoidable)

I find silence to be much more unpleasant in practice when debugging, myself, but that may be a personal preference.

> playing a bit more with it, it would make sense to me that the following
> have the same output:
>
> coefficient <- 3
>
> data.frame(value1 = 5) |> transform(coefficient, value2 = coefficient * value1)
> #>   value1 X3 value2
> #> 1      5  3     15
>
> data.frame(value1 = 5, coefficient) |> transform(value2 = coefficient * value1)
> #>   value1 coefficient value2
> #> 1      5           3     15

I'm not so sure. data.frame() is doing some substitute magic to get the column name coefficient there.

> coefficient = 3
> data.frame(value1 = 5, coefficient)
  value1 coefficient
1      5           3

Beyond that these two pieces of code are doing subtly but crucially different things; in the latter, coefficient is a variable in the data.frame, and when transform resolves that symbol during calculation of value2, it *gets the column in the incoming data.frame*. In the former case, coefficient does not exist in the data.frame, so the symbol is being resolved somewhere else in the scope chain (in this case, the global environment).
These happen to be the same, except for the column name, but we can see the difference if we change the code to

> coefficient <- 3
> data.frame(value1 = 5, coefficient = 4) |> transform(value2 = value1 * coefficient)
  value1 coefficient value2
1      5           4     20
> data.frame(value1 = 5) |> transform(coefficient = 4, value2 = value1 * coefficient)
  value1 coefficient *value2*
1      5           4     *15*

Please note that another way this difference could rear its head is if these aren't directly one after each other in a pipe:

> coefficient <- 3
> df1 <- data.frame(value1 = 5, coefficient)
> coefficient <- 4
> df2 <- data.frame(value1 = 5)
> df1 |> transform(value2 = value1 * coefficient)
  value1 coefficient value2
1      5           3     15
> df2 |> transform(coefficient, value2 = value1 * coefficient)
  value1 X4 value2
1      5  4     20

Because you know someday the place where you do that transform and the place where coefficient is initially set are gonna be far away from each other, so whether you put coefficient into the incoming data or don't will matter.

Best,
~G
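The lookup rule described in the message above can be checked in a few lines. This is a minimal sketch, not part of the original thread: the object names (with_col, without_col, res_col, res_env, by_hand) are made up for illustration, and the final eval() call is only an approximation of what transform() does internally, assuming it evaluates its arguments with the data frame first and the calling environment second.

```r
coefficient <- 3

# data.frame() names an unnamed argument after the expression passed in
col_names <- names(data.frame(value1 = 5, coefficient))

# Case 1: 'coefficient' is a column, so transform() finds it in the data
with_col <- data.frame(value1 = 5, coefficient = 4)
res_col  <- transform(with_col, value2 = value1 * coefficient)

# Case 2: no such column, so the symbol is found in the calling environment
without_col <- data.frame(value1 = 5)
res_env     <- transform(without_col, value2 = value1 * coefficient)

# The same two-step lookup written by hand (illustrative approximation):
# look in the data frame first, then fall back to the current environment
by_hand <- eval(quote(value1 * coefficient), without_col, environment())

print(col_names)
print(res_col)
print(res_env)
print(by_hand)
```

The first transform() call uses the column value 4 and the second uses the variable value 3, which is the same divergence the 20 versus 15 outputs in the message show.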
Martin Maechler
2023-Mar-03 15:25 UTC
[Rd] transform.data.frame() ignores unnamed arguments when no named argument is provided
>>>>> Gabriel Becker
>>>>>     on Thu, 2 Mar 2023 14:37:18 -0800 writes:

    > On Thu, Mar 2, 2023 at 2:02 PM Antoine Fabri
    > <antoine.fabri at gmail.com> wrote:

    >> Thanks and good point about unspecified behavior. The way
    >> it behaves now (when it doesn't ignore) is more
    >> consistent with data.frame() though so I prefer that to a
    >> "warn and ignore" behaviour:
    >>
    >> data.frame(a = 1, b = 2, 3)
    >> #>   a b X3
    >> #> 1 1 2  3
    >>
    >> data.frame(a = 1, 2, 3)
    >> #>   a X2 X3
    >> #> 1 1  2  3
    >>
    >> (and in general warnings make for unpleasant debugging so
    >> I prefer when we don't add new ones if avoidable)

    > I find silence to be much more unpleasant in practice when
    > debugging, myself, but that may be a personal preference.

+1

I also *strongly* disagree with the claim

   " in general warnings make for unpleasant debugging "

That may be true for beginners (for whom debugging is often not
really feasible anyway ..), but somewhat experienced useRs should
know about

    options(warn = 1) # or
    options(warn = 2) # plus  options(error = recover)
    # or
    tryCatch( ...,  warning = ..)

or {even more}

Martin

--
Martin Maechler
ETH Zurich  and  R Core team
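As a rough illustration of the tools listed in the message above, the sketch below shows a warning being caught with tryCatch() and then being promoted to an error with options(warn = 2). The function noisy() and its warning text are hypothetical stand-ins, not anything transform() currently emits.

```r
# Hypothetical function that warns, standing in for a call under debugging
noisy <- function(x) {
  warning("unnamed argument ignored")
  x
}

# Option 1: intercept the warning as a condition and inspect its message
caught <- tryCatch(noisy(1), warning = function(w) conditionMessage(w))

# Option 2: temporarily promote warnings to errors, so the failure stops
# at its source (interactively, this pairs with options(error = recover))
old <- options(warn = 2)
promoted <- tryCatch(noisy(1), error = function(e) e)
options(old)  # restore the previous warning level

print(caught)
print(inherits(promoted, "error"))
```

With warn = 2 and no handler in place, the call would stop with an error at the point where the warning was raised, which is what makes the offending line easy to locate.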