Rui Barradas
2022-Jul-20 08:15 UTC
[Rd] [External] Warning with new placeholder piped to data.frame extractors `[` and `[[`.
Hello,

I agree with several points you've made.

The code of the data.frame methods for `[` and `[[` is already complicated enough and a revision is probably not worth the effort; constructs like piping to `[` and `[[` do not further the cause of readability; and a new base R function like dplyr::pull would put an extra development and maintenance burden on the R Core Team, to whom we are greatly indebted for their excellent, difficult and time-consuming work developing, maintaining and evolving R over the years.

My question (if the named argument syntax is mandatory, then it should not throw a warning) seems to have raised a consensus that this use of the new pipe operator and placeholder should be discouraged (Toby), considered a bug (Gabriel) or maybe intentional (Duncan). Definitely an unclear idiom to be avoided, and not a priority.

I still find it strange, but if R is telling the programmer to write better code, then follow the advice.

(As a side note, all of the following work as expected:

1:6 |> `[`(x = _, 2)
1:6 |> `[[`(x = _, 2)

matrix(1:6, nrow = 3) |> `[`(x = _, 2, 2)
matrix(1:6, nrow = 3) |> `[`(x = _, 2, )
matrix(1:6, nrow = 3) |> `[`(x = _, , 2)

list(1:6, b = 7:10) |> `[`(x = _, 2)
list(1:6, b = 7:10) |> `[[`(x = _, 2)
list(1:6, b = 7:10) |> `$`(x = _, 'b')

So this is specific to the data.frame methods.)

Hope this helps,

Rui Barradas

At 23:44 on 18/07/2022, luke-tierney at uiowa.edu wrote:
> On Sat, 16 Jul 2022, Rui Barradas wrote:
>
>> Hello,
>>
>> When piping to any of `[.data.frame` or `[[.data.frame`, the
>> placeholder is mandatory.
>>
>>
>> df1 <- data.frame(y = 1:10, f = rep(c("a", "b"), each = 5))
>>
>> aggregate(y ~ f, df1, mean) |> `[`('y')
>> # Error: function '[' not supported in RHS call of a pipe
>>
>> aggregate(y ~ f, df1, mean) |> `[[`('y')
>> # Error: function '[' not supported in RHS call of a pipe
>>
>>
>> But if used it throws a warning.
>>
>>
>> aggregate(y ~ f, df1, mean) |> `[`(x = _, 'y')
>> #> Warning in `[.data.frame`(x = aggregate(y ~ f, df1, mean), "y"): named arguments
>> #> other than 'drop' are discouraged
>> #>   y
>> #> 1 3
>> #> 2 8
>>
>> aggregate(y ~ f, df1, mean) |> `[[`(x = _, 'y')
>> #> Warning in `[[.data.frame`(x = aggregate(y ~ f, df1, mean), "y"): named
>> #> arguments other than 'exact' are discouraged
>> #> [1] 3 8
>>
>
> The pipe syntax requires that the placeholder be used as a named
> argument.  If you do that, then the syntax is legal and parses
> successfully.
>
>> Hasn't this become inconsistent behavior?
>> More than merely right, the named argument is mandatory, it shouldn't
>> give warnings.
>
> Any R function can decide whether it wants to allow explicitly named
> arguments.  Disallowing or discouraging using explicitly named
> arguments requires some work and is usually not a good idea. In the
> case of the data.frame methods for [ and [[ the decision was made to
> discourage using named arguments other than 'exact'. This seems to
> have been to allow a more expedient way to implement these
> functions. This could be revisited, but I doubt it is worth the effort.
>
> For me the main reason for using pipes is to make code more
> readable. Using `[` and such constructs is not furthering that
> cause. When I use pipes I am almost always using tidyverse
> features, so I have dplyr::pull available, which is more readable,
> to me at least. Arguably, base R could have a similar function,
> but again I doubt this would be a good investment of time.
>
> An option that we have experimented with is to allow the placeholder
> at the head of an extraction chain. This is supported in the
> experimental branch at
> https://svn.r-project.org/R/branches/R-syntax. So for example:
>
>     > mtcars |> _$cyl[1]
>     [1] 6
>
> This may make it into R-devel for the next release, but it still needs
> more testing.
>
> Best,
>
> luke
>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
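For readers who want to extract a column at the end of a pipeline without naming `x`, the sketch below shows two base R spellings alongside the placeholder form discussed above. It assumes R >= 4.2.0 (needed for the `_` placeholder); the variable names `agg`, `warned`, `v0`, `v1` and `v2` are illustrative only, and whether the first form still warns may depend on the R version, so the code collects any warning rather than assuming one.

```r
df1 <- data.frame(y = 1:10, f = rep(c("a", "b"), each = 5))
agg <- aggregate(y ~ f, df1, mean)

# Placeholder form from the thread. Any warning is collected and
# muffled so the result can still be compared with the alternatives.
warned <- character()
v0 <- withCallingHandlers(
  agg |> `[[`(x = _, "y"),
  warning = function(w) {
    warned <<- c(warned, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Alternative 1: wrap the extraction in an anonymous function,
# so no argument has to be named.
v1 <- agg |> (\(d) d[["y"]])()

# Alternative 2: base R's getElement(), which takes the object first.
v2 <- agg |> getElement("y")

print(warned)  # empty if this R version does not warn
print(v1)
```

Both alternatives pass the data frame by position, so they sidestep the named-argument check in the data.frame methods rather than changing it.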
luke-tierney at uiowa.edu
2022-Jul-21 01:16 UTC
[Rd] [External] Warning with new placeholder piped to data.frame extractors `[` and `[[`.
On Wed, 20 Jul 2022, Rui Barradas wrote:

> Hello,
>
> I agree with several points you've made.
>
> The code of the data.frame methods for `[` and `[[` is already complicated
> enough and a revision is probably not worth the effort; constructs like
> piping to `[` and `[[` do not further the cause of readability; and a new
> base R function like dplyr::pull would put an extra development and
> maintenance burden on the R Core Team, to whom we are greatly indebted for
> their excellent, difficult and time-consuming work developing, maintaining
> and evolving R over the years.
>
> My question (if the named argument syntax is mandatory, then it should not
> throw a warning) seems to have raised a consensus that this use of the new
> pipe operator and placeholder should be discouraged (Toby), considered a bug
> (Gabriel) or maybe intentional (Duncan). Definitely an unclear idiom to be
> avoided, and not a priority.
>
> I still find it strange, but if R is telling the programmer to write better
> code, then follow the advice.
>
> (As a side note, all of the following work as expected:
>
> 1:6 |> `[`(x = _, 2)
> 1:6 |> `[[`(x = _, 2)

Depends on what you expect. This is probably not what you expect:

    > `[`(2, x = 1:6)
    [1]  2 NA NA NA NA NA

For performance reasons many primitives were implemented to not do argument matching on named arguments but to accept arguments by position. This is particularly true for syntactically special functions like arithmetic and extraction operators. You can use named arguments in these, but the names are ignored by the default methods, which just go by position. S3 methods implemented as R functions usually will handle the named arguments in the usual way, but can choose not to, as the data.frame extraction methods do.

Arguably the performance issue is now moot as almost all performance-critical code will be byte compiled. But adding argument matching in all primitives is not something I can see getting high priority at the moment.
As far as I can see, it looks like dropping the warning for a named 'x' argument in the S3 extraction methods for data.frame would be fairly straightforward and shouldn't cause any disruption. But this wouldn't make it into a release until the placeholder is allowed at the head of an extraction chain, assuming we go there.

Best,

luke

>
> matrix(1:6, nrow = 3) |> `[`(x = _, 2, 2)
> matrix(1:6, nrow = 3) |> `[`(x = _, 2, )
> matrix(1:6, nrow = 3) |> `[`(x = _, , 2)
>
> list(1:6, b = 7:10) |> `[`(x = _, 2)
> list(1:6, b = 7:10) |> `[[`(x = _, 2)
> list(1:6, b = 7:10) |> `$`(x = _, 'b')
>
> So this is specific to the data.frame methods.)
>
> Hope this helps,
>
> Rui Barradas
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
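The difference between positional handling in the primitive `[` and ordinary name matching in an R-level function, as described in the message above, can be seen in a short sketch. The helper `extract` is hypothetical and exists only for contrast; it is not part of base R.

```r
# The default method of the primitive `[` ignores argument names and
# goes by position: 2 is treated as the object, 1:6 as the index.
r1 <- `[`(2, x = 1:6)
print(r1)  # 2 followed by five NAs, as shown in the message above

# An ordinary R function (a closure) matches arguments by name first:
# x = 1:6 is the object and the unnamed 2 falls through to i.
extract <- function(x, i) x[i]  # hypothetical helper for illustration
r2 <- extract(2, x = 1:6)
print(r2)  # the second element of 1:6
```

The same call shape therefore gives different results depending on whether the callee is a positional primitive or a function that performs standard argument matching, which is why the piped examples with `x = _` only "work as expected" when the placeholder is also the first argument.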