Michael Chirico
2025-Jun-06 18:59 UTC
[Rd] Suggestion: default print method for S3 generics could offer some insights on '...' among registered methods
There is a big difference in how to think of '...' for non-generic functions like data.frame() vs. S3 generics. In the former, it means "any number of inputs" [e.g. columns]; in the latter, it means "any number of inputs [think c()], as well as any arguments that might be interpreted by class implementations". Understanding the difference for a given generic can require carefully reading lots of documentation. print(<generic>), which is useful for so many other contexts, can be a dead end. One idea is to extend the print() method to suggest to the reader which other arguments are available (among registered generics). Often ?<generic> will include the most common implementation, but not always so. For rbind (in a --vanilla session), we currently have one method, rbind.data.frame, that offers three arguments not present in the generic: make.row.names, stringsAsFactors, and factor.exclude. The proposal would be to mention this in the print(rbind) output somehow, e.g.> print(rbind)function (..., deparse.level = 1) .Internal(rbind(deparse.level, ...)) <bytecode: 0x73d4fd824e20> <environment: namespace:base> +Other arguments implemented by methods + factor.exclude: rbind.data.frame + make.row.names: rbind.data.frame + stringsAsFactors: rbind.data.frame I suggest grouping by argument, not generic, although something like this could be OK too: +Signatures of other methods + rbind.data.frame(..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = FALSE, + factor.exclude = TRUE) Where it gets more interesting is when there are many methods, e.g. for as.data.frame (again, in a --vanilla session):> print(as.data.frame)function (x, row.names = NULL, optional = FALSE, ...) { if (is.null(x)) return(as.data.frame(list())) UseMethod("as.data.frame") } <bytecode: 0x73d4fc1e70d0> <environment: namespace:base> +Other arguments implemented by methods + base: as.data.frame.table + check.names: as.data.frame.list + col.names: as.data.frame.list + cut.names: as.data.frame.list + fix.empty.names: as.data.frame.list + make.names: as.data.frame.matrix, as.data.frame.model.matrix + new.names: as.data.frame.list + nm: as.data.frame.bibentry, as.data.frame.complex, as.data.frame.Date, + as.data.frame.difftime, as.data.frame.factor, as.data.frame.integer, + as.data.frame.logical, as.data.frame.noquote, as.data.frame.numeric, + as.data.frame.numeric_version, as.data.frame.ordered, + as.data.frame.person, as.data.frame.POSIXct, as.data.frame.raw + responseName: as.data.frame.table + sep: as.data.frame.table + stringsAsFactors: as.data.frame.character, as.data.frame.list, + as.data.frame.matrix, as.data.frame.table Or +Signatures of other methods + as.data.frame.aovproj(x, ...) + as.data.frame.array(x, row.names = NULL, optional = FALSE, ...) + as.data.frame.AsIs(x, row.names = NULL, optional = FALSE, ...) + as.data.frame.bibentry(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.character(x, ..., stringsAsFactors = FALSE) + as.data.frame.citation(x, row.names = NULL, optional = FALSE, ...) + as.data.frame.complex(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.data.frame(x, row.names = NULL, ...) + as.data.frame.Date(x, row.names = NULL, optional = FALSE, ..., nm deparse1(substitute(x))) + as.data.frame.default(x, ...) + as.data.frame.difftime(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.factor(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.ftable(x, row.names = NULL, optional = FALSE, ...) + as.data.frame.integer(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.list(x, row.names = NULL, optional = FALSE, ..., cut.names = FALSE, + col.names = names(x), fix.empty.names = TRUE, new.names !missing(col.names), + check.names = !optional, stringsAsFactors = FALSE) + as.data.frame.logical(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.logLik(x, ...) + as.data.frame.matrix(x, row.names = NULL, optional = FALSE, make.names = TRUE, + ..., stringsAsFactors = FALSE) + as.data.frame.model.matrix(x, row.names = NULL, optional = FALSE, make.names = TRUE, + ...) + as.data.frame.noquote(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.numeric(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.numeric_version(x, row.names = NULL, optional FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.ordered(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.person(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.POSIXct(x, row.names = NULL, optional = FALSE, ..., nm = deparse1(substitute(x))) + as.data.frame.POSIXlt(x, row.names = NULL, optional = FALSE, ...) + as.data.frame.raw(x, row.names = NULL, optional = FALSE, ..., nm deparse1(substitute(x))) + as.data.frame.table(x, row.names = NULL, ..., responseName "Freq", stringsAsFactors = TRUE, + sep = "", base = list(LETTERS)) + as.data.frame.ts(x, ...) Obviously that's a bit more cluttered, but as.data.frame() should be a pretty unusual case. It also highlights better the differences in the two approaches: the former economizes on space and focuses on what sorts of arguments are available; the latter shows the defaults, does not hide the arguments shared with the generic, and will always produce as many lines as there are methods. There are other edge cases to think through (multiple registrations, interactions with S4, primitives, ...), but I want to first check with the list if this looks workable & valuable enough to pursue. Mike C ---- Code that helped with the above: f = as.data.frame # NB: methods() and getAnywhere() require {utils} m = methods(f) generic_args = names(formals(f)) f_methods = lapply(m, \(fn) getAnywhere(fn)$objs[[1L]]) names(f_methods) = m new_args = sapply(f_methods, \(g) setdiff(names(formals(g)), generic_args)) with( # group by argument name data.frame(method = rep(names(new_args), lengths(new_args)), arg unlist(new_args), row.names=NULL), {tbl = tapply(method, arg, toString); writeLines(paste0(names(tbl), ": ", tbl))} ) signatures=sapply(f_methods, \(g) paste(head(format(args(g)), -1), collapse="\n")) writeLines(paste0(names(signatures), gsub("^\\s*function\\s*", "", signatures)))
Joshua Ulrich
2025-Jun-08 22:57 UTC
[Rd] Suggestion: default print method for S3 generics could offer some insights on '...' among registered methods
Hi Mike, On Fri, Jun 6, 2025 at 1:59?PM Michael Chirico <michaelchirico4 at gmail.com> wrote:> > There is a big difference in how to think of '...' for non-generic > functions like data.frame() vs. S3 generics. > > In the former, it means "any number of inputs" [e.g. columns]; in the > latter, it means "any number of inputs [think c()], as well as any > arguments that might be interpreted by class implementations". > > Understanding the difference for a given generic can require carefully > reading lots of documentation. print(<generic>), which is useful for > so many other contexts, can be a dead end. > > One idea is to extend the print() method to suggest to the reader > which other arguments are available (among registered generics). Often > ?<generic> will include the most common implementation, but not always > so. > > For rbind (in a --vanilla session), we currently have one method, > rbind.data.frame, that offers three arguments not present in the > generic: make.row.names, stringsAsFactors, and factor.exclude. The > proposal would be to mention this in the print(rbind) output somehow, > e.g. > > > print(rbind) > function (..., deparse.level = 1) > .Internal(rbind(deparse.level, ...)) > <bytecode: 0x73d4fd824e20> > <environment: namespace:base> > > +Other arguments implemented by methods > + factor.exclude: rbind.data.frame > + make.row.names: rbind.data.frame > + stringsAsFactors: rbind.data.frame > > I suggest grouping by argument, not generic, although something like > this could be OK too: > > +Signatures of other methods > + rbind.data.frame(..., deparse.level = 1, make.row.names = TRUE, > stringsAsFactors = FALSE, > + factor.exclude = TRUE) > > Where it gets more interesting is when there are many methods, e.g. > for as.data.frame (again, in a --vanilla session): > > > print(as.data.frame) > function (x, row.names = NULL, optional = FALSE, ...) > { > if (is.null(x)) > return(as.data.frame(list())) > UseMethod("as.data.frame") > } > <bytecode: 0x73d4fc1e70d0> > <environment: namespace:base> > > +Other arguments implemented by methods > + base: as.data.frame.table > + check.names: as.data.frame.list > + col.names: as.data.frame.list > + cut.names: as.data.frame.list > + fix.empty.names: as.data.frame.list > + make.names: as.data.frame.matrix, as.data.frame.model.matrix > + new.names: as.data.frame.list > + nm: as.data.frame.bibentry, as.data.frame.complex, as.data.frame.Date, > + as.data.frame.difftime, as.data.frame.factor, as.data.frame.integer, > + as.data.frame.logical, as.data.frame.noquote, as.data.frame.numeric, > + as.data.frame.numeric_version, as.data.frame.ordered, > + as.data.frame.person, as.data.frame.POSIXct, as.data.frame.raw > + responseName: as.data.frame.table > + sep: as.data.frame.table > + stringsAsFactors: as.data.frame.character, as.data.frame.list, > + as.data.frame.matrix, as.data.frame.table > > Or > > +Signatures of other methods > + as.data.frame.aovproj(x, ...) > + as.data.frame.array(x, row.names = NULL, optional = FALSE, ...) > + as.data.frame.AsIs(x, row.names = NULL, optional = FALSE, ...) > + as.data.frame.bibentry(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.character(x, ..., stringsAsFactors = FALSE) > + as.data.frame.citation(x, row.names = NULL, optional = FALSE, ...) > + as.data.frame.complex(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.data.frame(x, row.names = NULL, ...) > + as.data.frame.Date(x, row.names = NULL, optional = FALSE, ..., nm > deparse1(substitute(x))) > + as.data.frame.default(x, ...) > + as.data.frame.difftime(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.factor(x, row.names = NULL, optional = FALSE, ..., nm > = deparse1(substitute(x))) > + as.data.frame.ftable(x, row.names = NULL, optional = FALSE, ...) > + as.data.frame.integer(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.list(x, row.names = NULL, optional = FALSE, ..., > cut.names = FALSE, > + col.names = names(x), fix.empty.names = TRUE, new.names > !missing(col.names), > + check.names = !optional, stringsAsFactors = FALSE) > + as.data.frame.logical(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.logLik(x, ...) > + as.data.frame.matrix(x, row.names = NULL, optional = FALSE, > make.names = TRUE, > + ..., stringsAsFactors = FALSE) > + as.data.frame.model.matrix(x, row.names = NULL, optional = FALSE, > make.names = TRUE, > + ...) > + as.data.frame.noquote(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.numeric(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.numeric_version(x, row.names = NULL, optional > FALSE, ..., nm = deparse1(substitute(x))) > + as.data.frame.ordered(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.person(x, row.names = NULL, optional = FALSE, ..., nm > = deparse1(substitute(x))) > + as.data.frame.POSIXct(x, row.names = NULL, optional = FALSE, ..., > nm = deparse1(substitute(x))) > + as.data.frame.POSIXlt(x, row.names = NULL, optional = FALSE, ...) > + as.data.frame.raw(x, row.names = NULL, optional = FALSE, ..., nm > deparse1(substitute(x))) > + as.data.frame.table(x, row.names = NULL, ..., responseName > "Freq", stringsAsFactors = TRUE, > + sep = "", base = list(LETTERS)) > + as.data.frame.ts(x, ...) > > Obviously that's a bit more cluttered, but as.data.frame() should be a > pretty unusual case. It also highlights better the differences in the > two approaches: the former economizes on space and focuses on what > sorts of arguments are available; the latter shows the defaults, does > not hide the arguments shared with the generic, and will always > produce as many lines as there are methods. > > There are other edge cases to think through (multiple registrations, > interactions with S4, primitives, ...), but I want to first check with > the list if this looks workable & valuable enough to pursue. >I like and appreciate the intent behind your suggestion, though I don't like all the extra output from printing the generic. I want to look at the function body when I print it. And as you show, it can output a lot of information you're probably not interested in. What about adding the number of methods to printed output for generics, and a suggestion to use `methods(some_generic)` to get a list of them? Then you can use help(some_method) or args(some_method) to get more information about the specific method(s) you're interested in. Best, Josh> Mike C > > ---- > > Code that helped with the above: > > f = as.data.frame > # NB: methods() and getAnywhere() require {utils} > m = methods(f) > generic_args = names(formals(f)) > f_methods = lapply(m, \(fn) getAnywhere(fn)$objs[[1L]]) > names(f_methods) = m > new_args = sapply(f_methods, \(g) setdiff(names(formals(g)), generic_args)) > with( # group by argument name > data.frame(method = rep(names(new_args), lengths(new_args)), arg > unlist(new_args), row.names=NULL), > {tbl = tapply(method, arg, toString); writeLines(paste0(names(tbl), > ": ", tbl))} > ) > signatures=sapply(f_methods, \(g) paste(head(format(args(g)), -1), > collapse="\n")) > writeLines(paste0(names(signatures), gsub("^\\s*function\\s*", "", signatures))) > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- Joshua Ulrich | about.me/joshuaulrich FOSS Trading | www.fosstrading.com