Ivan Krylov
2023-Apr-14 10:38 UTC
[Rd] sum(), min(), max(), prod() vs. named arguments in ...
Hello R-devel, As mentioned on Fosstodon [1] and discussed during RCOH [2], named arguments to sum(), min(), max() and prod() are currently processed the same way as the un-named ones. Additionally, when specifying the na.rm argument more than once, the last specified value is silently taken into account. Most of this behaviour is exactly as documented, but it's arguably a source of mistakes when the na.rm=TRUE argument is mistyped and converted to 1. Internally, all these calls land in do_summary(), and the na.rm argument is processed by fixup_NaRm(). mean() is safe because it's dispatched via S3 lookup. Do users normally name their arguments to sum()? If not, it could be relatively safe to make it a warning to pass named arguments to sum() that are not na.rm and later transition it to an error: --- src/main/summary.c (revision 84252) +++ src/main/summary.c (working copy) @@ -419,6 +419,8 @@ na_value = CAR(a); if(prev == R_NilValue) args = CDR(a); else SETCDR(prev, CDR(a)); + } else if (TAG(a) && TAG(a) != R_NilValue) { + warning("Named argument \"%s\" here is probably a mistake", CHAR(PRINTNAME(TAG(a)))); } prev = a; } (Can TAG(a) ever be NULL or anything other than R_NilValue or a symbol here? We'll probably need to translate this message, too.) Passing na.rm more than once could be made into an error right away: --- src/main/summary.c (revision 84252) +++ src/main/summary.c (working copy) @@ -409,6 +409,7 @@ attribute_hidden SEXP fixup_NaRm(SEXP args) { + Rboolean seen_NaRm = FALSE; SEXP t, na_value; /* Need to make sure na.rm is last and exists */ @@ -415,7 +416,9 @@ na_value = ScalarLogical(FALSE); for(SEXP a = args, prev = R_NilValue; a != R_NilValue; a = CDR(a)) { if(TAG(a) == R_NaRmSymbol) { + if(seen_NaRm) error("Please specify na.rm only once"); + seen_NaRm = TRUE; if(CDR(a) == R_NilValue) return args; na_value = CAR(a); if(prev == R_NilValue) args = CDR(a); else SETCDR(prev, CDR(a)); (Can we use error(_("formal argument \"%s\" matched by multiple actual arguments"), "na.rm") to emit an already translated error message?) Additionally, is it desirable to have S3 methods for sum() and its friends? Currently, they are dispatched, but something confusing happens to the arguments: according to the method, the dots contain what I would expect them to contain (only the numbers to sum), but when calling NextMethod() to compute the actual sum, extra arguments get processed: sum.foo <- function(..., ExtraArg = 999, na.rm = FALSE) { print(ExtraArg) str(...) NextMethod(generic = 'sum', ..., na.rm = na.rm) } sum(structure(100, class = 'foo')) # [1] 999 # 'foo' num 100 # [1] 100 sum(structure(100, class = 'foo'), ExtraArg = -1) # [1] -1 # 'foo' num 100 # <-- ExtraArg = -1 missing from ... # [1] 99 # <-- but still subtracted Maybe I'm just misusing NextMethod(). This is probably not important in practice due to how S3 dispatch works. -- Best regards, Ivan [1] https://fosstodon.org/@nibsalot/110105034507818285 [2] https://github.com/r-devel/rcontribution/blob/main/office_hours/2023-04-13_EMEA-APAC.md
Possibly Parallel Threads
- sum(), min(), max(), prod() vs. named arguments in ...
- making makepredictcall() work
- [PATCH 2/3] New API: guestfs_blockdev_setra: Adjust readahead for filesystems and devices.
- Speed improvement to evalList
- round() and signif() do not check argument names when a single argument is given