Deepayan Sarkar
2023-Jan-16 14:49 UTC
[Rd] Recycling in arithmetic operations involving zero-length vectors
On Mon, Jan 16, 2023 at 7:28 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> On 16/01/2023 6:55 a.m., David Winsemius wrote:
> >
> >
> > Sent from my iPhone
> >
> >> On Jan 16, 2023, at 6:11 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> >>
> >> On 16/01/2023 5:23 a.m., Roland Fuß wrote:
> >>> Dear R-core,
> >>> The language definition is very clear:
> >>> "As from R 1.4.0, any arithmetic operation involving a zero-length
> >>> vector has a zero-length result."
> >>> Thus, `1 + numeric()` returns `numeric(0)`. However, I don't find this
> >>> very intuitive because usually the shorter vector is recycled to the
> >>> length of the longer vector. Would it be possible to throw at least a
> >>> warning for such cases? I don't expect them to be intended by most users.
> >>> Best regards,
> >>
> >> The previous paragraph says "If the length of the longer vector is not a multiple of the shorter one, a warning is given." Since 1 is not a multiple of 0, that implies a warning should be given here.
> >>
> >> However, R 1.4.0 was released more than 20 years ago, so I would guess there are lots of packages intentionally using this. For example, it's a way to propagate bad inputs through a long calculation that allows a single test at the end.
> >>
> >> And even unintentional uses are unlikely to lead to problematic results: numeric(0) is usually a pretty clear signal that something is wrong.
> >>
> >> So I'd suggest a documentation change: "As from R 1.4.0, any arithmetic operation involving a zero-length vector has a zero-length result *without a warning*."
> >
> > I doubt that a documentation change will help very much. Roland is responding here with his and my surprise at the lack of a warning after witnessing my answer to an R newb Q where the impression at seeing numeric(0) was understood as the value 0. I suggested that many interpretations were possible and that a warning was given for NA generation. I stand with Roland in thinking a warning is appropriate.
>
> I didn't see this exchange, but I don't understand "a warning was given
> for NA generation". We don't get a warning for 1 + NA. Do you mean
> you'd like to get one?
>
> In any case, I think your anecdote illustrates a different problem:
> printing numeric() as numeric(0) confused a beginning user. I've also
> seen people get confused by that.
>
> Perhaps the change should be to the way numeric(0) is printed, but that
> would also have consequences, since some people test the way output is
> printed.
>
> Or perhaps we should just recognize that it's in the nature of being a
> beginning user to be confused sometimes, and just help them to grow out
> of that stage.
>
> Before a change like one of these is made, someone should make it in a
> local copy, then run R CMD check on every package on CRAN to see how
> disruptive it is. Maybe adding a warning() will uncover so few
> intentional uses that fixing them is worthwhile.

To even do that, we would have to first decide which "cases" should produce a warning. Let's say `1 + x` should give a warning when x = numeric(0). Then should `x^2` also produce a warning? Should `x^0.5`? Should `sqrt(x)`? Should `log(x)`?

-Deepayan

> The trouble is, running checks across CRAN is a very resource-intensive
> exercise, and analyzing the results is a very developer-intensive
> exercise. I'm sure the doc change is easier.
>
> Duncan Murdoch
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
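
For readers following the thread, the cases listed in the message above can be checked at the R prompt. This is only a minimal sketch in base R; the name `x` follows the message, and `res` is an illustrative variable that is not part of the original discussion.

```r
x <- numeric(0)

# Binary arithmetic with a zero-length operand gives a zero-length result,
# and no warning is raised, even though the other operand has length 1.
length(1 + x)
length(x^2)
length(x^0.5)

# Unary math functions also map a zero-length input to a zero-length output.
length(sqrt(x))
length(log(x))

# By contrast, a length mismatch between two non-empty vectors does warn.
# tryCatch() captures the warning message instead of printing it.
res <- tryCatch(1:3 + 1:2, warning = function(w) conditionMessage(w))
res
```

Every `length()` call above returns 0, which is why deciding where a warning should start and stop is not obvious: the binary and unary cases produce the same kind of result.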
GILLIBERT, Andre
2023-Jan-16 15:53 UTC
[Rd] Recycling in arithmetic operations involving zero-length vectors
Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> To even do that, we would have to first decide which "cases" should produce a warning.
> Let's say `1 + x` should give a warning when x = numeric(0). Then should `x^2` also produce a warning? Should `x^0.5`? Should `sqrt(x)`?
> Should `log(x)`?

The most probable errors would be in functions taking two arguments (e.g. `+`) for which one argument has length >= 2 while the other has length 0.

In my experience, most code with accidental zero-length propagation (e.g. a typo in data_frame$field) quickly leads to errors that are easy to debug (except for beginners), and so does not need a warning.

The only case where zero-length propagation is really dangerous, in my experience, is code using an aggregating function like sum(), all() or any(), because it silently returns a valid value for a zero-length argument.

Emitting warnings for sum(numeric(0)) would probably have too many false positives, but a (length >= 2) vs (length == 0) warning for common binary operators could sometimes catch the issue before it reaches the aggregating function.

--
Sincerely
André GILLIBERT
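
The scenario described in this message can be sketched in a few lines of R. The message proposes a warning inside R's own binary operators; the code below only imitates that rule in user code. `checked_op`, the data frame `df`, and its column names are hypothetical and exist purely for illustration.

```r
# Aggregating functions return a valid value for zero-length input.
sum(numeric(0))   # 0
all(logical(0))   # TRUE
any(logical(0))   # FALSE

# Hypothetical helper (not part of R): apply a binary operator, warning
# when one operand has length >= 2 and the other has length 0.
checked_op <- function(op, e1, e2) {
  n1 <- length(e1)
  n2 <- length(e2)
  if ((n1 >= 2 && n2 == 0) || (n2 >= 2 && n1 == 0)) {
    warning("zero-length operand combined with operand of length >= 2")
  }
  op(e1, e2)
}

df <- data.frame(height = c(1.6, 1.8, 1.7))

# Typo: the column is `height`, so `df$hieght` is NULL (length 0).
# Plain arithmetic propagates the empty vector and sum() then returns 0.
silent_total <- sum(c(2, 2, 2) * df$hieght)

# The helper raises a warning before the value reaches sum().
msg <- tryCatch(
  checked_op(`*`, c(2, 2, 2), df$hieght),
  warning = function(w) conditionMessage(w)
)
```

Here `silent_total` is 0 with no diagnostic, while `msg` holds the warning text. A call such as `checked_op(`+`, 1, numeric(0))` stays silent under this rule, since neither operand has length >= 2.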