Duncan Murdoch
2023-Jan-16 13:58 UTC
[Rd] Recycling in arithmetic operations involving zero-length vectors
On 16/01/2023 6:55 a.m., David Winsemius wrote:
>
> Sent from my iPhone
>
>> On Jan 16, 2023, at 6:11 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>
>> On 16/01/2023 5:23 a.m., Roland Fuß wrote:
>>> Dear R-core,
>>> The language definition is very clear:
>>> "As from R 1.4.0, any arithmetic operation involving a zero-length
>>> vector has a zero-length result."
>>> Thus, `1 + numeric()` returns `numeric(0)`. However, I don't find this
>>> very intuitive because usually the shorter vector is recycled to the
>>> length of the longer vector. Would it be possible to throw at least a
>>> warning for such cases? I don't expect them to be intended by most users.
>>> Best regards,
>>
>> The previous paragraph says "If the length of the longer vector is not a multiple of the shorter one, a warning is given." Since 1 is not a multiple of 0, that implies a warning should be given here.
>>
>> However, R 1.4.0 was released more than 20 years ago, so I would guess there are lots of packages intentionally using this. For example, it's a way to propagate bad inputs through a long calculation that allows a single test at the end.
>>
>> And even unintentional uses are unlikely to lead to problematic results: numeric(0) is usually a pretty clear signal that something is wrong.
>>
>> So I'd suggest a documentation change: "As from R 1.4.0, any arithmetic operation involving a zero-length vector has a zero-length result *without a warning*."
>
> I doubt that a documentation change will help very much. Roland is responding here with his and my surprise at the lack of a warning after witnessing my answer to an R newb Q where the impression at seeing numeric(0) was understood as the value 0. I suggested that many interpretations were possible and that a warning was given for NA generation. I stand with Roland in thinking a warning is appropriate.

I didn't see this exchange, but I don't understand "a warning was given for NA generation". We don't get a warning for 1 + NA. Do you mean you'd like to get one?

In any case, I think your anecdote illustrates a different problem: printing numeric() as numeric(0) confused a beginning user. I've also seen people get confused by that.

Perhaps the change should be to the way numeric(0) is printed, but that would also have consequences, since some people test the way output is printed.

Or perhaps we should just recognize that it's in the nature of being a beginning user to be confused sometimes, and just help them to grow out of that stage.

Before a change like one of these is made, someone should make it in a local copy, then run R CMD check on every package on CRAN to see how disruptive it is. Maybe adding a warning() will uncover so few intentional uses that fixing them is worthwhile.

The trouble is, running checks across CRAN is a very resource-intensive exercise, and analyzing the results is a very developer-intensive exercise. I'm sure the doc change is easier.

Duncan Murdoch
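
For readers who want to check the behaviour discussed in this message, here is a minimal sketch in base R. The helper `collect_warnings` is a hypothetical name introduced only for this illustration; it is not part of R. The `as.numeric("abc")` case is only a guess at the kind of "NA generation" warning that was meant, since the original exchange is not quoted here.

```r
# Hypothetical helper (not part of base R): evaluate an expression and
# return its value together with the messages of any warnings it raised.
collect_warnings <- function(expr) {
  msgs <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = msgs)
}

zero_len <- collect_warnings(1 + numeric())      # zero-length operand
partial  <- collect_warnings(1:3 + 1:2)          # 3 is not a multiple of 2
with_na  <- collect_warnings(1 + NA)             # NA operand
coerced  <- collect_warnings(as.numeric("abc"))  # assumption: one way NA generation warns

identical(zero_len$value, numeric(0))  # TRUE: zero-length result
length(zero_len$warnings)              # 0: no warning
partial$warnings                       # the "not a multiple" recycling warning
length(with_na$warnings)               # 0: 1 + NA is silently NA
coerced$warnings                       # "NAs introduced by coercion"
```

Under these assumptions, only the partial-recycling and coercion cases warn; the zero-length and `1 + NA` cases are silent.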
Deepayan Sarkar
2023-Jan-16 14:49 UTC
[Rd] Recycling in arithmetic operations involving zero-length vectors
On Mon, Jan 16, 2023 at 7:28 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> On 16/01/2023 6:55 a.m., David Winsemius wrote:
> >
> > Sent from my iPhone
> >
> >> On Jan 16, 2023, at 6:11 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> >>
> >> On 16/01/2023 5:23 a.m., Roland Fuß wrote:
> >>> Dear R-core,
> >>> The language definition is very clear:
> >>> "As from R 1.4.0, any arithmetic operation involving a zero-length
> >>> vector has a zero-length result."
> >>> Thus, `1 + numeric()` returns `numeric(0)`. However, I don't find this
> >>> very intuitive because usually the shorter vector is recycled to the
> >>> length of the longer vector. Would it be possible to throw at least a
> >>> warning for such cases? I don't expect them to be intended by most users.
> >>> Best regards,
> >>
> >> The previous paragraph says "If the length of the longer vector is not a multiple of the shorter one, a warning is given." Since 1 is not a multiple of 0, that implies a warning should be given here.
> >>
> >> However, R 1.4.0 was released more than 20 years ago, so I would guess there are lots of packages intentionally using this. For example, it's a way to propagate bad inputs through a long calculation that allows a single test at the end.
> >>
> >> And even unintentional uses are unlikely to lead to problematic results: numeric(0) is usually a pretty clear signal that something is wrong.
> >>
> >> So I'd suggest a documentation change: "As from R 1.4.0, any arithmetic operation involving a zero-length vector has a zero-length result *without a warning*."
> >
> > I doubt that a documentation change will help very much. Roland is responding here with his and my surprise at the lack of a warning after witnessing my answer to an R newb Q where the impression at seeing numeric(0) was understood as the value 0. I suggested that many interpretations were possible and that a warning was given for NA generation. I stand with Roland in thinking a warning is appropriate.
>
> I didn't see this exchange, but I don't understand "a warning was given
> for NA generation". We don't get a warning for 1 + NA. Do you mean
> you'd like to get one?
>
> In any case, I think your anecdote illustrates a different problem:
> printing numeric() as numeric(0) confused a beginning user. I've also
> seen people get confused by that.
>
> Perhaps the change should be to the way numeric(0) is printed, but that
> would also have consequences, since some people test the way output is
> printed.
>
> Or perhaps we should just recognize that it's in the nature of being a
> beginning user to be confused sometimes, and just help them to grow out
> of that stage.
>
> Before a change like one of these is made, someone should make it in a
> local copy, then run R CMD check on every package on CRAN to see how
> disruptive it is. Maybe adding a warning() will uncover so few
> intentional uses that fixing them is worthwhile.

To even do that, we would have to first decide which "cases" should produce a warning. Let's say `1 + x` should give a warning when x = numeric(0). Then should `x^2` also produce a warning? Should `x^0.5`? Should `sqrt(x)`? Should `log(x)`?

-Deepayan

> The trouble is, running checks across CRAN is a very resource-intensive
> exercise, and analyzing the results is a very developer-intensive
> exercise. I'm sure the doc change is easier.
>
> Duncan Murdoch
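
As an illustration of the question raised above, the following sketch evaluates each of the named expressions on a zero-length input and counts the warnings raised. The variable names `results` and `n_warnings` are introduced here for illustration only. It shows current behaviour and does not suggest which cases should warn.

```r
x <- numeric(0)

n_warnings <- 0L

# Evaluate the expressions from the message, counting any warnings.
results <- withCallingHandlers(
  list(
    plus    = 1 + x,
    square  = x^2,
    pow_half = x^0.5,
    sqrt    = sqrt(x),
    log     = log(x)
  ),
  warning = function(w) {
    n_warnings <<- n_warnings + 1L
    invokeRestart("muffleWarning")
  }
)

lengths(results)   # every element has length 0
n_warnings         # 0: none of these calls warns

# The zero-length value flows through a chain of operations, so a single
# length check at the end is enough to detect the empty input.
final <- log(sqrt(1 + x)^2)
length(final) == 0L   # TRUE
```

All five expressions behave the same way today, so a warning added to `1 + x` alone would make arithmetic operators inconsistent with functions such as `sqrt()` and `log()` on the same input.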