Henrik Bengtsson
2015-May-04 19:20 UTC
[Rd] Shouldn't vector indexing with negative out-of-range index give an error?
In Section 'Indexing by vectors' of 'R Language Definition' (http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Indexing-by-vectors) it says:

"Integer. All elements of i must have the same sign. If they are positive, the elements of x with those index numbers are selected. If i contains negative elements, all elements except those indicated are selected.

If i is positive and exceeds length(x) then the corresponding selection is NA. A negative out of bounds value for i causes an error.

A special case is the zero index, which has null effects: x[0] is an empty vector and otherwise including zeros among positive or negative indices has the same effect as if they were omitted."

However, the statement "A negative out of bounds value for i causes an error" in the second paragraph does not seem to apply. Instead, R silently ignores negative indices that are out of range. For example:

> x <- 1:4
> x[-9L]
[1] 1 2 3 4
> x[-c(1:9)]
integer(0)
> x[-c(3:9)]
[1] 1 2

> y <- as.list(1:4)
> y[-c(1:9)]
list()

Is the observed non-error the correct behavior, and therefore the documentation is incorrect, or is it vice versa? (...or is it me missing something?)

I get the above on R devel, R 3.2.0, and as far back as R 2.11.0 (I haven't checked earlier versions).

Thank you,

Henrik
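For reference, the other rules in the quoted passage can be checked at the console next to the disputed one. This is only a small sketch: the result names (`pos_oob`, `zero_only`, `zero_mixed`, `neg_oob`) are made up for illustration, and the last line assumes a version of R that shows the behavior reported above.

```r
x <- 1:4

# Positive index beyond length(x): the corresponding selection is NA.
pos_oob <- x[6L]

# Zero index: alone it selects nothing; mixed with other indices it is dropped.
zero_only  <- x[0L]
zero_mixed <- x[c(0L, 2L)]

# Negative index beyond length(x): silently ignored, as reported above.
neg_oob <- x[-9L]

print(pos_oob)     # NA
print(zero_only)   # integer(0)
print(zero_mixed)  # 2
print(neg_oob)     # 1 2 3 4
```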
Martin Maechler
2015-May-05 14:01 UTC
[Rd] Shouldn't vector indexing with negative out-of-range index give an error?
Thank you, Henrik!

I've checked further back: the change happened between R 2.5.1 and R 2.6.0.

The previous behavior was

  > (1:3)[-(3:5)]
  Error: subscript out of bounds

If you start reading NEWS.2, you see a *lot* of new features (and bug fixes) in the 2.6.0 news, but from my browsing, none of them mentions the new behavior as a feature.

Let's -- for a moment -- declare it a bug in the code, i.e., not in the documentation:

- As 2.6.0 happened quite a while ago (Oct. 2007), we could wonder how much R code will break if we fix the bug.

- Is the R package authors' community willing to do the necessary cleanup in their packages?

---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----

Now, after reading the source code for a while, and looking at the changes, I've found the log entry

------------------------------------------------------------------------
r42123 | ihaka | 2007-07-05 02:00:05 +0200 (Thu, 05 Jul 2007) | 4 lines

Changed the behaviour of out-of-bounds negative
subscripts to match that of S.  Such values are
now ignored rather than tripping an error.

------------------------------------------------------------------------

So, it was changed on purpose, by one of the true "R"s, very much on purpose.

Making it a *warning* instead of the original error may have been both more cautious and more helpful for detecting programming errors.

OTOH, John Chambers, the father of S and hence grandfather of R, may have had good reasons why it seemed more logical to silently ignore such out-of-bounds negative indices. One could argue that

    x[-5] means "leave away the 5-th element of x"

and if there is no 5-th element of x, leaving it away should be a no-op.

After all this musing and history detection, my gut decision would be to change only the documentation, which Ross forgot to change.

But of course, it may be interesting to hear other programmeRs' feedback on this.

Martin
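Code that would rather be told about out-of-range negative subscripts can check for them itself. The sketch below is one way to get the "warning instead of error" behavior mused about above at user level; `drop_checked` is a hypothetical helper, not a base R function, and it assumes `i` holds only zero or negative whole numbers.

```r
# Hypothetical helper (not part of base R): drop elements by negative index,
# but warn when any index points beyond the end of x.
drop_checked <- function(x, i) {
  stopifnot(is.numeric(i), !anyNA(i), all(i <= 0))
  oob <- abs(i) > length(x)
  if (any(oob)) {
    warning("negative subscript(s) out of bounds: ",
            paste(i[oob], collapse = ", "))
  }
  x[i]  # the subsetting itself is left to R
}

x <- 1:3
res <- withCallingHandlers(
  drop_checked(x, -(3:5)),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(res)  # 1 2
```

The helper only reports; the result is whatever plain `x[i]` returns, so existing code keeps working.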
John Chambers
2015-May-05 15:45 UTC
[Rd] Shouldn't vector indexing with negative out-of-range index give an error?
When someone suggests that we "might have had a reason" for some peculiarity in the original S, my usual reaction is "Or else we never thought of the problem".

In this case, however, there is a relevant statement in the 1988 "blue book". In the discussion of subscripting (p 358) the definition for negative i says: "the indices consist of the elements of seq(along=x) that do not match any elements in -i".

Suggesting that no bounds checking on -i takes place.

John
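The Blue Book wording can be written out almost literally in R, which makes it easy to see why no bounds check falls out of it. This is a sketch only: `drop_by_definition` is a hypothetical name, `seq_along(x)` stands in for the older spelling `seq(along=x)`, and the comparisons assume an R version with the behavior discussed in this thread.

```r
# The definition, spelled out: keep the positions of x that do not match
# any element of -i. Elements of -i beyond length(x) simply match nothing.
drop_by_definition <- function(x, i) {
  keep <- seq_along(x)[!(seq_along(x) %in% -i)]
  x[keep]
}

x <- 1:4
same_single <- identical(drop_by_definition(x, -9L),    x[-9L])
same_all    <- identical(drop_by_definition(x, -(1:9)), x[-(1:9)])
same_tail   <- identical(drop_by_definition(x, -(3:9)), x[-(3:9)])

print(c(same_single, same_all, same_tail))
```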
Martin Maechler
2015-May-06 08:33 UTC
[Rd] Shouldn't vector indexing with negative out-of-range index give an error?
Indeed!
Thanks a lot John, for the perspective and clarification!

I'm committing a patch to the documentation now.

Martin
Henrik Bengtsson
2015-May-06 16:04 UTC
[Rd] Shouldn't vector indexing with negative out-of-range index give an error?
Thank you both, and credit also to Dongcan Jiang for pointing out to me that errors were indeed not generated in this case.

I agree with the decision. It's interesting to notice that now the only way an error is generated is when index-vector subsetting is done using mixed positive and negative indices, e.g. x[c(-1,1)].

/Henrik
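The contrast mentioned in the last message can be seen by trapping the error. A minimal sketch: `mixed` is an illustrative name, and the exact error text is deliberately not shown because it may differ between R versions.

```r
x <- 1:4

# Mixing positive and negative subscripts is an error; capture its message.
mixed <- tryCatch(x[c(-1, 1)], error = function(e) conditionMessage(e))
print(mixed)

# An out-of-range negative subscript on its own is not an error.
print(x[-9L])
```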