ripley@stats.ox.ac.uk
2005-May-06 14:44 UTC
(PR#7824) [Rd] handling of zero and negative indices in
I've put this in (with some different wording). Although S blithely accepts mis-dimensioned index matrices I agree this is wrong and have made it an error. On Fri, 29 Apr 2005 tplate@acm.org wrote:> This message contains a description of what looks like a bug, examples > of the suspect behavior, a proposed change to the C code to change this > behavior, example of behavior with the fix, and suggestions for 3 places > to update the documentation to reflect the proposed behavior. It is > submitted for consideration for inclusion in R. Comments are requested. > > Currently, the code for subscripting by matrices checks that values in > the matrix are not greater than the dimensions of the array being > indexed. However, it does not check for zero or negative indices, and > blindly does index computation with them as though they were positive > indices (including for negative indices whose absolute value exceeds the > dimensions of the array being indexed). Here are examples of indexing > by matrices that do not return unequivocally sensible results (in most > cases): > > > x <- matrix(1:6,ncol=2) > > dim(x) > [1] 3 2 > > x > [,1] [,2] > [1,] 1 4 > [2,] 2 5 > [3,] 3 6 > > x[rbind(c(1,1), c(2,2))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,1))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,0))] > Error: only 0's may be mixed with negative subscripts > > x[rbind(c(1,1), c(2,2), c(0,2))] > [1] 1 5 3 > > x[rbind(c(1,1), c(2,2), c(0,3))] > Error: subscript out of bounds > > x[rbind(c(1,1), c(2,2), c(1,0))] > Error: only 0's may be mixed with negative subscripts > > x[rbind(c(1,1), c(2,2), c(2,0))] > Error: only 0's may be mixed with negative subscripts > > x[rbind(c(1,1), c(2,2), c(3,0))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(1,2))] > [1] 1 5 4 > > x[rbind(c(1,1), c(2,2), c(-1,2))] > [1] 1 5 2 > > x[rbind(c(1,1), c(2,2), c(-2,2))] > [1] 1 5 1 > > x[rbind(c(1,1), c(2,2), c(-3,2))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(-4,2))] > Error: only 0's may be mixed with negative subscripts > > x[rbind(c(1,1), c(2,2), c(-1,-1))] > Error: subscript out of bounds > > > > # range checks are not applied to negative indices > > x <- matrix(1:6, ncol=3) > > dim(x) > [1] 2 3 > > x[rbind(c(1,1), c(2,2), c(-3,3))] > [1] 1 4 1 > > x[rbind(c(1,1), c(2,2), c(-4,3))] > [1] 1 4 > > > > version > _ > platform i386-pc-mingw32 > arch i386 > os mingw32 > system i386, mingw32 > status > major 2 > minor 1.0 > year 2005 > month 04 > day 18 > language R > > > > The followed version of mat2indsub() (to replace the one in > src/main/subscript.c) allows zero indices and explicitly disallows all > negative indices: > > /* Special Matrix Subscripting: Handles the case x[i] where */ > /* x is an n-way array and i is a matrix with n columns. */ > /* This code returns a vector containing the integer subscripts */ > /* to be extracted when x is regarded as unravelled. */ > /* Negative indices are not allowed. */ > /* A zero anywhere in a row will cause a zero in the same */ > /* position in the result. */ > > SEXP mat2indsub(SEXP dims, SEXP s) > { > int tdim, j, i, k, nrs = nrows(s); > SEXP rvec; > > PROTECT(rvec = allocVector(INTSXP, nrs)); > s = coerceVector(s, INTSXP); > setIVector(INTEGER(rvec), nrs, 0); > > for (i = 0; i < nrs; i++) { > tdim = 1; > /* compute 0-based subscripts for a row (0 in the input */ > /* gets -1 in the output here) */ > for (j = 0; j < LENGTH(dims); j++) { > k = INTEGER(s)[i + j * nrs]; > if(k == NA_INTEGER) { > INTEGER(rvec)[i] = NA_INTEGER; > break; > } > if(k < 0) > error(_("cannot have negative values in matrices used as subscripts")); > if(k == 0) { > INTEGER(rvec)[i] = -1; > break; > } > if (k > INTEGER(dims)[j]) > error(_("subscript out of bounds")); > INTEGER(rvec)[i] += (k - 1) * tdim; > tdim *= INTEGER(dims)[j]; > } > /* transform to 1 based subscripting (0 in the input */ > /* gets 0 in the output here) */ > if(INTEGER(rvec)[i] != NA_INTEGER) > INTEGER(rvec)[i]++; > } > UNPROTECT(1); > return (rvec); > } > > With this version, the above commands (+ a couple more) produce the > following output (with commands that fail suitably wrapped in try() so > that they can be included in a test files.) > > > x <- matrix(1:6,ncol=2) > > dim(x) > [1] 3 2 > > x > [,1] [,2] > [1,] 1 4 > [2,] 2 5 > [3,] 3 6 > > x[rbind(c(1,1), c(2,2))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,1))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,0))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,2))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(0,3))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(1,0))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(2,0))] > [1] 1 5 > > x[rbind(c(1,1), c(2,2), c(3,0))] > [1] 1 5 > > x[rbind(c(1,0), c(0,2), c(3,0))] > numeric(0) > > x[rbind(c(1,0), c(0,0), c(3,0))] > numeric(0) > > x[rbind(c(1,1), c(2,2), c(1,2))] > [1] 1 5 4 > > x[rbind(c(1,1), c(2,NA), c(1,2))] > [1] 1 NA 4 > > x[rbind(c(1,0), c(2,NA), c(1,2))] > [1] NA 4 > > try(x[rbind(c(1,1), c(2,2), c(-1,2))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, 2))]) : > cannot have negative values in matrices used as subscripts > > try(x[rbind(c(1,1), c(2,2), c(-2,2))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-2, 2))]) : > cannot have negative values in matrices used as subscripts > > try(x[rbind(c(1,1), c(2,2), c(-3,2))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 2))]) : > cannot have negative values in matrices used as subscripts > > try(x[rbind(c(1,1), c(2,2), c(-4,2))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 2))]) : > cannot have negative values in matrices used as subscripts > > try(x[rbind(c(1,1), c(2,2), c(-1,-1))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, -1))]) : > cannot have negative values in matrices used as subscripts > > > > # verify that range checks are applied to negative indices > > x <- matrix(1:6, ncol=3) > > dim(x) > [1] 2 3 > > try(x[rbind(c(1,1), c(2,2), c(-3,3))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 3))]) : > cannot have negative values in matrices used as subscripts > > try(x[rbind(c(1,1), c(2,2), c(-4,3))]) > Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 3))]) : > cannot have negative values in matrices used as subscripts > > > > > The documentation page for ?Extract could have the following sentences > added to the end of the 2nd para in the description of arguments "i, j, > ...": > > Negative indices are not allowed in matrices used as indices. > NA and zero values are allowed: rows containing a zero are > omitted from the result, and rows containing an NA produce an > NA in the result. > > In "Introduction to R", the same material could be added to the end of > the section "Index Arrays" (in the chapter "Arrays and matrices"). > > In "R Language Definition", the third paragraph of the section "Indexing > matrices and arrays" (Section 3.4.2 in my copy) currently reads: > >> It is possible to use a matrix of integers as an index. In >> this case, the number of columns of the matrix should match >> the number of dimensions of the structure, and the result >> will be a vector with length as the number of rows of the >> matrix. The following example shows how to extract the >> elements m[1, 1] and m[2, 2] in one operation. > > It could be changed to the following: > > It is possible to use a matrix of integers as an index. > Negative indices are not allowed in matrices used as > indices. NA and zero values are allowed: rows containing a > zero are omitted from the result, and rows containing an NA > produce an NA in the result. To use a matrix as an index, > the number of columns of the matrix should match the number > of dimensions of the structure, and the result will be a > vector with length as the number of rows of the matrix > (except when the matrix contains zeros). The following > example shows how to extract the elements m[1, 1] and m[2, > 2] in one operation. > > > Additionally, indexing by matrix just goes ahead and uses matrix > elements as vector indices in cases where the number of columns in the > matrix does not match the number of dimensions in the array. E.g.: > > > x <- matrix(1:6,ncol=2) > > x[rbind(c(1,1,1), c(2,2,2))] > [1] 1 2 1 2 1 2 > > > > Would it not be preferable behavior to stop with an error in such cases? > > ______________________________________________ > R-devel@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > >-- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595