Martin Maechler
2019-Sep-18 08:35 UTC
[Rd] '==' operator: inconsistency in data.frame(...) == NULL
>>>>> Hilmar Berger
>>>>>     on Sat, 14 Sep 2019 13:31:27 +0200 writes:

    > Dear all,

    > I did some more tests regarding the == operator in Ops.data.frame (see
    > below). All tests done in R 3.6.1 (x86_64-w64-mingw32).

    > I find that errors are thrown also when comparing a zero length
    > data.frame to atomic objects with length>0 which should be a valid case
    > according to the documentation. This can be traced to a check in the
    > last line of Ops.data.frame which tests for the presence of an empty
    > result value (i.e. list() ) but does not handle a list of empty values
    > (i.e. list(logical(0))) which in fact is generated in those cases.

    > There is a simple fix (see also below).

I'm pretty sure what you write above is wrong: For some reason
you must have changed more in your own version of Ops.data.frame :

Because there's a line

    value <- unlist(value, ...)

there, value is *not* list(logical(0)) there, but rather logical(0)
and then indeed, your proposed line change (at the end of Ops.data.frame)
has no effect for the examples you give.

Note also that your analysis -- treating all 0-extent data frames or
matrices the same -- is very incomplete. A 0 x 0 matrix is not the
same as a 0 x 1 matrix etc, and similar for data frames.

Here's an extended "testing" script which takes into account some of
the above :

##-----------------------------------------------------------
d0   <- data.frame(a = numeric(0)) # zero length data.frame
d00  <- unname(d0)                 # zero length data.frame __without names__
d3   <- data.frame(a=1:3)          # non-empty data.frame
d30. <- d3[,FALSE]                 # 3 x 0 -- take into account, too !
d30  <- unname(d30.)

m01. <- matrix(,0,1, dimnames=list(NULL,"a"))  # 0 x 1 matrix with dimnames
m01  <- unname(m01.)
m00. <- matrix(,0,0, dimnames=list(NULL,NULL)) # 0 x 0 matrix with dimnames
m00  <- unname(m00.)
m3   <- data.matrix(d3)

##------------------------

## error A:
##  Error in matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn,  :
##  length of 'dimnames' [2] not equal to array extent

d0   == 1 # error A
d00  == 1 # <0 x 0 matrix>
d30. == 1 # <3 x 0 matrix>
d30  == 1 # <3 x 0 matrix>
d3   == 1 # <3 x 1 matrix>

d0   == logical(0) # error A
d00  == logical(0) # <0 x 0 matrix>
d30. == logical()  # <3 x 0 matrix>
d30  == logical()  # <3 x 0 matrix>
d3   == logical(0) # error A

d0   == NULL # error A
d00  == NULL # <0 x 0 matrix>
d30. == NULL # <3 x 0 matrix>
d30  == NULL # <3 x 0 matrix>
d3   == NULL # error A

m00 == d0  # error A
m00 == d00 # <0 x 0 matrix>
m00 == d3  # error A

# 0-length matrix for comparison :
identical(m00., m00. == 1) ## 0 x 0 matrix *with* "invisible" dimnames [ NULL, NULL ]
identical(m00., m00. == logical(0))
identical(m00., m00. == NULL)
identical(m00,  m00  == 1) ## 0 x 0 matrix w/o dimnames
identical(m00,  m00  == logical(0))
identical(m00,  m00  == NULL)

## 0 x 1 ---------------------
identical(m01., m01. == 1)          # < 0 x 1 matrix> *with* dimnames
identical(m01., m01. == logical(0)) #    "    "    "
identical(m01., m01. == NULL)       #    "    "    "
identical(m01,  m01  == 1)          # < 0 x 1 matrix> w/o dimnames
identical(m01,  m01  == logical(0)) # < 0 x 1 matrix>
identical(m01,  m01  == NULL)       # < 0 x 1 matrix>
##-----------------------------------------------------------

Best regards,
Martin
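Since the outcomes in the script above were recorded for R 3.6.1 and may differ in other versions, it can help to tabulate them automatically rather than read them off the console. The following is only a sketch: dim_or_error() is a hypothetical helper that is not part of the original script, and only a few of the cases are listed.

```r
# Hypothetical helper (an assumption, not from the thread): evaluate a
# comparison and report either the dimensions of the result or the error.
dim_or_error <- function(expr) {
  tryCatch({
    res <- expr                          # forces the lazily evaluated argument
    paste(dim(res), collapse = " x ")
  }, error = function(e) paste("error:", conditionMessage(e)))
}

d0 <- data.frame(a = numeric(0))   # 0 x 1 data frame, as in the script
d3 <- data.frame(a = 1:3)          # 3 x 1 data frame, as in the script

# Results depend on the R version in use, so none are asserted here.
results <- c(
  "d3 == 1"          = dim_or_error(d3 == 1),
  "d0 == 1"          = dim_or_error(d0 == 1),
  "d3 == logical(0)" = dim_or_error(d3 == logical(0)),
  "d3 == NULL"       = dim_or_error(d3 == NULL)
)
print(results)
```

Running this under two R versions and comparing the printed vectors shows which cases changed behaviour.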
Martin Maechler
2019-Sep-18 09:29 UTC
[Rd] '==' operator: inconsistency in data.frame(...) == NULL
>>>>> Martin Maechler
>>>>>     on Wed, 18 Sep 2019 10:35:42 +0200 writes:

>>>>> Hilmar Berger
>>>>>     on Sat, 14 Sep 2019 13:31:27 +0200 writes:

    >> Dear all,

    >> I did some more tests regarding the == operator in Ops.data.frame (see
    >> below). All tests done in R 3.6.1 (x86_64-w64-mingw32).

    >> I find that errors are thrown also when comparing a zero length
    >> data.frame to atomic objects with length>0 which should be a valid case
    >> according to the documentation. This can be traced to a check in the
    >> last line of Ops.data.frame which tests for the presence of an empty
    >> result value (i.e. list() ) but does not handle a list of empty values
    >> (i.e. list(logical(0))) which in fact is generated in those cases.

    >> There is a simple fix (see also below).

    > I'm pretty sure what you write above is wrong: For some reason
    > you must have changed more in your own version of Ops.data.frame :

    > Because there's a line

    >     value <- unlist(value, ...)

    > there, value is *not* list(logical(0)) there, but rather logical(0)
    > and then indeed, your proposed line change (at the end of Ops.data.frame)
    > has no effect for the examples you give.

On the other hand, there *is* a simple "fix" at the end of
Ops.data.frame() which makes all your examples "work" (i.e. not
give an error), namely

----------------------------------------------------------------------
@@ -1685,7 +1684,7 @@
     else { ## 'Logic' ("&","|") and 'Compare' ("==",">","<","!=","<=",">=") :
         value <- unlist(value, recursive = FALSE, use.names = FALSE)
         matrix(if(is.null(value)) logical() else value,
-               nrow = nr, dimnames = list(rn,cn))
+               nrow = nr, ncol = length(cn), dimnames = list(rn,cn))
     }
----------------------------------------------------------------------

i.e., explicitly specifying 'ncol' compatibly with the column names.
However, I guess that this change would *not* signal errors
where it *should* and so am *not* (yet?) proposing to "do" it.
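The effect of adding 'ncol' can be reproduced with matrix() alone, outside of Ops.data.frame. This is a minimal sketch under stated assumptions: the value of cn and the row names are made up to mirror the one-column data frames of the test script, and the last case is only one plausible reading of "would not signal errors where it should".

```r
# Assumed column names, as for a data frame with a single column 'a'.
cn <- "a"

# Without 'ncol': zero-length data with nrow = 0 gives a 0 x 0 matrix,
# so a single column name does not fit and matrix() signals an error.
msg0 <- tryCatch({
  matrix(logical(), nrow = 0, dimnames = list(NULL, cn))
  NA_character_                      # reached only if no error occurred
}, error = function(e) conditionMessage(e))
print(msg0)

# With 'ncol = length(cn)': the zero-row case becomes a 0 x 1 matrix.
ok0 <- matrix(logical(), nrow = 0, ncol = length(cn),
              dimnames = list(NULL, cn))
print(dim(ok0))

# With three rows and zero-length data (the 'd3 == logical(0)' situation),
# matrix() fills the result with NA instead of signalling an error.
na3 <- matrix(logical(), nrow = 3, ncol = length(cn),
              dimnames = list(c("1", "2", "3"), cn))
print(na3)
```

The NA fill in the last call follows from the documented behaviour of matrix() for zero-length data, which is why the one-line change trades an error for a silent result there.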

Another remark, on S4, which you've raised several times:
As you may know, the 'Matrix' package (part of every
"regular" R installation) uses S4 "everywhere" and it does
define many methods for its Matrix classes, all in source file Matrix/R/Ops.R,
the development version (in svn / subversion) being online on R-forge here:

  https://r-forge.r-project.org/scm/viewvc.php/pkg/Matrix/R/Ops.R?view=markup&root=matrix

and "of course", there we define S4 group methods for Ops all
the time, and (almost) never S3 ones...
[[but I hope you don't want to start combining data frames
  with Matrix package matrices, now !]]

Martin Maechler
ETH Zurich and R Core Team
Hilmar Berger
2019-Sep-24 17:31 UTC
[Rd] '==' operator: inconsistency in data.frame(...) == NULL
Dear Martin,

thanks a lot for looking into this. Of course you were right that the
fix was not complete - I apologize for not having tested what I believed
to be the solution.

My comments on the S4 classes seem to stem from a misunderstanding on
my side. I now believe I understand that S4 classes that inherit from R
base object types might dispatch Ops for the same object types.

If the base object value of such S4 classes is unset and therefore
empty, this empty value will be passed on to e.g. Ops.data.frame, where
it would trigger the same issue as e.g. logical(0).

setClass("MyClass", slots = list(x="numeric", label="character"),
         contains = "numeric")
a = new("MyClass", x=3, label="FOO")
a@.Data
 > logical(0)

a == data.frame(a=1:3)
# error

I understand that this is all as expected and the error should most
likely disappear with the fix you submitted for other 0-extent cases.

Thanks again and best regards,
Hilmar
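The point about the unset data part can be checked directly. This is a small sketch that reuses the class definition from the message above; the second object, b, is an addition for contrast and is not from the thread. Only the length of the data part is inspected, since the outcome of the comparison itself may depend on the R version.

```r
library(methods)

# Same class as in the message: an S4 class extending "numeric".
setClass("MyClass",
         slots = list(x = "numeric", label = "character"),
         contains = "numeric")

# Only the extra slots are set, so the inherited data part stays empty.
a <- new("MyClass", x = 3, label = "FOO")
print(length(a@.Data))

# Passing an unnamed numeric vector to new() sets the data part
# (b is an illustrative addition, not part of the original message).
b <- new("MyClass", c(1, 2, 3), x = 3, label = "FOO")
print(length(b@.Data))
```

An object like a therefore behaves as a zero-length vector in a comparison with a data frame, which is the 0-extent situation discussed earlier in the thread.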