Jennifer Lyon
2019-Nov-18 22:22 UTC
[Rd] readBin should check that its endian argument is a legal value
I think it would be helpful if readBin checked that its endian argument is a legal value. Why? I was reviewing some of our code and noticed that the author had readBin(..., endian="network") and never having heard of "network", I looked at the man page for readBin, and it hadn't heard of "network" either. Not good. I then looked at the R code for readBin, which has this line: swap <- endian != .Platform$endian and swap is passed into the .Internal(readBin). Further use of Google revealed that "network" is a known endian in the universe, and our code was working by essentially a lucky chance that the data was "big" and our current machines are "little". Really not good. I don't know enough about endian stuff to know if it makes sense that "network" should be one of the choices for endian for readBin (which from the documentation currently are: "big", "little" or "swap"), but in my opinion R should have failed with the choice of "network" in our code in the current version of R. I did eventually find an aside in the R Data Import/Export document that "network" is "big" so I will patch our code to be legal. Of course some code may depend on this undocumented behavior, but I would guess that in the vast majority of cases an illegal value really is a mistake. Jen> sessionInfo()R version 3.6.1 (2019-07-05) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 18.04.3 LTS Matrix products: default BLAS: /home/mbr/r-project/R-3.6.1/lib/libRblas.so LAPACK: /home/mbr/r-project/R-3.6.1/lib/libRlapack.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_3.6.1 [[alternative HTML version deleted]]
Tomas Kalibera
2019-Dec-18 11:12 UTC
[Rd] readBin should check that its endian argument is a legal value
Thank you for reporting this problem, R-devel now has a check in readBin and writeBin. I've identified two CRAN packages with an incorrect value for "endian" and reported to maintainers, unfortunately in their case the intention was to specify "little". Best Tomas On 11/18/19 11:22 PM, Jennifer Lyon wrote:> I think it would be helpful if readBin checked that its endian argument is > a legal value. > > Why? I was reviewing some of our code and noticed that the author had > readBin(..., endian="network") and never having heard of "network", I > looked at the man page for readBin, and it hadn't heard of "network" > either. Not good. > > I then looked at the R code for readBin, which has this line: > swap <- endian != .Platform$endian > > and swap is passed into the .Internal(readBin). Further use of Google > revealed that "network" is a known endian in the universe, and our code was > working by essentially a lucky chance that the data was "big" and our > current machines are "little". Really not good. I don't know enough about > endian stuff to know if it makes sense that "network" should be one of the > choices for endian for readBin (which from the documentation currently are: > "big", "little" or "swap"), but in my opinion R should have failed with the > choice of "network" in our code in the current version of R. I did > eventually find an aside in the R Data Import/Export document that > "network" is "big" so I will patch our code to be legal. Of course some > code may depend on this undocumented behavior, but I would guess that in > the vast majority of cases an illegal value really is a mistake. > > Jen > >> sessionInfo() > R version 3.6.1 (2019-07-05) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 18.04.3 LTS > > Matrix products: default > BLAS: /home/mbr/r-project/R-3.6.1/lib/libRblas.so > LAPACK: /home/mbr/r-project/R-3.6.1/lib/libRlapack.so > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_3.6.1 > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
Jennifer Lyon
2019-Dec-18 16:12 UTC
[Rd] readBin should check that its endian argument is a legal value
Thank you! Jen On Wed, Dec 18, 2019 at 4:12 AM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:> Thank you for reporting this problem, R-devel now has a check in readBin > and writeBin. > > I've identified two CRAN packages with an incorrect value for "endian" > and reported to maintainers, unfortunately in their case the intention > was to specify "little". > > Best > Tomas > > On 11/18/19 11:22 PM, Jennifer Lyon wrote: > > I think it would be helpful if readBin checked that its endian argument > is > > a legal value. > > > > Why? I was reviewing some of our code and noticed that the author had > > readBin(..., endian="network") and never having heard of "network", I > > looked at the man page for readBin, and it hadn't heard of "network" > > either. Not good. > > > > I then looked at the R code for readBin, which has this line: > > swap <- endian != .Platform$endian > > > > and swap is passed into the .Internal(readBin). Further use of Google > > revealed that "network" is a known endian in the universe, and our code > was > > working by essentially a lucky chance that the data was "big" and our > > current machines are "little". Really not good. I don't know enough about > > endian stuff to know if it makes sense that "network" should be one of > the > > choices for endian for readBin (which from the documentation currently > are: > > "big", "little" or "swap"), but in my opinion R should have failed with > the > > choice of "network" in our code in the current version of R. I did > > eventually find an aside in the R Data Import/Export document that > > "network" is "big" so I will patch our code to be legal. Of course some > > code may depend on this undocumented behavior, but I would guess that in > > the vast majority of cases an illegal value really is a mistake. > > > > Jen > > > >> sessionInfo() > > R version 3.6.1 (2019-07-05) > > Platform: x86_64-pc-linux-gnu (64-bit) > > Running under: Ubuntu 18.04.3 LTS > > > > Matrix products: default > > BLAS: /home/mbr/r-project/R-3.6.1/lib/libRblas.so > > LAPACK: /home/mbr/r-project/R-3.6.1/lib/libRlapack.so > > > > locale: > > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > > [9] LC_ADDRESS=C LC_TELEPHONE=C > > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > loaded via a namespace (and not attached): > > [1] compiler_3.6.1 > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > >[[alternative HTML version deleted]]
Reasonably Related Threads
- readBin should check that its endian argument is a legal value
- getting corrupted data when using readBin() after seek() on a gzfile connection
- Questions about readBin function (Was: dec2bin?)
- Typo fix for readBin.Rd
- Using readBin to read binary "unformatted" output files from Fortran?