Duncan Murdoch
2018-Apr-19 00:06 UTC
[Rd] R Bug: write.table for matrix of more than 2,147,483,648 elements
On 18/04/2018 5:08 PM, Tousey, Colton wrote:
> Hello,
>
> I want to report a bug in R that is limiting my capabilities to export a matrix with write.csv or write.table with over 2,147,483,648 elements (C's int limit). I found this bug already reported about before: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182. However, there appears to be no solution or fixes in upcoming R version releases.
>
> The error message is coming from the writetable part of the utils package in the io.c source code (https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):
>     /* quick integrity check */
>     if(XLENGTH(x) != (R_len_t)nr * nc)
>         error(_("corrupt matrix -- dims not not match length"));
>
> The issue is that nr*nc is an integer and the size of my matrix, 2.8 billion elements, exceeds C's limit, so the check forces the code to fail.

Yes, looks like a typo: R_len_t is an int, and that's how nr was declared. It should be R_xlen_t, which is bigger on machines that support big vectors.

I haven't tested the change; there may be something else in that function that assumes short vectors.

Duncan Murdoch

> My version:
> > R.Version()
> $platform
> [1] "x86_64-w64-mingw32"
>
> $arch
> [1] "x86_64"
>
> $os
> [1] "mingw32"
>
> $system
> [1] "x86_64, mingw32"
>
> $status
> [1] ""
>
> $major
> [1] "3"
>
> $minor
> [1] "4.3"
>
> $year
> [1] "2017"
>
> $month
> [1] "11"
>
> $day
> [1] "30"
>
> $`svn rev`
> [1] "73796"
>
> $language
> [1] "R"
>
> $version.string
> [1] "R version 3.4.3 (2017-11-30)"
>
> $nickname
> [1] "Kite-Eating Tree"
>
> Thank you,
> Colton
>
> Colton Tousey
> Research Associate II
> P: 816.585.0300 E: colton.tousey at kc.frb.org
> FEDERAL RESERVE BANK OF KANSAS CITY
> 1 Memorial Drive * Kansas City, Missouri 64198 * www.kansascityfed.org
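The point about R_len_t versus R_xlen_t can be illustrated outside of R. The following is only a minimal sketch, not the code in io.c: `len_t`, `xlen_t`, `cell_count` and `count_exceeds_int` are hypothetical names, and it assumes a build where R_len_t corresponds to a 32-bit int and R_xlen_t to a 64-bit signed type. It shows that the cast has to widen one operand before the multiplication for the product of the dimensions to be computed without overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for R's types (assumption: 64-bit build). */
typedef int     len_t;   /* plays the role of R_len_t  */
typedef int64_t xlen_t;  /* plays the role of R_xlen_t */

/* Number of cells in an nr x nc matrix.
   One operand is widened first, so the multiplication is done in 64 bits. */
static xlen_t cell_count(len_t nr, len_t nc)
{
    assert(nr >= 0 && nc >= 0);
    return (xlen_t)nr * nc;
}

/* Returns 1 if nr * nc cannot be represented in a plain int, 0 otherwise. */
static int count_exceeds_int(len_t nr, len_t nc)
{
    return cell_count(nr, nc) > (xlen_t)INT_MAX;
}
```

With dimensions such as 70000 x 40000 (2.8 billion cells, the size mentioned in the report), `cell_count` returns the full product, while casting to an int-sized type as in `(R_len_t)nr * nc` leaves the multiplication in int arithmetic.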
Tomas Kalibera
2018-Apr-19 07:30 UTC
[Rd] R Bug: write.table for matrix of more than 2,147,483,648 elements
On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
> Yes, looks like a typo: R_len_t is an int, and that's how nr was declared. It should be R_xlen_t, which is bigger on machines that support big vectors.
>
> I haven't tested the change; there may be something else in that function that assumes short vectors.

Indeed, I think the function won't work for long vectors because of EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be changed, including their signatures.

Tomas
Serguei Sokol
2018-Apr-19 09:47 UTC
[Rd] R Bug: write.table for matrix of more than 2,147,483,648 elements
On 19/04/2018 at 09:30, Tomas Kalibera wrote:
> On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
>> Yes, looks like a typo: R_len_t is an int, and that's how nr was declared. It should be R_xlen_t, which is bigger on machines that support big vectors.
>>
>> I haven't tested the change; there may be something else in that function that assumes short vectors.
>
> Indeed, I think the function won't work for long vectors because of EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be changed, including their signatures

That would be a definite fix, but before such a deep rewrite is undertaken, perhaps the following small fix (in addition to "(R_xlen_t)nr * nc") would be sufficient for cases where nr and nc are in int range but their product can reach the long vector limit:

replace
    tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
                         &strBuf, sdec);
by
    tmp = EncodeElement2(VECTOR_ELT(x, (R_xlen_t)i + j*nr), 0, quote_col[j], qmethod,
                         &strBuf, sdec);

Serguei
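A note on the index expression in the proposed replacement: in C, `(R_xlen_t)i + j*nr` still evaluates `j*nr` in int arithmetic if j and nr are both int, so the product itself can overflow before it is added to the widened i. The sketch below is only an illustration under that assumption (i, j and nr declared as int); `xlen_t` and `cell_offset` are hypothetical names, not part of R. It widens an operand of the multiplication instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for R_xlen_t (assumption: 64-bit signed type). */
typedef int64_t xlen_t;

/* Column-major offset of element (i, j) in a matrix with nr rows.
   j is widened before the multiplication, so j * nr is computed in 64 bits
   and the sum cannot overflow for int-sized i, j and nr. */
static xlen_t cell_offset(int i, int j, int nr)
{
    assert(i >= 0 && j >= 0 && nr > 0 && i < nr);
    return (xlen_t)i + (xlen_t)j * nr;
}
```

For a 70000 x 40000 matrix, the last cell is at `cell_offset(69999, 39999, 70000)`, which is 2799999999 and lies beyond the int range even though each of i, j and nr fits in an int.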