Martin Maechler
2020-Sep-08 08:52 UTC
[Rd] Operations with long altrep vectors cause segfaults on Windows
>>>>> Martin Maechler
>>>>>     on Tue, 8 Sep 2020 10:40:24 +0200 writes:

>>>>> Hugh Parsonage
>>>>>     on Tue, 8 Sep 2020 18:08:11 +1000 writes:

    >> I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):

    >> $> R --vanilla
    >> x <- c(0L, -2e9:2e9)
    >> # > Segmentation fault

    >> Tried to reproduce on Linux but the above worked as expected. Not an
    >> issue merely with the length of the vector; for example, x <-
    >> rep_len(1:10, 1e10) works, though the altrep vector must be long to
    >> reproduce:

    >> x <- c(0L, -1e9:1e9) #ok

    >> Segmentation faults occur with the following too:

    >> x <- (-2e9:2e9) + 1L

    > Your operation would "need" (not in theory, but in practice)
    > to go from altrep to regular vectors.
    > I guess the segfault occurs because of something like this:

    > R asks Windows to hand it a huge amount of memory and Windows replies
    > "ok, here is the memory pointer"
    > and then R tries to write there, but illegally (because
    > Windows should have told R that it does not really have enough
    > memory for that ..).

    > I cannot reproduce the segmentation fault .. but I can confirm
    > there is a bug there that shows for me on Windows but not on
    > Linux:

    > "My" Windows is on a terminal server without too many GB of memory
    > (but then in a version of Windows that recognizes that it cannot
    > get so much memory):

    > ------------------------- Here is some transcript (thanks to
    > using Emacs w/ ESS also on Windows) ------------------

    > R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
    > Copyright (C) 2020 The R Foundation for Statistical Computing
    > Platform: x86_64-w64-mingw32/x64 (64-bit)

    > R is free software and comes with ABSOLUTELY NO WARRANTY.
    > You are welcome to redistribute it under certain conditions.
    > Type 'license()' or 'licence()' for distribution details.

    > R is a collaborative project with many contributors.
    > Type 'contributors()' for more information and 'citation()'
    > on how to cite R or R packages in publications.

    > Type 'demo()' for some demos, 'help()' for on-line help, or
    > 'help.start()' for an HTML browser interface to help.
    > Type 'q()' to quit R.

    > > x <- (-2e9:2e9) + 1L
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
    > > y <- c(0L, -2e9:2e9)
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
    > > Sys.setenv(LANGUAGE="en")
    > > y <- c(0L, -2e9:2e9)
    > Error: cannot allocate vector of size 14.9 Gb
    > > y <- -1e9:4e9
    > > .Internal(inspect(y))
    > @0x00000000195a6808 14 REALSXP g0c0 [REF(65535)]  -1000000000 : -294967296 (compact)
    > > .Machine$integer.max / 1e9
    > [1] 2.147484
    > > y <- -1e6:2.2e9
    > > .Internal(inspect(y))
    > @0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)]  -1000000 : -2094967296 (compact)
    > > y <- -1e6:2e9
    > > .Internal(inspect(y))
    > @0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)]  -1000000 : 2000000000 (compact)
    > >

    > ------------------------- end of transcript -----------------------------------

    > So indeed, no seg.fault, R notices that it can't get 15 GB of
    > memory.

    > But the bug is bad news: We have *silent* integer overflow happening
    > according to what .Internal(inspect(y)) shows...

    > .... less bad news: Probably the bug is only in the 'internal inspect' code
    > where a format specifier is used in C's printf() that does not work
    > correctly on Windows, at least the way it is currently compiled ..

    > On (64-bit) Linux, I get

    > > y <- -1e9:4e9 ; .Internal(inspect(y))
    > @7d86388 14 REALSXP g0c0 [REF(65535)]  -1000000000 : 4000000000 (compact)

    > > y <- c(0L, y)
    > Error: cannot allocate vector of size 37.3 Gb

    > which seems much better ... until I do find a bug, maybe again
    > only in the C code underlying .Internal(inspect(.)) :

    > > y <- -1e9:2e9 ; .Internal(inspect(y))
    > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
    > >

Indeed, the purported "integer overflow" (above) does not
happen.
It is "only" a 'printf' related bug inside .Internal(inspect(.)) on Windows.

*Interestingly*, the above bug I've noticed on (64-bit) Linux
does *not* show on Windows (64-bit), at least not for that case:

On Windows, things are fine as long as they remain (compacted
aka 'ALTREP') INTSXP:

> y <- -1e3:2e9 ;.Internal(inspect(y))
@0x000000000a285648 13 INTSXP g0c0 [REF(65535)]  -1000 : 2000000000 (compact)
> y <- -1e3:2.1e9 ;.Internal(inspect(y))
@0x0000000019925930 13 INTSXP g0c0 [REF(65535)]  -1000 : 2100000000 (compact)

and here, y is correct, just the printing from
.Internal(inspect(y)) is buggy (probably prints the double as an integer):

> y <- -1e3:2.2e9 ; .Internal(inspect(y))
@0x00000000195c0178 14 REALSXP g0c0 [REF(65535)]  -1000 : -2094967296 (compact)
> length(y)
[1] 2200001001
> tail(y)
[1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
> tail(y) - 2.2e9
[1] -5 -4 -3 -2 -1  0
>
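The negative upper bounds in the Windows output (-294967296 for 4e9, -2094967296 for 2.2e9) are what one gets when a value beyond the 32-bit range is reduced to a signed 32-bit integer. A minimal C sketch of that arithmetic follows; the helper name `wrap_to_int32` is hypothetical and is not part of R, and the sketch only illustrates the wrap-around, not the actual code path in `.Internal(inspect(.))`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of R): returns the value a signed 32-bit
 * integer would hold if only the low 32 bits of `value` were kept. */
int64_t wrap_to_int32(double value)
{
    /* Precondition: the value must fit comfortably in a 64-bit integer. */
    assert(value > -9.0e18 && value < 9.0e18);

    int64_t wide = (int64_t) value;   /* exact for whole numbers below 2^53 */
    uint32_t low = (uint32_t) wide;   /* well-defined reduction modulo 2^32 */

    /* Reinterpret the low 32 bits as two's complement by hand, so the
     * result does not depend on implementation-defined narrowing. */
    if (low >= UINT32_C(2147483648))
        return (int64_t) low - INT64_C(4294967296);
    return (int64_t) low;
}
```

With this helper, `wrap_to_int32(2200000000.0)` gives -2094967296 and `wrap_to_int32(4000000000.0)` gives -294967296, matching the two values printed in the Windows transcript, while 2000000000 (below 2^31) passes through unchanged.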
Hugh Parsonage
2020-Sep-08 09:15 UTC
[Rd] Operations with long altrep vectors cause segfaults on Windows
Thanks Martin. On further testing, it seems that the segmentation
fault can only occur when the amount of obtainable memory is
sufficiently high. On my machine (admittedly with other processes
running):

$ R --vanilla --max-mem-size=30G -e "x <- c(0L, -2e9:2e9)"
Segmentation fault

$ R --vanilla --max-mem-size=29G -e "x <- c(0L, -2e9:2e9)"
Error: cannot allocate vector of size 14.9 Gb
Execution halted
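The sizes reported in the "cannot allocate vector" messages in this thread can be checked by hand. The C sketch below does the arithmetic under stated assumptions: 4 bytes per integer element, 8 bytes per double element, the vector header ignored, and "Gb" read as 1024^3 bytes (the reported numbers are consistent with that reading). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of elements in the sequence from:to (step 1, both ends included). */
uint64_t seq_length(int64_t from, int64_t to)
{
    assert(to >= from);
    return (uint64_t) (to - from) + 1;
}

/* Payload size in binary gigabytes for n elements of elt_size bytes each
 * (assumption: the small vector header is ignored). */
double vector_gib(uint64_t n, uint64_t elt_size)
{
    return (double) n * (double) elt_size / (1024.0 * 1024.0 * 1024.0);
}
```

For `c(0L, -2e9:2e9)` the result is an integer vector of 4000000001 + 1 elements, so `vector_gib(seq_length(-2000000000, 2000000000) + 1, 4)` is about 14.9. For `c(0L, -1e9:4e9)`, where the sequence is stored as doubles, `vector_gib(seq_length(-1000000000, 4000000000) + 1, 8)` is about 37.25, which rounds to the 37.3 Gb seen on Linux.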
Luke Tierney
2020-Sep-08 14:32 UTC
[Rd] [External] Re: Operations with long altrep vectors cause segfaults on Windows
On Tue, 8 Sep 2020, Hugh Parsonage wrote:

> Thanks Martin. On further testing, it seems that the segmentation
> fault can only occur when the amount of obtainable memory is
> sufficiently high. On my machine (admittedly with other processes
> running):
>
> $ R --vanilla --max-mem-size=30G -e "x <- c(0L, -2e9:2e9)"
> Segmentation fault
>
> $ R --vanilla --max-mem-size=29G -e "x <- c(0L, -2e9:2e9)"
> Error: cannot allocate vector of size 14.9 Gb
> Execution halted

Unfortunately I don't have access to a Windows machine with enough
memory to get to the point of failure. If you have rtools and gdb
installed, can you run in gdb and see where the segfault is happening?

Best,

luke

> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
Luke Tierney
2020-Sep-08 14:42 UTC
[Rd] [External] Re: Operations with long altrep vectors cause segfaults on Windows
On Tue, 8 Sep 2020, Martin Maechler wrote:

> and here, y is correct, just the printing from
> .Internal(inspect(y)) is buggy (probably prints the double as an integer):

It's a '%ld' that probably needs to be '%lld' for Windows. Will fix
sometime soon.

Best,

luke

> > y <- -1e3:2.2e9 ; .Internal(inspect(y))
> @0x00000000195c0178 14 REALSXP g0c0 [REF(65535)]  -1000 : -2094967296 (compact)
> > length(y)
> [1] 2200001001
> > tail(y)
> [1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
> > tail(y) - 2.2e9
> [1] -5 -4 -3 -2 -1  0
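As background to the '%ld' versus '%lld' remark: on 64-bit Windows `long` is 32 bits wide, whereas on 64-bit Linux it is 64 bits, so a value such as 2200000000 only fits in a `long` on the latter. `long long` is at least 64 bits everywhere. The sketch below formats a range the portable way; `format_range` is a hypothetical helper for illustration and is not the code used by `.Internal(inspect(.))`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "from : to" into buf using a 64-bit format.
 * Returns the number of characters snprintf would have written. */
int format_range(char *buf, size_t size, double from, double to)
{
    assert(buf != NULL && size > 0);

    /* Cast to long long and print with %lld: the argument type and the
     * format specifier agree on every platform, unlike long with %ld,
     * which is only 32 bits wide on 64-bit Windows. */
    return snprintf(buf, size, "%lld : %lld", (long long) from, (long long) to);
}
```

Called with -1000 and 2200000000, this writes `-1000 : 2200000000` into the buffer. An alternative with the same effect is `int64_t` together with the `PRId64` macro from `<inttypes.h>`.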
Martin Maechler
2020-Sep-08 15:57 UTC
[Rd] [External] Re: Operations with long altrep vectors cause segfaults on Windows
>>>>> luke-tierney
>>>>>     on Tue, 8 Sep 2020 09:42:43 -0500 (CDT) writes:

    > On Tue, 8 Sep 2020, Martin Maechler wrote:

    >> > y <- -1e9:2e9 ; .Internal(inspect(y))
    >> @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139

    >> and here, y is correct, just the printing from
    >> .Internal(inspect(y)) is buggy (probably prints the double as an integer):

    > It's a '%ld' that probably needs to be '%lld' for Windows. Will fix
    > sometime soon.

    > Best,
    > luke

I had guessed at something like that .. but "interestingly" it
was quite different: Our code uses

    int n = LENGTH(.);

and the "long vectors not supported yet" error message above was
triggered there.

I've committed a fix to both R-devel and R-patched (and added a
regression test), but I still wonder why the above error had not
triggered on Windows...

Martin
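For context on why `int n = LENGTH(.);` fails here: in R's C API, `LENGTH()` returns an `int` and signals an error for long vectors, while `XLENGTH()` returns an `R_xlen_t` that can hold lengths beyond 2^31 - 1. The stand-alone sketch below mimics that check outside of R. The type `xlen_t` and the function `length_as_int` are hypothetical stand-ins, and this is not the fix that was committed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for R's R_xlen_t (assumption: a 64-bit signed length type). */
typedef long long xlen_t;

/* Hypothetical analogue of asking for an int length:
 * returns 0 and stores the length if it fits in an int,
 * returns -1 (and leaves *out untouched) for a long vector. */
int length_as_int(xlen_t n, int *out)
{
    assert(out != NULL);

    if (n < 0 || n > INT_MAX)
        return -1;   /* long vector: an int cannot represent this length */

    *out = (int) n;
    return 0;
}
```

The sequence `-1e9:2e9` has 3000000001 elements, so `length_as_int(3000000001LL, &n)` returns -1, whereas code that keeps the length in the wide type throughout never needs the narrowing step.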