Hugh Parsonage
2020-Sep-08 08:08 UTC
[Rd] Operations with long altrep vectors cause segfaults on Windows
I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):

$> R --vanilla
x <- c(0L, -2e9:2e9)
# > Segmentation fault

Tried to reproduce on Linux but the above worked as expected. Not an
issue merely with the length of the vector; for example,

x <- rep_len(1:10, 1e10)

works, though the altrep vector must be long to reproduce:

x <- c(0L, -1e9:1e9) #ok

Segmentation faults occur with the following too:

x <- (-2e9:2e9) + 1L
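The difference between the failing range and the one marked "#ok" can be checked without allocating anything. The following is a minimal sketch, assuming only that a "long" vector means one with more than .Machine$integer.max elements; `range_length` and `is_long` are hypothetical helper names, not R functions.

```r
# Number of elements in from:to, computed in double arithmetic so that
# no vector is created (range_length is a hypothetical helper).
range_length <- function(from, to) abs(to - from) + 1

n_fail <- range_length(-2e9, 2e9)  # range from the failing example
n_ok   <- range_length(-1e9, 1e9)  # range from the example marked "#ok"

# A vector counts as "long" here when its length exceeds 2^31 - 1.
is_long <- function(n) n > .Machine$integer.max

is_long(n_fail)  # TRUE: 4000000001 elements
is_long(n_ok)    # FALSE: 2000000001 elements
```

This only shows that the failing sequence is a long vector while the working one is not; it does not explain the segfault itself.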
Martin Maechler
2020-Sep-08 08:40 UTC
[Rd] Operations with long altrep vectors cause segfaults on Windows
>>>>> Hugh Parsonage
>>>>>     on Tue, 8 Sep 2020 18:08:11 +1000 writes:

    > I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):
    > $> R --vanilla
    > x <- c(0L, -2e9:2e9)
    > # > Segmentation fault

    > Tried to reproduce on Linux but the above worked as expected. Not an
    > issue merely with the length of the vector; for example, x <-
    > rep_len(1:10, 1e10) works, though the altrep vector must be long to
    > reproduce:
    > x <- c(0L, -1e9:1e9) #ok

    > Segmentation faults occur with the following too:
    > x <- (-2e9:2e9) + 1L

Your operation would "need" (not in theory, but in practice)
to go from altrep to regular vectors.
I guess the segfault occurs because of something like this:
R asks Windows to hand it a huge amount of memory and Windows replies
"ok, here is the memory pointer"
and then R tries to write to there, but illegally (because
Windows should have told R that it does not really have enough
memory for that ..).

I cannot reproduce the segmentation fault .. but I can confirm
there is a bug there that shows for me on Windows but not on
Linux:

"My" Windows is on a terminalserver not with too many GB of memory
(but then in a version of Windows that recognizes that it cannot
get so much memory):

------------------------- Here some transcript (thanks to
using Emacs w/ ESS also on Windows) ------------------

R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R ist freie Software und kommt OHNE JEGLICHE GARANTIE.
Sie sind eingeladen, es unter bestimmten Bedingungen weiter zu verbreiten.
Tippen Sie 'license()' or 'licence()' für Details dazu.

R ist ein Gemeinschaftsprojekt mit vielen Beitragenden.
Tippen Sie 'contributors()' für mehr Information und 'citation()',
um zu erfahren, wie R oder R packages in Publikationen zitiert werden können.

Tippen Sie 'demo()' für einige Demos, 'help()' für on-line Hilfe, oder
'help.start()' für eine HTML Browserschnittstelle zur Hilfe.
Tippen Sie 'q()', um R zu verlassen.

> x <- (-2e9:2e9) + 1L
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> y <- c(0L, -2e9:2e9)
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> Sys.setenv(LANGUAGE="en")
> y <- c(0L, -2e9:2e9)
Error: cannot allocate vector of size 14.9 Gb
> y <- -1e9:4e9
> .Internal(inspect(y))
@0x00000000195a6808 14 REALSXP g0c0 [REF(65535)] -1000000000 : -294967296 (compact)
> .Machine$integer.max / 1e9
[1] 2.147484
> y <- -1e6:2.2e9
> .Internal(inspect(y))
@0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)] -1000000 : -2094967296 (compact)
> y <- -1e6:2e9
> .Internal(inspect(y))
@0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)] -1000000 : 2000000000 (compact)
>

------------------------- end of transcript -----------------------------------

So indeed, no seg.fault, R notices that it can't get 15 GB of
memory.

But the bug is bad news: We have *silent* integer overflow happening
according to what .Internal(inspect(y)) shows...

.... less bad news: Probably the bug is only in the 'internal inspect' code
where a format specifier is used in C's printf() that does not work
correctly on Windows, at least the way it is currently compiled ..

On (64-bit) Linux, I get

> y <- -1e9:4e9 ; .Internal(inspect(y))
@7d86388 14 REALSXP g0c0 [REF(65535)] -1000000000 : 4000000000 (compact)
> y <- c(0L, y)
Error: cannot allocate vector of size 37.3 Gb

which seems much better ... until I do find a bug, maybe again
only in the C code underlying .Internal(inspect(.)) :

> y <- -1e9:2e9 ; .Internal(inspect(y))
@7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
>
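The sizes in the two "cannot allocate" messages match what a fully materialized (non-altrep) result would need. A rough sketch of that arithmetic, assuming 4 bytes per integer element and 8 bytes per double element and ignoring the small vector header; `vec_gib` is a hypothetical helper name:

```r
# Approximate size in GiB (which R's error message labels "Gb") of a
# regular vector with n elements; vec_gib is a hypothetical helper.
vec_gib <- function(n, bytes_per_elt) n * bytes_per_elt / 2^30

# c(0L, -2e9:2e9): integer result with 4e9 + 2 elements
size_int <- round(vec_gib(4e9 + 2, 4), 1)  # 14.9

# c(0L, -1e9:4e9): double result with 5e9 + 2 elements
size_dbl <- round(vec_gib(5e9 + 2, 8), 1)  # 37.3
```

Under these assumptions the compact sequences themselves cost almost nothing; it is the result of c() or of the addition that has to be allocated in full.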
Martin Maechler
2020-Sep-08 08:52 UTC
[Rd] Operations with long altrep vectors cause segfaults on Windows
>>>>> Martin Maechler
>>>>>     on Tue, 8 Sep 2020 10:40:24 +0200 writes:

    [...]

    > So indeed, no seg.fault, R notices that it can't get 15 GB of
    > memory.

    > But the bug is bad news: We have *silent* integer overflow happening
    > according to what .Internal(inspect(y)) shows...

    > .... less bad news: Probably the bug is only in the 'internal inspect' code
    > where a format specifier is used in C's printf() that does not work
    > correctly on Windows, at least the way it is currently compiled ..

    > On (64-bit) Linux, I get

    >> y <- -1e9:4e9 ; .Internal(inspect(y))
    > @7d86388 14 REALSXP g0c0 [REF(65535)] -1000000000 : 4000000000 (compact)
    >> y <- c(0L, y)
    > Error: cannot allocate vector of size 37.3 Gb

    > which seems much better ... until I do find a bug, maybe again
    > only in the C code underlying .Internal(inspect(.)) :

    >> y <- -1e9:2e9 ; .Internal(inspect(y))
    > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
    >>

Indeed, the purported "integer overflow" (above) does not happen.
It is "only" a 'printf' related bug inside .Internal(inspect(.)) on Windows.

*Interestingly*, the above bug I've noticed on (64-bit) Linux
does *not* show on Windows (64-bit), at least not for that case:

On Windows, things are fine as long as they remain (compacted aka 'ALTREP') INTSXP:

> y <- -1e3:2e9 ;.Internal(inspect(y))
@0x000000000a285648 13 INTSXP g0c0 [REF(65535)] -1000 : 2000000000 (compact)
> y <- -1e3:2.1e9 ;.Internal(inspect(y))
@0x0000000019925930 13 INTSXP g0c0 [REF(65535)] -1000 : 2100000000 (compact)

and here, y is correct, just the printing from .Internal(inspect(y)) is buggy
(probably prints the double as an integer):

> y <- -1e3:2.2e9 ; .Internal(inspect(y))
@0x00000000195c0178 14 REALSXP g0c0 [REF(65535)] -1000 : -2094967296 (compact)
> length(y)
[1] 2200001001
> tail(y)
[1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
> tail(y) - 2.2e9
[1] -5 -4 -3 -2 -1 0
>
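The wrong upper bounds printed on Windows are consistent with the "prints the double as an integer" guess: wrapping 4e9 and 2.2e9 into the signed 32-bit range gives exactly the values shown by .Internal(inspect(.)). A minimal sketch of that arithmetic follows. `wrap_int32` is a hypothetical helper, not part of R, and it only mimics a 32-bit wrap-around; it does not identify which line of the C code is at fault.

```r
# Map a whole-number double onto the signed 32-bit range [-2^31, 2^31 - 1],
# mimicking what a 32-bit integer would hold (hypothetical helper).
wrap_int32 <- function(x) {
  r <- x %% 2^32                   # reduce to [0, 2^32)
  ifelse(r >= 2^31, r - 2^32, r)   # upper half becomes negative
}

wrap_int32(4e9)    # equals -294967296, as printed for -1e9:4e9
wrap_int32(2.2e9)  # equals -2094967296, as printed for -1e6:2.2e9
wrap_int32(2e9)    # equals 2000000000, unchanged since it fits in 32 bits
wrap_int32(-1e9)   # equals -1000000000, unchanged
```

Bounds that fit into 32 bits come out unchanged, which agrees with the INTSXP cases in the transcripts printing correctly.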