Valerie Cavett
2018-Jun-12 12:25 UTC
[R] R 3.5.0, vector memory exhausted error on readBin
Thanks so much for taking a look at this.

Before setting a new value, I opened a fresh session of R and checked whether any value was set for R_MAX_VSIZE. There was not, so we'll assume the default as you described. Next, I tried to set a value with

    Sys.setenv("R_MAX_VSIZE" = 8e9)

When the system environment is checked again, there is now a value of

    R_MAX_VSIZE    8e+09

Unfortunately, when I try to read in a small binary file, I still encounter the same error.

I restored R 3.3 and checked the system environment to confirm that there was no R_MAX_VSIZE configured in the startup file, then tested readBin as follows:

    hertz <- 6000
    bin.read = file("20180611_A4", "rb")
    datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")

datavals is a large integer vector with 6046880 elements (23.1 Mb). If I then set R_MAX_VSIZE to 8e9, this also works fine, since the file is not really that large.

However, if I switch back to the newest R version (3.5.0), I encounter the same error:

    > datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")
    Error: vector memory exhausted (limit reached?)

I'm at a loss for why this is an issue (same machine) in R 3.5.0, but not in 3.3.2 or 3.4.4. If you have any further suggestions, I'd greatly appreciate them.

From: luke-tierney at uiowa.edu <luke-tierney at uiowa.edu>
Sent: Tuesday, June 12, 2018 5:26:37 AM
To: Valerie Cavett
Cc: r-help at R-project.org
Subject: Re: [R] R 3.5.0, vector memory exhausted error on readBin

This item in NEWS explains the change:

    * The environment variable R_MAX_VSIZE can now be used to specify
      the maximal vector heap size. On macOS, unless specified by this
      environment variable, the maximal vector heap size is set to the
      maximum of 16GB and the available physical memory. This is to
      avoid having the R process killed when macOS over-commits memory.

You can set R_MAX_VSIZE to a larger value but you should do some
experimenting to decide on a safe value for your system. Mac OS is
quite good at using virtual memory up to a point but then gets very
bad. For my 4 GB mac numeric(8e9) works but numeric(9e9) causes R to
be killed, so a setting of around 60GB _might_ be safe.

File size probably doesn't matter in your example since you are
setting a large value for n - I can't tell how large since you didn't
provide your value of 'hertz'.

Best,

luke

On Mon, 11 Jun 2018, Valerie Cavett wrote:

> I've been reading in binary data collected via LabView for a project, and after upgrading to R 3.5.0, the code returns an error indicating that the vector memory is exhausted. I'm happy to provide a sample binary file; even ones that are quite small (12 MB) generate this error. (I wasn't sure whether a binary file attached to this email would trigger a spam filter.)
>
>     bin.read = file(files[i], "rb")
>     datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")
>
>     Error: vector memory exhausted (limit reached?)
>
> sessionInfo()
> R version 3.5.0 (2018-04-23)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS Sierra 10.12.6
>
> This does not happen in R 3.4 (R version 3.4.4 (2018-03-15) -- "Someone to Lean On") - the vector is created and populated by the binary file values without issue, even at a 1GB binary file size.
>
> Other files that are read in as csv files, even at 1GB, load correctly in 3.5, so I assume that this is a function of a vector being explicitly defined/changed in some way from 3.4 to 3.5.
>
> Any help, suggestions or workarounds are greatly appreciated!
> Val

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                    Phone:  319-335-3386
Department of Statistics and          Fax:    319-335-3017
    Actuarial Science
241 Schaeffer Hall                    email:  luke-tierney at uiowa.edu
Iowa City, IA 52242                   WWW:    http://www.stat.uiowa.edu
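A quick back-of-the-envelope check (a sketch, assuming the hertz value of 6000 used above) shows how much memory this n asks readBin to set aside, and matches the 643.7 Gb figure mentioned in the reply below:

    ## Sketch: memory requested by the readBin call above, assuming hertz = 6000.
    ## readBin stores the values as R integers (4 bytes each) in memory,
    ## regardless of the 2-byte size they occupy on disk.
    hertz <- 6000
    n     <- 8 * hertz * 60 * 60000   # 172,800,000,000 elements requested
    n * 4 / 2^30                      # ~643.7 GiB of vector heap, far above the
                                      # max(16GB, physical RAM) default cap in 3.5.0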
luke-tierney at uiowa.edu
2018-Jun-12 14:14 UTC
[R] R 3.5.0, vector memory exhausted error on readBin
The environment variable R_MAX_VSIZE is read at start-up, so it needs to be set outside R. If you are starting R from a shell you can use

    env R_MAX_VSIZE=700Gb R

If you use a GUI you might need to set the variable in another way.

Here is a reproducible version of your example:

    hertz <- 6000
    binfile <- tempfile()
    writeBin(1L, binfile, size = 2)
    v <- readBin(binfile, integer(), size = 2, n = 8*hertz*60*60000)
    unlink(binfile)

With the limit raised to 700Gb or more this will work in R 3.5.0, but you lose the protection of the lower default setting. You need a value that high because your 'n' value is asking readBin to allocate a buffer of 643.7 Gb. Mac OS lets you allocate this much address space as long as you don't try to use all of it (this is memory over-commitment). Running this example on a Linux system with 128Gb of memory produces

    Error: cannot allocate vector of size 643.7 Gb

I suspect this will fail on pretty much any Windows system as well.

My recommendation would be to figure out a lower upper bound on the number of elements to read, maybe using file.size, and use that for 'n' in your readBin call. That will allow your code to be more portable and avoid the risks of removing the allocation protection.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                    Phone:  319-335-3386
Department of Statistics and          Fax:    319-335-3017
    Actuarial Science
241 Schaeffer Hall                    email:  luke-tierney at uiowa.edu
Iowa City, IA 52242                   WWW:    http://www.stat.uiowa.edu
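A minimal sketch of the file.size-based bound suggested above, assuming the file holds nothing but the 2-byte little-endian integers being read (the file name is the one used earlier in the thread):

    ## Bound n by what the file can actually contain:
    ## bytes in the file divided by 2 bytes per stored value.
    path  <- "20180611_A4"
    n_max <- file.size(path) %/% 2

    con      <- file(path, "rb")
    datavals <- readBin(con, integer(), size = 2, n = n_max, endian = "little")
    close(con)

This keeps the allocation proportional to the file size, so the default R_MAX_VSIZE protection can stay in place.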
Valerie Cavett
2018-Jun-12 19:02 UTC
[R] R 3.5.0, vector memory exhausted error on readBin
Ah - I see the problem - thanks so much for the clarification!

Just in case anyone else is using RStudio on a Mac and runs into this issue: I ended up following the instructions from http://btibert3.github.io/2015/12/08/Environment-Variables-in-Rstudio-on-Mac.html and added the following line to the .Renviron file:

    R_MAX_VSIZE=700Gb

On restart (R 3.5.0), this did the trick and the files read normally.

Thanks again for all the assistance!
Val
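If the .Renviron approach above is used, a quick sanity check after restarting R (a small addition, not shown in the thread) is to confirm the variable is visible in the new session:

    ## Should return "700Gb" if the .Renviron line was picked up at start-up.
    Sys.getenv("R_MAX_VSIZE")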