noclue_
2010-Aug-05 23:40 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
I have a 64-bit Windows box:
Intel Xeon CPU E7340 @ 2.4GHz, 31.9GB of RAM
I have R 2.11.1 (64-bit) running on it.

My csv data is 3.6 GB (with about 15 million obs, 120 variables).

I have successfully imported the data above into R. No problem.

Now I am trying to run 'rpart' on my data, but I got the following error:

Error: cannot allocate vector of size 53.5 Mb
In addition: Warning messages:
1: In lapply(x, "is.na") :
  Reached total allocation of 32764Mb: see help(memory.size)
2: In lapply(x, "is.na") :
  Reached total allocation of 32764Mb: see help(memory.size)
3: In lapply(x, "is.na") :
  Reached total allocation of 32764Mb: see help(memory.size)
4: In lapply(x, "is.na") :
  Reached total allocation of 32764Mb: see help(memory.size)

Can anybody give me a hint on how to solve this? Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/64-bit-R-on-64-bit-Windows-box-Still-not-enough-memory-tp2315742p2315742.html
Sent from the R help mailing list archive at Nabble.com.
David Winsemius
2010-Aug-06 02:36 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
On Aug 5, 2010, at 7:40 PM, noclue_ wrote:

> I have a 64-bit windows box -
> Intel Xeon CPU E7340 @ 2.4GHz 31.9GB of RAM
> I have R 2.11.1 (64bit) running on it.

Dear noclue_;

What does this return?

.Machine$sizeof.pointer

> My csv data is 3.6 GB (with about 15 million obs, 120 variables.)

On my 64-bit setup with 24GB of RAM I can comfortably work with a dataset with around the same number of columns but (only) 4.5 million rows. Working with this size data.frame in 18GB of RAM was somewhat uncomfortable, because it would often "roll over" into virtual memory and then modeling calls took forever... well, twenty minutes anyway.

I think you may be under-estimating the space requirements when working with larger objects. Numerics take 8 bytes, and objects often need to be copied, so space consumption can quickly double or triple.

> 8 * 15000000 * 120
[1] 1.44e+10

So that's about 13.4 GB just to hold the object, not to do anything useful with it.

-- David

> ------------------------------------------------
> I have successfully imported the data above into R. No problem.
>
> Now I am trying to run 'rpart' on my data. But I got the following error:
>
> Error: cannot allocate vector of size 53.5 Mb
> In addition: Warning messages:
> 1: In lapply(x, "is.na") :
>   Reached total allocation of 32764Mb: see help(memory.size)
> 2: In lapply(x, "is.na") :
>   Reached total allocation of 32764Mb: see help(memory.size)
> 3: In lapply(x, "is.na") :
>   Reached total allocation of 32764Mb: see help(memory.size)
> 4: In lapply(x, "is.na") :
>   Reached total allocation of 32764Mb: see help(memory.size)
>
> Can anybody give me a hint on how to solve this?

Post better details?:

?sessionInfo

> Thanks!

David Winsemius, MD
West Hartford, CT
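[The back-of-the-envelope figure above (8 bytes x rows x columns) can also be checked empirically by reading a small slice of the file and extrapolating. The sketch below assumes the data lives in a file called "mydata.csv" -- a placeholder name, not from the post:]

```r
## Read a small sample of the csv, measure its in-memory footprint,
## and scale up to the full row count from the post (15 million rows).
## "mydata.csv" is a placeholder file name.
sample_df <- read.csv("mydata.csv", nrows = 10000)
bytes_per_row <- as.numeric(object.size(sample_df)) / nrow(sample_df)
bytes_per_row * 15e6 / 1024^3   # rough GB needed to hold all 15M rows
```

This tends to give a slightly higher number than the 8-byte rule of thumb, since object.size() also counts factor levels, attributes, and per-column overhead.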
Philipp Pagel
2010-Aug-06 07:43 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
On Thu, Aug 05, 2010 at 04:40:48PM -0700, noclue_ wrote:

> I have a 64-bit windows box -
> Intel Xeon CPU E7340 @ 2.4GHz 31.9GB of RAM
> I have R 2.11.1 (64bit) running on it.
>
> My csv data is 3.6 GB (with about 15 million obs, 120 variables.)

Here is my guess: your variables are mostly numeric but only given with two significant digits in the csv file:

A    B    ...
0.0  12.0
1.3  0.4
2.3  1.1

So that would make 15e6 * 120 * 3 / 1024^3 = 5.0 GB. You have 3.6 GB -- but that's close enough.

If you read that into R, each number is represented as a double -- i.e. 8 bytes. Thus the entire data frame takes 15e6 * 120 * 8 / 1024^3 = 13.4 GB.

With almost half of your memory taken, things can get problematic. Once you start actually working with the data you'll have to allow for a lot more space, because copies will probably be made in the process.

So you may have to put your data into a database and process it in pieces. Or use sqldf or bigmemory or something like that.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
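[Two of the workarounds suggested above -- fitting on a subsample, and using sqldf to avoid loading all 120 columns -- can be sketched roughly as follows. The file and variable names (mydata.csv, y, x1, x2, big_df) are placeholders, not from the post:]

```r
## Option 1: fit rpart on a random subsample instead of all 15M rows.
## big_df stands for the already-imported data frame.
library(rpart)
idx <- sample(nrow(big_df), 1e6)
fit <- rpart(y ~ ., data = big_df[idx, ])

## Option 2: let sqldf pull only the needed columns out of the csv,
## so the full 120-column frame never has to sit in memory at once.
## read.csv.sql stages the file in a temporary SQLite database;
## the table is referred to as "file" in the query.
library(sqldf)
sub <- read.csv.sql("mydata.csv", sql = "select y, x1, x2 from file")
```

[For Option 1, a tree grown on one million rows will usually find nearly the same splits as one grown on fifteen million, at a fraction of the memory cost.]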
noclue_
2010-Aug-06 16:03 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
> .Machine$sizeof.pointer
[1] 4

--
View this message in context: http://r.789695.n4.nabble.com/64-bit-R-on-64-bit-Windows-box-Still-not-enough-memory-tp2315742p2316493.html
Sent from the R help mailing list archive at Nabble.com.
David Winsemius
2010-Aug-06 18:48 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
You are running 32-bit R. Read the R for Windows FAQ.

On Aug 6, 2010, at 12:03 PM, noclue_ wrote:

>> .Machine$sizeof.pointer
> [1] 4

David Winsemius, MD
West Hartford, CT
Philipp Pagel
2010-Aug-06 19:12 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
On Fri, Aug 06, 2010 at 09:03:09AM -0700, noclue_ wrote:

>> .Machine$sizeof.pointer
> [1] 4

So it appears you are not on 64-bit. Excerpt from the help page:

[...]
sizeof.pointer: the number of bytes in a C 'SEXP' type. Will be '4' on 32-bit builds and '8' on 64-bit builds of R.
[...]

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
noclue_
2010-Aug-07 01:18 UTC
[R] 64-bit R on 64-bit Windows box... Still not enough memory?!
Sorry -- I had opened a previous R version. Here is my 64-bit R session:

> .Machine$sizeof.pointer
[1] 8

--
View this message in context: http://r.789695.n4.nabble.com/64-bit-R-on-64-bit-Windows-box-Still-not-enough-memory-tp2315742p2316970.html
Sent from the R help mailing list archive at Nabble.com.