Hi all,

I am doing some calculations with very large dimensions. I need to create a matrix with three columns and a very large number of rows (3195*1290*495*35*35*35*15 = 1.312083e+15) in order to store the results of a calculation from a for loop. R does not allow me to create such a matrix because of the large dimension (see below). Is there a way around this?

Thanks very much!!
Hanna

> matrix(0, 3195*1290*495*35*35*35*15, 3)
Error in matrix(0, 3195 * 1290 * 495 * 35 * 35 * 35 * 15, 3) :
  invalid 'nrow' value (too large or NA)
In addition: Warning message:
In matrix(0, 3195 * 1290 * 495 * 35 * 35 * 35 * 15, 3) :
  NAs introduced by coercion
Hey Hanna,

nrow and ncol in matrix() are integer-based, for the moment at least; accordingly they have a maximum value (.Machine$integer.max, i.e. 2147483647). (3195*1290*495*35*35*35*15) is larger than an integer can hold. You can test this with:

str((3195*1290*495*35*35*35*15))

which shows that it is stored as a numeric value. And if you try:

as.integer((3195*1290*495*35*35*35*15))

you'll get an NA, because the value is too large for an integer to hold.

You could try the "bigmemory" package, which is designed to handle very large matrices (and other datatypes), but I believe that handling is about making sure you can store the thing, by keeping it in a file if necessary. I'm not sure whether it allows longs (which can store much larger values) for nrow, ncol and indexing generally.

So it may be that, for now, you're out of luck, I'm afraid :(.

On 24 January 2016 at 11:46, li li <hannah.hlx at gmail.com> wrote:
> I need to create a matrix with three columns and a very large number of rows
> (3195*1290*495*35*35*35*15=1.312083e+15)
> R does not allow me to create such a matrix because of the large dimension
> (see below). Is there a way to go around this?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Oliver Keyes
Count Logula
Wikimedia Foundation
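A minimal sketch of the integer limit described in this reply, using only base R. The variable names (n_rows, int_max, n_int) are illustrative and do not come from the thread.

```r
# Requested number of rows; numeric literals are doubles, so this product
# is computed and stored as a double, not as an integer.
n_rows <- 3195 * 1290 * 495 * 35 * 35 * 35 * 15

# Largest value an R integer can hold (2^31 - 1).
int_max <- .Machine$integer.max

is.double(n_rows)   # the product is a double
n_rows > int_max    # it exceeds the integer range

# Coercing to integer yields NA (with a warning), which is the coercion
# that matrix() reports for 'nrow'. The warning is silenced here on purpose.
n_int <- suppressWarnings(as.integer(n_rows))
is.na(n_int)
```

Running this before calling matrix() is a quick way to check whether a requested dimension fits in the integer range at all.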
FYI, the matrix you tried to allocate would hold (3195*1290*495*35*35*35*15) * 3 = 3.936248e+15 values. Each value would occupy 8 bytes of memory (for the double data type). In other words, to keep this matrix in memory you would need a computer with at least 3.148998e+16 bytes of RAM, i.e. roughly 29327331 GiB = 28640 TiB = 28 PiB. Storing such a large matrix even on file is not possible.

So you need to figure out how to approach your original problem in a different way.

/Henrik

On Sun, Jan 24, 2016 at 8:46 AM, li li <hannah.hlx at gmail.com> wrote:
> I need to create a matrix with three columns and a very large number of rows
> (3195*1290*495*35*35*35*15=1.312083e+15)
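The size estimate in this reply can be reproduced with a few lines of base R. This is only a sketch of the arithmetic; the variable names are made up for illustration, and the 8 bytes per value assumes a matrix of doubles, as stated in the message.

```r
# Rows requested in the original post, times three columns.
n_rows <- 3195 * 1290 * 495 * 35 * 35 * 35 * 15
n_vals <- n_rows * 3

# Each double occupies 8 bytes.
bytes <- n_vals * 8

# Convert to binary units (powers of 1024).
gib <- bytes / 2^30
tib <- bytes / 2^40
pib <- bytes / 2^50

format(c(values = n_vals, bytes = bytes), digits = 7)
round(c(GiB = gib, TiB = tib, PiB = pib))
```

The rounded figures match the ones quoted in the message (29327331 GiB, 28640 TiB, 28 PiB).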
On Sun, Jan 24, 2016 at 1:29 PM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
> 28 PiB. Storing such a large matrix even on file is not possible.

The ads for Amazon Redshift say it is possible. E.g.,

  Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Start small for $0.25 per hour with no commitments and scale to petabytes for $1,000 per terabyte per year, less than a tenth the cost of traditional solutions. Customers typically see 3x compression, reducing their costs to $333 per uncompressed terabyte per year.

Cost may be an issue:

  28 petabytes * 1024 terabytes/petabyte * $333/terabyte/year ~= $9.5 million/year, or $26 thousand/day.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
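The cost estimate in this message is simple arithmetic and can be checked in R. This is a sketch only: the variable names are illustrative, the $333 figure is the advertised price quoted in the message rather than a current one, and the calculation treats terabytes and tebibytes as interchangeable, as the thread does.

```r
# Size from the thread, in (binary) petabytes.
size_pb <- 28

# Conversion factor and the advertised price quoted in the message.
tb_per_pb      <- 1024
usd_per_tb_yr  <- 333

# Yearly and daily cost at that price.
usd_per_year <- size_pb * tb_per_pb * usd_per_tb_yr
usd_per_day  <- usd_per_year / 365

format(c(per_year = usd_per_year, per_day = round(usd_per_day)), big.mark = ",")
```

With these inputs the yearly figure comes to about $9.5 million and the daily figure to about $26 thousand, in line with the message.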