Dear All,

I need to perform an SVD on a very large data matrix, of dimension ~500,000 x 1,000, and I am looking for an efficient algorithm that can perform an approximate (partial) SVD to extract on the order of the top 50 right and left singular vectors.

I would be very grateful for any advice on which R packages are available to perform such a task, what the RAM requirement is, and what the state of the art is in terms of numerical algorithms and programming language for this task.

With many thanks in advance,
Andy Cooper
No answer, but the first obvious question: is the matrix sparse? The next obvious question: what are your RAM, OS, etc.? (Reply to the list, as I can't help further.)

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics

On Mon, Apr 8, 2013 at 7:44 AM, Andy Cooper <andy_cooper83 at yahoo.co.uk> wrote:
> I need to perform an SVD on a very large data matrix, of dimension ~500,000 x 1,000 [...]
Hi Andy,

On Mon, Apr 8, 2013 at 7:44 AM, Andy Cooper <andy_cooper83@yahoo.co.uk> wrote:
> I need to perform an SVD on a very large data matrix, of dimension ~500,000 x 1,000, and I am looking
> for an efficient algorithm that can perform an approximate (partial) SVD to extract on the order of the top 50
> right and left singular vectors.

Scanning through the results after googling for "cran big svd" suggests that the irlba package might be useful for you:

http://cran.r-project.org/web/packages/irlba/

The first sentence of its vignette looks quite promising:

"The irlba package provides a fast way to compute partial singular value decompositions (SVD) of large matrices ..."

HTH,
-steve

--
Steve Lianoglou
Defender of The Thesis
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
On 08-04-2013, at 16:44, Andy Cooper <andy_cooper83 at yahoo.co.uk> wrote:
> Would be very grateful for any advice on what R-packages are available to perform such a task [...]

I found this using package sos: I ran findFn("svd") and scrolled through the list for something relevant.

Have a look at package irlba. According to its documentation, it can work with dense matrices and with sparse matrices as provided by package Matrix.

Berend
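
On the RAM question from the original post, which nobody in the thread answered with numbers: a rough lower bound can be worked out from the dimensions alone. The sketch below is only a back-of-the-envelope estimate, not a statement about what irlba or any other package actually allocates. It assumes 8-byte doubles and 4-byte integer indices, and models sparse storage as compressed sparse column (one double and one row index per nonzero, plus one column pointer per column plus one). The function names and the 1% density are made up for illustration; the thread never says whether the matrix is sparse.

```python
# Back-of-the-envelope storage estimates for the matrix in the original post.
# Assumptions (not from the thread): 8-byte doubles, 4-byte integer indices,
# compressed sparse column (CSC) layout for the sparse case.
BYTES_PER_DOUBLE = 8
BYTES_PER_INT = 4


def dense_bytes(nrow, ncol):
    """Bytes needed to hold a dense double-precision nrow x ncol matrix."""
    return nrow * ncol * BYTES_PER_DOUBLE


def csc_bytes(nrow, ncol, density):
    """Bytes for a CSC sparse matrix with the given fraction of nonzeros."""
    nnz = round(nrow * ncol * density)
    values_and_row_indices = nnz * (BYTES_PER_DOUBLE + BYTES_PER_INT)
    column_pointers = (ncol + 1) * BYTES_PER_INT
    return values_and_row_indices + column_pointers


def partial_svd_bytes(nrow, ncol, k):
    """Bytes for the result alone: U (nrow x k), V (ncol x k), k singular values."""
    return (nrow * k + ncol * k + k) * BYTES_PER_DOUBLE


if __name__ == "__main__":
    nrow, ncol, k = 500_000, 1_000, 50  # sizes stated in the original post
    hypothetical_density = 0.01         # illustration only; not from the thread

    print(f"dense matrix:     {dense_bytes(nrow, ncol) / 1e9:.2f} GB")
    print(f"sparse (1% nnz):  {csc_bytes(nrow, ncol, hypothetical_density) / 1e9:.2f} GB")
    print(f"top-{k} U, V, d:   {partial_svd_bytes(nrow, ncol, k) / 1e9:.2f} GB")
```

Under these assumptions the dense matrix alone is about 4 GB, so the answer to "is the matrix sparse?" largely decides whether the problem fits comfortably in memory. Actual peak usage will be higher than these figures, since reading the data and running the decomposition both need working space on top of the matrix itself.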