gregory_r_warnes@groton.pfizer.com
2002-Jul-09 15:40 UTC
svd() very slow for wide matrix X but not for t(X) (PR#1761)
Hi All, A colleague has just pointed out to me prcomp is *much* slower and memory intensive when applied to very wide matrixes than when applied to the transpose of the same matrix. In looking for the reason, I discovered that that the singular value decomposition function svd is the primary culprit: On my Solaris workstation, under R 1.5.1: > nr <- 50 > nc <- 1000 > x <- matrix( rnorm( nr*nc), nrow=nr, ncol=nc ) > tx <- t(x) > > # SVD directly on matrix is SLOW > system.time( val.x <- svd(x)$u ) [1] 5.41 0.00 5.62 0.00 0.00 > > # SVD directly on t(matrix) is FAST > system.time( val.tx <- svd(tx)$v ) [1] 0.60 0.00 0.61 0.00 0.00 > > # but the results are (essentially) identical > max( abs(val.x) - abs(val.tx) ) [1] 4.201361e-13 The speed difference gets worse as the number of columns grows. Here is a simple patch to $R_SRC/src/library/base/R/svd.R that works around the problem. It simply checks whether the matrix is wider than it is long, and if so, calls itself on the transpose of the matrix, and then exchanges the returned u and v matrixes. *** svd.R.orig Tue Jul 9 11:06:46 2002 --- svd.R Tue Jul 9 11:06:49 2002 *************** *** 5,10 **** --- 5,17 ---- n <- dx[1] p <- dx[2] if(!n || !p) stop("0 extent dimensions") + + if( p > n ) # compute on the transpose if wide + { + s <- svd( t(x), nv=nu, nu=nv) + return( d=s$d, u=s$v, v=s$u ) + } + if (is.complex(x)) { res <- La.svd(x, nu, nv) return(list(d = res$d, u = if(nu) res$u, v = if(nv) Conj(t(res$vt)))) With this patch the speed is almost the same whether called on x or tx: > system.time( val.x <- svd(x)$u ) [1] 0.62 0.00 0.66 0.00 0.00 > system.time( val.tx <- svd(tx)$v ) [1] 0.60 0.00 0.86 0.00 0.00 > identical( val.x, val.tx ) [1] TRUE This is fine for my purposes, but perhaps someone familiar with the FORTRAN code should see why its performance is so unbalanced. Extra details:> version_ platform sparc-sun-solaris2.8 arch sparc os solaris2.8 system sparc, solaris2.8 status Patched major 1 minor 5.1 year 2002 month 06 day 18 language R>-Greg Warnes LEGAL NOTICE Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._