Altuna Akalin
2007-Sep-20 08:19 UTC
[R] problem with generalized singular value decomposition using LAPACK
Hi All, I'm trying to run generalized singular value decomposition (GSVD) function from LAPACK library. Basically my problem is that I can not run it for large matrices, I get a memory error. I'm using R 2.5.1. I tried this on intel centos5 machines with 2 GB memory and 8 GB memory. I have unlimited max memory,cpu time and virtual memory. LAPACK is already compiled for R (libRlapack.so ) and using dyn.load and .Fortran functions it's possible to call LAPACK functions(I don't if it's the best way to do it) . I pasted the function below. As I said, it's not possible to run GSVD on large matrices. Here is couple of examples:> res=GSVD( matrix(1:35000,5000,6), matrix(1:35000,5000,6) ) #runs withoutproblems> res=GSVD( matrix(1:36000,6000,6), matrix(1:36000,6000,6) )Error: cannot allocate vector of size 274.7 Mb when tried with 8gig machine: res=GSVD(matrix(1:36000,6000,6), matrix(1:36000,6000,6) ) #runs without problems> res=GSVD(matrix(1:42000,7000,6), matrix(1:36000,6000,6) )Error: cannot allocate vector of size 373.8 Mb when I tried this with equivalent built-in matlab GSVD function, I get a similar memory error as well. matlab also uses LAPACK for this. Is there a workaround for this problem without increasing the memory? Does using .Fortran makes it worse (memory wise)? I need to work with matrices 4 times larger than the ones I tried. I would be grateful if anyone can help on these issues Best, Altuna # function GSVD<-function(A,B) { # A=U*E1*Q' # B=V*E2*Q' dyn.load("/usr/local/lib/R/lib/libRlapack.so") #is.loaded("dggsvd") # returns TRUE z <- .Fortran("dggsvd", as.character('N'), as.character('N'), as.character('Q'), as.integer(nrow(A)), as.integer(ncol(A)), as.integer(nrow(B)), integer(1), integer(1), as.double(A), as.integer(nrow(A)), as.double(B), as.integer(nrow(B)), double(ncol(A)), double(ncol(A)), double(nrow(A)*nrow(A)), as.integer(nrow(A)), double(nrow(B)*nrow(B)), as.integer(nrow(B)), double(ncol(A)*ncol(A)), as.integer(ncol(A)), double(max(c(3*ncol(A),nrow(A),nrow(B)))+ncol(A)), integer(ncol(A)), integer(1),dup=FALSE) K=z[7][[1]] L=z[8][[1]] U=z[15][[1]] V=z[17][[1]] Q=z[19][[1]] ALPHA=z[13][[1]] BETA=z[14][[1]] R=matrix(z[9][[1]],ncol(A),nrow=nrow(A),byrow=FALSE) U=matrix(U,ncol=nrow(A),nrow=nrow(A),byrow=FALSE) V=matrix(V,ncol=nrow(B),nrow=nrow(B),byrow=FALSE) Q=matrix(Q,ncol=ncol(A),nrow=ncol(A),byrow=FALSE) D1=mat.or.vec(nrow(A),K+L) D2=mat.or.vec(nrow(B),K+L) oR=mat.or.vec((K+L),ncol(A)) if(K > 0) { if(K==1) { D1[1:K,1:K] =rep(1,K) } else { diag(D1[1:K,1:K])=rep(1,K) } diag(D1[(K+1):(K+L),(K+1):(K+L)])=ALPHA[(K+1):(K+L)] diag(D2[1:L,(K+1):(K+L)])=BETA[(K+1):(K+L)] } if(K ==0) { diag(D1[(K+1):(K+L),(K+1):(K+L)])=ALPHA[(K+1):(K+L)] diag(D2[1:L,(K+1):(K+L)])=BETA[(K+1):(K+L)] } Ci=ALPHA[(K+1):(K+L)] S=BETA[(K+1):(K+L)] oR[(1):(K+L),(ncol(A)-K-L+1):(ncol(A))]=R[(1):(K+L),(ncol(A)-K-L+1):(ncol(A))] return(list(U=U,V=V,Q=Q,D1=D1,D2=D2,oR=oR,C=Ci,S=S,K=K,L=L)) }
elw at stderr.org
2007-Sep-20 11:27 UTC
[R] problem with generalized singular value decomposition using LAPACK
> I'm trying to run generalized singular value decomposition (GSVD) > function from LAPACK library. Basically my problem is that I can not run > it for large matrices, I get a memory error. I'm using R 2.5.1. I tried > this on intel centos5 machines with 2 GB memory and 8 GB memory. I have > unlimited max memory,cpu time and virtual memory. > >> res=GSVD( matrix(1:35000,5000,6), matrix(1:35000,5000,6) ) #runs >> without > problems >> res=GSVD( matrix(1:36000,6000,6), matrix(1:36000,6000,6) ) > Error: cannot allocate vector of size 274.7 MbDid you examine R's internal memory limits, e.g. ?memory for help... that should help. [You should, most likely, be able to allocate a ~274MB vector on a 2GB machine... but you might have to coerce R into taking that much memory from the system, rather than trying to be conservative.] --elijah