Is it possible to avoid the loop in the following function (or make the function otherwise more efficient) and can someone point me to a possible solution? (It would be great if hours could be reduced to seconds :-). # --------------------------------------------- RanEigen=function(items=x,cases=y,sample=z) { X=matrix(rnorm(cases*items),nrow=cases,byrow=F) S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values for (i in 2:sample) { X=matrix(rnorm(cases*items),nrow=cases,byrow=F) S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values) } REigV=(cbind(1:items,colMeans(EV))) REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag=" ",width=10)) colnames(REigV)=c(' ','Eigenvalue') rownames(REigV)=rep('',items) return(REigV) } # --------------------------------------------- Thanks in advance, Dirk ************************************************* Dr. Dirk Enzmann Criminological Research Institute of Lower Saxony Luetzerodestr. 9 D-30161 Hannover Germany phone: +49-511-348.36.32 fax: +49-511-348.36.10 email: ENZMANN at KFN.uni-hannover.de http://www.kfn.de
Have you profiled your code (see `Writing R Extensions')? My guess is that most of the time is being spent in eigen(); if so you would have to use a different approach to gain much speed. BTW, please make more use of the space bar: your code is nigh unreadable (and so I've not tried to read it in detail). `Writing R Extensions' also shows you how to format code well. On Thu, 8 May 2003, Dirk Enzmann wrote:> Is it possible to avoid the loop in the following function (or make the > function otherwise more efficient) and can someone point me to a > possible solution? (It would be great if hours could be reduced to > seconds :-). > > # --------------------------------------------- > RanEigen=function(items=x,cases=y,sample=z) > { > X=matrix(rnorm(cases*items),nrow=cases,byrow=F) > S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) > > EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values > > for (i in 2:sample) > { > X=matrix(rnorm(cases*items),nrow=cases,byrow=F) > S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) > > EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values) > > } > REigV=(cbind(1:items,colMeans(EV))) > REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag=" > ",width=10)) > colnames(REigV)=c(' ','Eigenvalue') > rownames(REigV)=rep('',items) > return(REigV) > } > # ----------------------------------------------- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On Thu, 2003-05-08 at 10:31, Dirk Enzmann wrote:> I create vectors of eigenvalues of correlation matrices of random numbers for x items and y cases i-1-times and > add them via rbind to the already calculated eigenvalues. Thus, at the end of the loop I have a matrix of > items*samples eigenvalues. > > Ernesto Jardim schrieb: > > > On Thu, 2003-05-08 at 09:58, Dirk Enzmann wrote: > > > Is it possible to avoid the loop in the following function (or make the > > > function otherwise more efficient) and can someone point me to a > > > possible solution? (It would be great if hours could be reduced to > > > seconds :-). > > > > > > # --------------------------------------------- > > > RanEigen=function(items=x,cases=y,sample=z) > > > { > > > X=matrix(rnorm(cases*items),nrow=cases,byrow=F) > > > S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) > > > > > > EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values > > > > > > for (i in 2:sample) > > > { > > > X=matrix(rnorm(cases*items),nrow=cases,byrow=F) > > > S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) > > > > > > EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values) > > > > > > } > > > REigV=(cbind(1:items,colMeans(EV))) > > > REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag=" > > > ",width=10)) > > > colnames(REigV)=c(' ','Eigenvalue') > > > rownames(REigV)=rep('',items) > > > return(REigV) > > > } > > > # --------------------------------------------- > > > > > > Thanks in advance, > > > Dirk > > > > > > > > > ************************************************* > > > Dr. Dirk Enzmann > > > Criminological Research Institute of Lower Saxony > > > Luetzerodestr. 9 > > > D-30161 Hannover > > > Germany > > > > > > phone: +49-511-348.36.32 > > > fax: +49-511-348.36.10 > > > email: ENZMANN at KFN.uni-hannover.de > > > > > > http://www.kfn.de > > > > > > ______________________________________________ > > > R-help at stat.math.ethz.ch mailing list > > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > > > Hi > > > > You're not using the value of i inside your loop. Why do you need the > > loop ? > > > > EJ > > -- > ************************************************* > Dr. Dirk Enzmann > Criminological Research Institute of Lower Saxony > Luetzerodestr. 9 > D-30161 Hannover > Germany > > phone: +49-511-348.36.32 > fax: +49-511-348.36.10 > email: ENZMANN at KFN.uni-hannover.de > > http://www.kfn.de > ************************************************* >To avoid the loop you can try someting like : fun <- function(){ X=matrix(rnorm(cases*items),nrow=cases,byrow=F) S=crossprod(X-rep(1,cases) %*% t(colMeans(X))) EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values) } lst1 <- split(c(2:sample),c(2:sample)) lst2 <- lapply(lst1,fun) This is not the most elegant way of doing it, but it worked for me. The idea is to create a list with the number of samples you want and then use "lapply" that is much faster then loop. In the end you get a list with the results. You can use "unlist" to change it into a vector. Hope it helps Regards EJ
Dirk Enzmann <enzmann at kfn.uni-hannover.de> wrote:> Is it possible to avoid the loop in the following function (or make the > function otherwise more efficient)...Prof Brian Ripley <ripley at stats.ox.ac.uk> replied:> My guess is that most of the time is being spent in eigen(); if so you > would have to use a different approach to gain much speed.If M is an (p x q) matrix with q>>p, then at most the first p eigenvalues of M'M are nonzero, and they can be found more efficiently using the left side of: La.svd(M)$d^2 = eigen(crossprod(M))$values[1:p] I've re-written your function for the situation items >> cases (I leave the opposite situation as an exercise). I also threw in some sweep's, a zapsmall, and a few other tricks. A timing test shows it runs 57x faster for this example (with items=500 and cases=10): R> set.seed(1) R> system.time(res1 <- RanEigen.orig(500,10,30)) [1] 229.80 0.07 230.38 0.00 0.00 R> set.seed(1) R> system.time(res2 <- RanEigen(500,10,30)) [1] 4.00 0.02 4.07 0.00 0.00 R> all.equal(res1, res2) [1] TRUE and here's my function: RanEigen <- function(items, cases, sample) { if (items < cases) stop("I assume items >= cases") EV <- matrix(0, sample, items) for (i in 1:sample) { X <- matrix(rnorm(cases*items), nrow=cases) Xb <- sweep(X, 2, colMeans(X)) S <- crossprod(Xb) Xc <- sweep(Xb, 2, 1/sqrt(diag(S)), "*") EV[i, 1:cases] <- zapsmall(La.svd(Xc)$d^2) # The rest are zero } REigV <- cbind(1:items, colMeans(EV)) dimnames(REigV) <- list(rep('',items), c(' ','Eigenvalue')) REigV } -- -- David Brahm (brahm at alum.mit.edu)