thr3ads.net - R help - [R] Avoiding loops to spare time and memory [May 2003]

If this information is useful, please help other people find it:
Share via:

Dirk Enzmann

2003-May-08 08:58 UTC

[R] Avoiding loops to spare time and memory

Is it possible to avoid the loop in the following function (or make the
function otherwise more efficient) and can someone point me to a
possible solution? (It would be great if hours could be reduced to
seconds :-).

# ---------------------------------------------
RanEigen=function(items=x,cases=y,sample=z)
{
  X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
  S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))

EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values

  for (i in 2:sample)
  {
  X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
  S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))

EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)

  }
  REigV=(cbind(1:items,colMeans(EV)))
 
REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag="
",width=10))
  colnames(REigV)=c(' ','Eigenvalue')
  rownames(REigV)=rep('',items)
  return(REigV)
}
# ---------------------------------------------

Thanks in advance,
Dirk


*************************************************
Dr. Dirk Enzmann
Criminological Research Institute of Lower Saxony
Luetzerodestr. 9
D-30161 Hannover
Germany

phone: +49-511-348.36.32
fax:   +49-511-348.36.10
email: ENZMANN at KFN.uni-hannover.de

http://www.kfn.de

Prof Brian Ripley

2003-May-08 09:22 UTC

head link

[R] Avoiding loops to spare time and memory

Have you profiled your code (see `Writing R Extensions')?

My guess is that most of the time is being spent in eigen(); if so you 
would have to use a different approach to gain much speed.

BTW, please make more use of the space bar: your code is nigh unreadable
(and so I've not tried to read it in detail).  `Writing R Extensions' 
also shows you how to format code well.

On Thu, 8 May 2003, Dirk Enzmann wrote:
> Is it possible to avoid the loop in the following function (or make the
> function otherwise more efficient) and can someone point me to a
> possible solution? (It would be great if hours could be reduced to
> seconds :-).
> 
> # ---------------------------------------------
> RanEigen=function(items=x,cases=y,sample=z)
> {
>   X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
>   S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
> 
>
EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values
> 
>   for (i in 2:sample)
>   {
>   X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
>   S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
> 
>
EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)
> 
>   }
>   REigV=(cbind(1:items,colMeans(EV)))
>  
REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag="
> ",width=10))
>   colnames(REigV)=c(' ','Eigenvalue')
>   rownames(REigV)=rep('',items)
>   return(REigV)
> }
> # ---------------------------------------------
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Ernesto Jardim

2003-May-08 09:45 UTC

head link

[R] Avoiding loops to spare time and memory

On Thu, 2003-05-08 at 10:31, Dirk Enzmann wrote:> I create vectors of eigenvalues of correlation matrices of random numbers
for x items and y cases  i-1-times and
> add them via rbind to the already calculated eigenvalues. Thus, at the end
of the loop I have a matrix of
> items*samples eigenvalues.
> 
> Ernesto Jardim schrieb:
> 
> > On Thu, 2003-05-08 at 09:58, Dirk Enzmann wrote:
> > > Is it possible to avoid the loop in the following function (or
make the
> > > function otherwise more efficient) and can someone point me to a
> > > possible solution? (It would be great if hours could be reduced
to
> > > seconds :-).
> > >
> > > # ---------------------------------------------
> > > RanEigen=function(items=x,cases=y,sample=z)
> > > {
> > >   X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
> > >   S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
> > >
> > >
EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values
> > >
> > >   for (i in 2:sample)
> > >   {
> > >   X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
> > >   S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
> > >
> > >
EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)
> > >
> > >   }
> > >   REigV=(cbind(1:items,colMeans(EV)))
> > >  
REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag="
> > > ",width=10))
> > >   colnames(REigV)=c(' ','Eigenvalue')
> > >   rownames(REigV)=rep('',items)
> > >   return(REigV)
> > > }
> > > # ---------------------------------------------
> > >
> > > Thanks in advance,
> > > Dirk
> > >
> > >
> > > *************************************************
> > > Dr. Dirk Enzmann
> > > Criminological Research Institute of Lower Saxony
> > > Luetzerodestr. 9
> > > D-30161 Hannover
> > > Germany
> > >
> > > phone: +49-511-348.36.32
> > > fax:   +49-511-348.36.10
> > > email: ENZMANN at KFN.uni-hannover.de
> > >
> > > http://www.kfn.de
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >
> > Hi
> >
> > You're not using the value of i inside your loop. Why do you need
the
> > loop ?
> >
> > EJ
> 
> --
> *************************************************
> Dr. Dirk Enzmann
> Criminological Research Institute of Lower Saxony
> Luetzerodestr. 9
> D-30161 Hannover
> Germany
> 
> phone: +49-511-348.36.32
> fax:   +49-511-348.36.10
> email: ENZMANN at KFN.uni-hannover.de
> 
> http://www.kfn.de
> *************************************************
> 

To avoid the loop you can try someting like :

fun <- function(){
X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)
}

lst1 <- split(c(2:sample),c(2:sample))
lst2 <- lapply(lst1,fun)

This is not the most elegant way of doing it, but it worked for me. The
idea is to create a list with the number of samples you want and then
use "lapply" that is much faster then loop.

In the end you get a list with the results. You can use "unlist" to
change it into a vector.

Hope it helps

Regards

EJ

David Brahm

2003-May-08 15:25 UTC

head link

[R] Avoiding loops to spare time and memory

Dirk Enzmann <enzmann at kfn.uni-hannover.de>
wrote:> Is it possible to avoid the loop in the following function (or make the
> function otherwise more efficient)...
Prof Brian Ripley <ripley at stats.ox.ac.uk>
replied:> My guess is that most of the time is being spent in eigen(); if so you 
> would have to use a different approach to gain much speed.
If M is an (p x q) matrix with q>>p, then at most the first p eigenvalues
of
M'M are nonzero, and they can be found more efficiently using the left side
of:
  La.svd(M)$d^2  =  eigen(crossprod(M))$values[1:p]

I've re-written your function for the situation items >> cases (I
leave the
opposite situation as an exercise).  I also threw in some sweep's, a
zapsmall,
and a few other tricks.  A timing test shows it runs 57x faster for this
example (with items=500 and cases=10):
  R> set.seed(1)
  R> system.time(res1 <- RanEigen.orig(500,10,30))
     [1] 229.80   0.07 230.38   0.00   0.00
  R> set.seed(1)
  R> system.time(res2 <- RanEigen(500,10,30))
     [1] 4.00 0.02 4.07 0.00 0.00
  R> all.equal(res1, res2)
     [1] TRUE

and here's my function:

RanEigen <- function(items, cases, sample) {
  if (items < cases) stop("I assume items >= cases")
  EV <- matrix(0, sample, items)
  for (i in 1:sample) {
    X <- matrix(rnorm(cases*items), nrow=cases)
    Xb <- sweep(X, 2, colMeans(X))
    S <- crossprod(Xb)
    Xc <- sweep(Xb, 2, 1/sqrt(diag(S)), "*")
    EV[i, 1:cases] <- zapsmall(La.svd(Xc)$d^2)  # The rest are zero
  }
  REigV <- cbind(1:items, colMeans(EV))
  dimnames(REigV) <- list(rep('',items), c('
','Eigenvalue'))
  REigV
}

-- 
                              -- David Brahm (brahm at alum.mit.edu)

Apparently Analagous Threads

Search for more reasonably related threads

R help - May 2003 - Avoiding loops to spare time and memory

[R] Avoiding loops to spare time and memory

[R] Avoiding loops to spare time and memory

[R] Avoiding loops to spare time and memory

[R] Avoiding loops to spare time and memory

Apparently Analagous Threads