Dieter Best
2007-Oct-16 22:41 UTC
[R] How to speed up multiple for loop over list of data frames
Hi there,

I have a multiple for loop over a list of data frames:

    for ( i in 1:(N-1) ) {
      for ( j in (i+1):N ) {
        for ( p in 1:M ) {
          v_i[p] = alist[[p]][i,"v"]
          v_j[p] = alist[[p]][j,"v"]
        }
        rho_s = cor(v_i, v_j, method = "spearman")
        rho_p = cor(v_i, v_j, method = "pearson" )
        iv = c( iv, min(i, j) )
        jv = c( jv, max(i, j) )
        rho_sv = c( rho_sv, rho_s)
        rho_pv = c( rho_pv, rho_p)
      }
    }

N is of the order of 400, M about 800.

This takes me basically an entire day. Is there anything I could do to speed things up, or is cor really that slow?

-- D
Bartjoosen
2007-Oct-17 09:28 UTC
[R] How to speed up multiple for loop over list of data frames
Maybe I'm wrong, but aren't you calculating just the same thing as cor(data frame, method = "spearman"), with some further parameters being tracked?

Could you please provide commented, minimal, self-contained, reproducible code, so that we can see what is actually going on and what result you want?
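To make the suggestion above concrete, here is a minimal sketch of letting cor() work on a whole matrix instead of one pair of vectors at a time. It assumes every data frame in alist has N rows and a numeric column "v", as in the original post; the toy data and the names vmat, rho_s_mat, rho_p_mat, idx and res are made up for illustration.

```r
# Hypothetical toy data standing in for 'alist':
# M data frames, each with N rows and a numeric column "v".
set.seed(1)
N <- 6
M <- 20
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# M x N matrix: row p holds column "v" of data frame p, so column i
# is the vector v_i that the innermost loop builds element by element.
vmat <- t(sapply(alist, function(d) d[["v"]]))

# One call per method returns the full N x N matrix of correlations
# between the columns of vmat.
rho_s_mat <- cor(vmat, method = "spearman")
rho_p_mat <- cor(vmat, method = "pearson")

# Keep each pair i < j once, in the same order as the nested loops
# (i in the outer loop, j in the inner loop).
idx <- which(upper.tri(rho_s_mat), arr.ind = TRUE)
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

res <- data.frame(iv     = idx[, "row"],
                  jv     = idx[, "col"],
                  rho_sv = rho_s_mat[idx],
                  rho_pv = rho_p_mat[idx])
head(res)
```

If this matches what the loop is meant to compute, the columns of res correspond to the vectors iv, jv, rho_sv and rho_pv from the original post, and the three nested loops reduce to two calls to cor().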
jim holtman
2007-Oct-17 13:44 UTC
[R] How to speed up multiple for loop over list of data frames
The first thing to do is to run Rprof (?Rprof) on a subset of your data to see where the time is being spent. My guess is that most of it is in the calls to 'cor'; if that is the case, then you will have to figure out some other algorithm.

Also, if these data frames all contain numeric information, convert them to matrices initially, because the subsetting you are doing on the data frame (e.g., alist[[p]][i,"v"]) can be very expensive. The output from Rprof will help determine what course of action you should take.
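As a rough sketch of both suggestions, the code below profiles the original loop on a small made-up data set and then repeats the computation after extracting column "v" into a matrix once. The toy data, the function names pairwise_loop and matrix_loop, and the reduced sizes are assumptions for illustration; only the Pearson correlation is computed to keep it short, and the preallocation in matrix_loop is an extra that the thread does not mention.

```r
# Hypothetical toy data standing in for 'alist' (much smaller than the real N and M).
set.seed(1)
N <- 40
M <- 60
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# The loop from the original post, wrapped in a function so it can be profiled.
pairwise_loop <- function(alist, N, M) {
  v_i <- numeric(M)
  v_j <- numeric(M)
  rho_pv <- numeric(0)
  for (i in 1:(N - 1)) {
    for (j in (i + 1):N) {
      for (p in 1:M) {
        v_i[p] <- alist[[p]][i, "v"]   # data frame subsetting on every pass
        v_j[p] <- alist[[p]][j, "v"]
      }
      rho_pv <- c(rho_pv, cor(v_i, v_j, method = "pearson"))
    }
  }
  rho_pv
}

# Profile the loop on the subset.
prof_file <- tempfile()
Rprof(prof_file)
rho_loop <- pairwise_loop(alist, N, M)
Rprof(NULL)

# Guard against a run too short to record any profiling samples.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))

# Same computation after pulling column "v" out of each data frame once.
vmat <- sapply(alist, function(d) d[["v"]])   # N x M matrix: row i is v_i

matrix_loop <- function(vmat) {
  N <- nrow(vmat)
  rho_pv <- numeric(N * (N - 1) / 2)   # preallocated instead of grown with c()
  k <- 0
  for (i in 1:(N - 1)) {
    for (j in (i + 1):N) {
      k <- k + 1
      rho_pv[k] <- cor(vmat[i, ], vmat[j, ], method = "pearson")
    }
  }
  rho_pv
}
rho_mat <- matrix_loop(vmat)
```

The by.self table lists the functions in which the sampled time was spent, which shows whether data frame subsetting or cor() dominates on a given data set. Both versions should return the same correlations, which all.equal(rho_loop, rho_mat) can confirm.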