Hello, good morning or evening!...

After studying some of the examples at S-poetry Document, I tried to
implement some of the concepts in my R script, that intensively uses
looping constructs. However I did not manage any improvement.
My main problem is that I have a list of a lot of data e.g.:

> xs
[[1]]
[1]........................[1000]
[[2]]
[1]........................[840]
...
[[50]]
[1]........................[945]

Having a script with loops inside loops (for example in a Monte-Carlo
simulation) takes a lot of minutes before it is completed. Is there
another easier way to perform functions for each of the [[i]] ? Using
probably apply? or constructing a specific function? or using the
so-called "vectorising" tricks?

One example could be the following, that calculates the sums 1:5,
2:6, 3:7,..., for each of xs[[i]] :

xs <- lapply(1:500, function(x) rnorm(1000))
totalsum <- list()
sums <- list()
first <- list()

for(i in 1:length(xs)) {
    totalsum[i] <- sum(xs[[i]])
    for(j in 1:length(xs[[i]])) {
        if(j == 1) {
            sums[[i]] <- list()
        }
        if(j >= 5) {
            sums[[i]][j] <- sum(xs[[i]][(j-4):j])
        }
    }
}

Of course the functions I actually call are more complicated,
increasing the total time of calculations to a lot of minutes,...

<< 1 >>. How could I optimize (or better eliminate?...) the above
loop? Any other suggestions for my scripting habits?

Another problem that I am facing is that calculating a lot of lists
(>50), that contain results of various econometric tests of all the
variables, in the form of

example.list[[i]] <- expression

demands more than 50 lines at the beginning of the script that
"initiate" the lists (e.g.
example.list.1 <- list()
example.list.2 <- list()
...
example.list.50 <- list())

<< 2 >>. Is there a way to avoid that?

Thank you very very much in advance,

Constantine Tsardounis
I think this does what your loop is doing. It takes about 0.5 seconds.

> system.time(
+     result <- lapply(xs, function(.val){
+         .sums <- filter(.val, rep(1,5))  # add 5 connected values together
+         .sums[-c(1,2,length(.sums)-1, length(.sums))]
+     })
+ )
[1] 0.50 0.00 0.54   NA   NA

On 1/30/06, Constantine Tsardounis <costas.magnuse@gmail.com> wrote:
> << 1 >>. How could I optimize (or better eliminate?...) the above
> loop? Any other suggestions for my scripting habits?

--
Jim Holtman
Cincinnati, OH
+1 513 247 0281

What is the problem you are trying to solve?
My fingers were too fast. I left off the creation of the two lists
(total and sums) that you wanted.

system.time(
    result <- lapply(xs, function(.val){
        .total <- sum(.val)  # total sum
        .sums <- filter(.val, rep(1,5))  # sum 5 consecutive values
        list(total=.total, sum=.sums[-c(1,2,length(.sums)-1, length(.sums))])
    })
)
# create lists of sums and totals
total <- lapply(result, '[[', 'total')
sums <- lapply(result, '[[', 'sum')

--
Jim Holtman
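
A side note on the filter() call used in the replies: with its default sides = 2 the window is centred, which is why two values are dropped from each end of the result. The following is a minimal sketch, assuming base R's stats::filter() and a toy vector x that is not part of the thread. It suggests that sides = 1 reproduces the indexing of the original loop, where position j holds sum(x[(j-4):j]) and positions 1 to 4 stay empty. Calling stats::filter() by its full name also avoids picking up a filter() from another attached package.

```r
# Assumption: a small toy vector instead of the 500 x 1000 list from the post.
set.seed(1)
x <- rnorm(20)

# Trailing window: position j holds sum(x[(j-4):j]); positions 1..4 are NA.
# as.numeric() drops the time-series class that stats::filter() returns.
trailing <- as.numeric(stats::filter(x, rep(1, 5), sides = 1))

# Reference values computed with the explicit loop from the original post.
loop <- rep(NA_real_, length(x))
for (j in 5:length(x)) {
    loop[j] <- sum(x[(j - 4):j])
}

# Compare the two results up to floating-point tolerance.
all.equal(trailing, loop)
```

Dropping the leading NA values with trailing[-(1:4)] gives the same vector of window sums as the centred version in the replies.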
From: Constantine Tsardounis
> One example could be the following, that calculates the sums 1:5,
> 2:6, 3:7,..., for each of xs[[i]] :
>
> xs <- lapply(1:500, function(x) rnorm(1000))
> totalsum <- list()
> sums <- list()
> first <- list()
>
> for(i in 1:length(xs)) {
>     totalsum[i] <- sum(xs[[i]])
>     for(j in 1:length(xs[[i]])) {
>         if(j == 1) {
>             sums[[i]] <- list()
>         }
>         if(j >= 5) {
>             sums[[i]][j] <- sum(xs[[i]][(j-4):j])
>         }
>     }
> }

For this you want to vectorize the computation inside, eliminating the j
loop, then use lapply() if you like for the outer loop. That saves you the
line to initialize the list.

> Another problem that I am facing is that calculating a lot of lists
> (>50), that contain results of various econometric tests of all the
> variables, in the form of
>
> example.list[[i]] <- expression
>
> demands more than 50 lines at the beginning of the script that
> "initiate" the lists (e.g.
> example.list.1 <- list()
> example.list.2 <- list()
> ...
> example.list.50 <- list()
>
> << 2 >>. Is there a way to avoid that?

Yes, by putting them all in one list.

Andy
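
The "all in one list" suggestion could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the thread: stat.funs and its entries (mean, sd, range) are hypothetical placeholders standing in for the econometric tests, and xs is a small toy list. Because lapply() creates the outer list and each inner list itself, no initialisation lines are needed.

```r
# Assumption: toy data, three short series instead of the real variables.
set.seed(1)
xs <- lapply(1:3, function(i) rnorm(10))

# Hypothetical placeholders for the real test functions, one named entry each.
stat.funs <- list(mean = mean, sd = sd, range = range)

# One container: results[[name]][[i]] is function `name` applied to xs[[i]].
# This replaces example.list.1 <- list(), example.list.2 <- list(), ...
results <- lapply(stat.funs, function(f) lapply(xs, f))

# Access by name and series index, e.g. the sd placeholder on series 2.
results$sd[[2]]
```

Adding another test then means adding one entry to stat.funs rather than another list variable and another initialisation line.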
Thank you very very much for your responses...
How exactly do I vectorize?

> For this you want to vectorize the computation inside, eliminating the j
> loop, then use lapply() if you like for the outer loop. That saves you the
> line to initialize the list.
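
One possible reading of "vectorize the computation inside" is sketched below. It is not necessarily what the earlier reply had in mind: it swaps the inner j loop for a cumulative-sum difference, and the helper name rolling.sum and the smaller toy xs are assumptions made for illustration. The idea is that the sum of x[(j-4):j] equals the running total up to j minus the running total up to j-5, so all window sums come from one subtraction of two vectors.

```r
# Assumption: smaller toy data than the 500 series of length 1000 in the post.
set.seed(1)
xs <- lapply(1:5, function(i) rnorm(100))

# Hypothetical helper: sums of every window of k consecutive values.
# cs[j + 1] holds sum(x[1:j]), so each window sum is a difference of two
# entries of cs and no loop over j is needed.
rolling.sum <- function(x, k = 5) {
    n <- length(x)
    if (n < k) return(numeric(0))   # no complete window available
    cs <- c(0, cumsum(x))
    cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]
}

# Outer loop replaced by sapply()/lapply(); no lists to initialise first.
totalsum <- sapply(xs, sum)          # one total per series
sums     <- lapply(xs, rolling.sum)  # one vector of window sums per series
```

Here sums[[i]][1] corresponds to sums[[i]][5] in the original loop, since the first four positions have no complete window. For very long series the running total can accumulate rounding error, in which case the filter() approach from the replies may be the safer choice.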