Hello, good morning or evening!...

After studying some of the examples at S-poetry Document, I tried to
implement some of the concepts in my R script, that intensively uses
looping constructs. However I did not manage any improvement.
My main problem is that I have a list of a lot of data e.g.:

> xs
[[1]]
[1]........................[1000]
[[2]]
[1]........................[840]
...
[[50]]
[1]........................[945]

Having a script with loops inside loops (for example in a Monte-Carlo
simulation) takes a lot of minutes before it is completed. Is there
another easier way to perform functions for each of the [[i]] ? Using
probably apply? or constructing a specific function? or using the
so-called "vectorising" tricks?

One example could be the following, that calculates the sums 1:5,
2:6, 3:7,..., for each of xs[[i]] :

xs <- lapply(1:500, function(x) rnorm(1000))
totalsum <- list()
sums <- list()
first <- list()

for(i in 1:length(xs)) {
    totalsum[i] <- sum(xs[[i]])
    for(j in 1:length(xs[[i]])) {
        if(j == 1) {
            sums[[i]] <- list()
        }
        if(j >= 5) {
            sums[[i]][j] <- sum(xs[[i]][(j-4):j])
        }
    }
}

Of course the functions I actually call are more complicated,
increasing the total time of calculations to a lot of minutes,...

<< 1 >>. How could I optimize (or better eliminate?...) the above
loop? Any other suggestions for my scripting habits?

Another problem that I am facing is that calculating a lot of lists
(>50), that contain results of various econometric tests of all the
variables, in the form of

example.list[[i]] <- expression

demands more than 50 lines at the beginning of the script that
"initiate" the lists (e.g.
example.list.1 <- list()
example.list.2 <- list()
...
example.list.50 <- list()

<< 2 >>. Is there a way to avoid that?

Thank you very very much in advance,

Constantine Tsardounis
I think this does what your loop is doing. It takes about 0.5 seconds.

> system.time(
+ result <- lapply(xs, function(.val){
+     .sums <- filter(.val, rep(1,5))  # add 5 consecutive values together
+     .sums[-c(1,2,length(.sums)-1, length(.sums))]
+ })
+ )
[1] 0.50 0.00 0.54   NA   NA

On 1/30/06, Constantine Tsardounis <costas.magnuse@gmail.com> wrote:
> << 1 >>. How could I optimize (or better eliminate?...) the above
> loop? Any other suggestions for my scripting habits?
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

--
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
My fingers were too fast. I left off the creation of the two lists (total &
sums) that you wanted.

system.time(
    result <- lapply(xs, function(.val){
        .total <- sum(.val)  # total sum
        .sums <- filter(.val, rep(1,5))  # sum 5 consecutive values
        # drop the two NA values that filter() leaves at each end
        list(total=.total, sum=.sums[-c(1,2,length(.sums)-1, length(.sums))])
    })
)
# create lists of sums and totals
total <- lapply(result, '[[', 'total')
sums <- lapply(result, '[[', 'sum')
--
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
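A quick way to check that the filter() approach reproduces the loop from the question is to compare both on the same small input. The sketch below is only that: the input sizes and the names loop_sums and filter_sums are made up for illustration. The stats:: prefix is used because some add-on packages (dplyr, for example) export their own filter() that masks the one from stats.

```r
# Small reproducible input: three vectors of different (made-up) lengths.
set.seed(1)
xs <- lapply(c(10, 8, 12), function(n) rnorm(n))

# Reference: explicit loop over every window of 5 consecutive values.
loop_sums <- lapply(xs, function(x) {
  out <- numeric(length(x) - 4)
  for (j in 5:length(x)) out[j - 4] <- sum(x[(j - 4):j])
  out
})

# Filter-based version: a centred 5-term moving sum, which is NA for the
# first two and last two positions; dropping the NAs leaves the same windows.
filter_sums <- lapply(xs, function(x) {
  s <- stats::filter(x, rep(1, 5))
  as.numeric(s[!is.na(s)])
})

# Compare with a numerical tolerance rather than exact equality.
all_equal <- isTRUE(all.equal(loop_sums, filter_sums))
print(all_equal)
```

Note that the loop in the original post stores each sum at position j (leaving positions 1 to 4 empty), while both versions here return a vector of length(x) - 4 values.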
From: Constantine Tsardounis
> One example could be the following, that calculates the sums 1:5,
> 2:6, 3:7,..., for each of xs[[i]] :
>
> xs <- lapply(1:500, function(x) rnorm(1000))
> totalsum <- list()
> sums <- list()
> first <- list()
>
> for(i in 1:length(xs)) {
>     totalsum[i] <- sum(xs[[i]])
>     for(j in 1:length(xs[[i]])) {
>         if(j == 1) {
>             sums[[i]] <- list()
>         }
>         if(j >= 5) {
>             sums[[i]][j] <- sum(xs[[i]][(j-4):j])
>         }
>     }
> }

For this you want to vectorize the computation inside, eliminating the j
loop, then use lapply() if you like for the outer loop. That saves you the
line to initialize the list.

> Another problem that I am facing is that calculating a lot of lists
> (>50), that contain results of various econometric tests of all the
> variables, in the form of
>
> example.list[[i]] <- expression
>
> demands more than 50 lines at the beginning of the script that
> "initiate" the lists (e.g.
> example.list.1 <- list()
> example.list.2 <- list()
> ...
> example.list.50 <- list()
>
> << 2 >>. Is there a way to avoid that?

Yes, by putting them all in one list.

Andy
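As a rough illustration of "putting them all in one list": the sketch below keeps every set of results in a single named list, so none of the fifty list() lines is needed. The three functions in tests are hypothetical placeholders, not real econometric tests, and the names tests and results are assumptions made for the example.

```r
set.seed(42)
xs <- lapply(1:4, function(i) rnorm(100))

# Hypothetical stand-ins for the econometric tests;
# each one takes a single numeric vector.
tests <- list(
  mean_val = function(x) mean(x),
  sd_val   = function(x) sd(x),
  range_w  = function(x) diff(range(x))
)

# results[[test_name]][[i]] holds the result of that test on xs[[i]].
# lapply() builds every inner list itself, so nothing is initialised up front.
results <- lapply(tests, function(f) lapply(xs, f))

# Access by name and position instead of example.list.2[[3]]:
results$sd_val[[3]]
```

Adding a fifty-first test then means adding one entry to tests, with no change to the rest of the script.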
Thank you very very much for your responses...

How exactly do I vectorize?

> For this you want to vectorize the computation inside, eliminating the j
> loop, then use lapply() if you like for the outer loop. That saves you the
> line to initialize the list.
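One possible reading of the advice quoted above, offered as a sketch rather than the definitive answer: replace the inner j loop with arithmetic on whole vectors, here a difference of cumulative sums, and hand the outer loop to lapply(). The helper name rolling_sum and its argument k are assumptions made for this example; they do not come from the thread.

```r
# Sum of every window of k consecutive values, computed without a loop.
# With cs = c(0, cumsum(x)), the window ending at position j has the sum
# cs[j + 1] - cs[j - k + 1].
rolling_sum <- function(x, k = 5) {
  n <- length(x)
  if (n < k) return(numeric(0))   # no complete window available
  cs <- c(0, cumsum(x))
  cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]
}

rolling_sum(1:10)   # 15 20 25 30 35 40

# Same data layout as in the question.
xs <- lapply(1:500, function(i) rnorm(1000))

# Outer loop replaced by vapply()/lapply(); no lists are initialised by hand.
totalsum <- vapply(xs, sum, numeric(1))
sums     <- lapply(xs, rolling_sum, k = 5)
```

Differences of cumulative sums can accumulate a little floating-point rounding error on long vectors, so results should be compared with all.equal() rather than with ==.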