It is my impression that good R programmers make very little use of the for statement. Please consider the following R statement:

    for( i in 1:(len-1) ) s[i] = log(c1[i+1]/c1[i], base = exp(1) )

One problem I have found with this statement is that s must exist before the statement is run. Can it be written without using a for loop? Would that be better?

Thanks,
Bob
Bob:

Please, please spend some time with an R tutorial or two before you post here. This list can help, but I think we assume that you have already made an effort to learn basic R on your own. Your question is about as basic as it gets, so it appears to me that you have not done this. There are many, many R tutorials out there. Some suggestions, by no means comprehensive, can be found here:

https://www.rstudio.com/online-learning/#r-programming

Others will no doubt respond, but you can answer it yourself after only a few minutes with most R tutorials.

Cheers,
Bert

On Sat, Sep 22, 2018 at 2:16 PM rsherry8 <rsherry8 at comcast.net> wrote:
> It is my impression that good R programmers make very little use of the
> for statement. Please consider the following R statement:
>     for( i in 1:(len-1) ) s[i] = log(c1[i+1]/c1[i], base = exp(1) )
> One problem I have found with this statement is that s must exist before
> the statement is run. Can it be written without using a for
> loop? Would that be better?
>
> Thanks,
> Bob
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
    c1 <- 1:1000000
    len <- 1000000
    system.time( s1 <- log(c1[-1]/c1[-len]) )
    s <- c1[-len]
    system.time( for (i in 1:(len-1)) s[i] <- log(c1[i+1]/c1[i]) )
    all.equal(s,s1)

    > c1 <- 1:1000000
    > len <- 1000000
    > system.time(
    + s1 <- log(c1[-1]/c1[-len])
    + )
       user  system elapsed
      0.032   0.005   0.037
    > s <- c1[-len]
    > system.time(
    + for (i in 1:(len-1)) s[i] <- log(c1[i+1]/c1[i])
    + )
       user  system elapsed
      0.226   0.002   0.232
    > all.equal(s,s1)
    [1] TRUE

Much faster, and much easier to understand, when vectorized.
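The agreement between the two forms can also be checked by hand on a tiny vector. This is only a sketch: the values of c1 are made up so that every adjacent ratio is 1.1, and the loop's result vector is created with numeric() rather than copied from c1. Timings like those above will differ from machine to machine.

```r
# Made-up values: each element is 10% larger than the one before it.
c1  <- c(100, 110, 121, 133.1)
len <- length(c1)

# Vectorized form: all adjacent ratios in one expression.
s1 <- log(c1[-1] / c1[-len])

# Loop form: the result vector has to exist before the loop runs.
s <- numeric(len - 1)
for (i in seq_len(len - 1)) s[i] <- log(c1[i + 1] / c1[i])

all.equal(s, s1)                   # TRUE
all.equal(s1, rep(log(1.1), 3))    # TRUE: every ratio is 1.1
```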
Another version, just for fun:

    s <- parallel::pvec(1:len, function(i) log(c1[i + 1] / c1[i]))
Or this one:

    (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
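One detail worth noticing in the two element-wise variants as posted: the index runs over 1:len, so the last step reads c1[len + 1], which is past the end of the vector and gives NA. The sketch below illustrates this with made-up data. It uses base R's vapply as a stand-in for the element-wise call (it is not what was posted), and ratio_log is a hypothetical helper name.

```r
c1  <- c(2, 4, 8, 16)   # made-up data: every adjacent ratio is 2
len <- length(c1)

# Hypothetical helper: log ratio of element i + 1 to element i.
ratio_log <- function(i) log(c1[i + 1] / c1[i])

# Indexing over 1:len reads one element past the end, so the last value is NA.
with_na <- vapply(1:len, ratio_log, numeric(1))

# Stopping at len - 1 gives exactly one value per adjacent pair.
s <- vapply(seq_len(len - 1), ratio_log, numeric(1))

is.na(with_na[len])   # TRUE
length(s)             # 3
```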
I do use for loops a few times per month, but only wrapped around large chunks of vectorized calculations, not for this kind of use case. In those cases I also pre-allocate output vectors/lists (e.g. vector( "list", len )) to avoid memory thrashing as you grow lists or other vectors one element at a time (v <- c( v, new_value ) is an inefficient trick). I also create variables to hold intermediate results that would yield the same answer each time before going into the loop (e.g. exp(1)).

As regards your toy example, I would use a one-liner:

    s <- diff( log( c1 ) )

which avoids executing exp(1) at all, much less every time through the loop, and it uses vectorized incremental subtraction rather than division (laws of logarithms from algebra). The default base for the log function is e, so it is unnecessary to specify it. Note that your loop calculates logs involving all but the first and last elements of c1 twice: once when indexing for i+1, and again in the next iteration of the loop, when the same element is accessed as index i.

You would be surprised how many iterative algorithms can be accomplished with cumsum and diff. Bill Dunlap has demonstrated examples quite a few times in the mailing list archives if you have time to search.
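The one-liner rests on the identity log(a/b) = log(a) - log(b), and cumsum is the counterpart of diff. A small sketch with made-up positive data shows both: the ratio form and the diff form agree, and the original vector can be rebuilt from its first element and the log ratios.

```r
c1 <- c(3, 6, 12, 18)   # made-up positive data

# Ratio form and difference-of-logs form give the same result.
s_ratio <- log(c1[-1] / c1[-length(c1)])
s_diff  <- diff(log(c1))
all.equal(s_ratio, s_diff)    # TRUE

# cumsum undoes diff: accumulate the log ratios from a starting value of 0,
# exponentiate, and scale by the first element to recover c1.
rebuilt <- c1[1] * exp(c(0, cumsum(s_diff)))
all.equal(rebuilt, c1)        # TRUE
```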
On Sat, Sep 22, 2018 at 9:06 PM Wensui Liu <liuwensui at gmail.com> wrote:
> or this one:
>
> (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))

Oh dear god no.
On Sun, Sep 23, 2018 at 10:09 AM Wensui Liu <liuwensui at gmail.com> wrote:
> Why?

The operations required for this algorithm are vectorized, as are most operations in R. There is no need to iterate through each element. Using Vectorize to achieve the iteration is no better than using *apply or a for-loop, and betrays the same basic lack of insight into basic principles of programming in R.

And/or, if you want a more practical reason:

    > c1 <- 1:1000000
    > len <- 1000000
    > system.time( s1 <- log(c1[-1]/c1[-len]))
       user  system elapsed
      0.031   0.004   0.035
    > system.time(s2 <- Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
       user  system elapsed
      1.258   0.022   1.282

Best,
Ista
Actually, with the parallel pvec, the user time is a lot shorter. Or did I somewhere miss your invaluable insight?

    > c1 <- 1:1000000
    > len <- length(c1)
    > rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
                      test replications elapsed relative user.self sys.self
    1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
      user.child sys.child
    1          0         0
    > rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1] / c1[i])), replications = 100)
                                                                    test
    1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
      replications elapsed relative user.self sys.self user.child sys.child
    1          100   9.079        1     2.571    4.138      9.736     8.046

On Sun, Sep 23, 2018 at 12:33 PM Ista Zahn <istazahn at gmail.com> wrote:
> The operations required for this algorithm are vectorized, as are most
> operations in R. There is no need to iterate through each element.
In my opinion this is a pretty reasonable question for someone new to R.

Yes, it can be written without a for loop, and it would be better. Rich Heiberger gave a good solution early on, but I'd like to add an outline of the reasoning that leads to the solution.

You are taking the log of a ratio, and in the ratio, the numerator uses elements 2 through len, and the denominator uses elements 1 through (len-1). So, just write it that way:

    c1[2:len]/c1[1:(len-1)]

or, taking advantage of using negative numbers when indexing vectors,

    c1[-1]/c1[-len]

then take the log

    s <- log( c1[-1]/c1[-len] )

Comparing this with the loop version makes an example of why people say the R language is vectorized.

Do good R programmers make very little use of the for statement? Since R is vectorized, the for statement is necessary less often than in non-vectorized languages. But "very little use" would be too broad a generalization. It will depend on what problems are being solved.

Finally, if using the loop in this case, it's true that s must exist before the statement is run. But that's not much of a problem. Just put

    s <- numeric( len-1)

before the loop.

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
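The two indexing styles described above select the same elements, which a short sketch with made-up data can confirm. It also shows one edge case that is not raised in the thread: for a vector of length 1, 1:(len - 1) counts down from 1 to 0 instead of being empty, whereas seq_len() and negative indexing both give empty results.

```r
c1  <- c(5, 10, 20)   # made-up data
len <- length(c1)

identical(c1[-1],   c1[2:len])         # TRUE: both drop the first element
identical(c1[-len], c1[1:(len - 1)])   # TRUE: both drop the last element

# Edge case with a single element (n is 1 here).
short <- 7
n     <- length(short)
1:(n - 1)         # 1 0, not an empty sequence
seq_len(n - 1)    # integer(0)
short[-1]         # numeric(0)
```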
One issue I haven't seen mentioned (and I apologize if I've missed it) is that of making programs readable for long-term use. In the histoRicalg project, which tries to document and test some of the codes from long ago that are the underpinnings of some important R computations, things like the negative index approach require what we might term "local knowledge", i.e., knowledge local to R. In such cases the old-fashioned for loop is easier for humans to understand. The i:j form is a bit easier to follow. A compromise is to comment, and if your code is EVER to be used later, especially by non-R users, it is a good idea to do so, e.g.,

    c1[-1] # does loop from 2 to end of vector

John Nash

histoRicalg links, to which all welcome:
https://gitlab.com/nashjc/histoRicalg
https://gitlab.com/nashjc/histoRicalg/wikis/home
https://lists.r-consortium.org/g/rconsortium-project-histoRicalg

On 2018-09-24 12:13 PM, MacQueen, Don via R-help wrote:
> In my opinion this is a pretty reasonable question for someone new to R.
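One way the commenting compromise might look in practice is sketched below. The function name log_ratios and the input check are illustrative assumptions, not code from the histoRicalg project; the body uses the explicit i:j form with a comment on each step so that a reader who does not know R's indexing rules can still follow it.

```r
# Hypothetical helper: natural log of the ratio of each element to the one before it.
log_ratios <- function(x) {
  n <- length(x)
  stopifnot(is.numeric(x), n >= 2)   # need at least one adjacent pair

  later   <- x[2:n]         # elements 2, 3, ..., n      (same as x[-1])
  earlier <- x[1:(n - 1)]   # elements 1, 2, ..., n - 1  (same as x[-n])

  log(later / earlier)      # log() uses base e by default
}

log_ratios(c(1, 2, 4))      # two values, each equal to log(2)
```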