what you measure is the "elapsed" time in the default setting. you
might need to take a closer look at the beautiful benchmark() function
and see what time I am talking about.

I just provided a tentative solution for the person asking for it and
believe he has enough wisdom to decide what's best. why bother to
judge others subjectively?

On Sun, Sep 23, 2018 at 1:18 PM Ista Zahn <istazahn at gmail.com> wrote:
>
> On Sun, Sep 23, 2018 at 1:46 PM Wensui Liu <liuwensui at gmail.com> wrote:
> >
> > actually, by the parallel pvec, the user time is a lot shorter. or did
> > I somewhere miss your invaluable insight?
> >
> > > c1 <- 1:1000000
> > > len <- length(c1)
> > > rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
> >                   test replications elapsed relative user.self sys.self
> > 1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
> >   user.child sys.child
> > 1          0         0
> > > rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1] / c1[i])), replications = 100)
> >                                                                 test
> > 1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
> >   replications elapsed relative user.self sys.self user.child sys.child
> > 1          100   9.079        1     2.571    4.138      9.736     8.046
>
> Your output is mangled in my email, but on my system your pvec
> approach takes more than twice as long:
>
> c1 <- 1:1000000
> len <- length(c1)
> library(parallel)
> library(rbenchmark)
>
> regular <- function() log(c1[-1]/c1[-len])
> iterate.parallel <- function() {
>   pvec(1:(len - 1), mc.cores = 4,
>        function(i) log(c1[i + 1] / c1[i]))
> }
>
> benchmark(regular(), iterate.parallel(),
>           replications = 100,
>           columns = c("test", "elapsed", "relative"))
> ##                 test elapsed relative
> ## 2 iterate.parallel()   7.517    2.482
> ## 1          regular()   3.028    1.000
>
> Honestly, just use log(c1[-1]/c1[-len]). The code is simple and easy
> to understand and it runs pretty fast. There is usually no reason to
> make it more complicated.
>
> --Ista
>
> > On Sun, Sep 23, 2018 at 12:33 PM Ista Zahn <istazahn at gmail.com> wrote:
> > >
> > > On Sun, Sep 23, 2018 at 10:09 AM Wensui Liu <liuwensui at gmail.com> wrote:
> > > >
> > > > Why?
> > >
> > > The operations required for this algorithm are vectorized, as are most
> > > operations in R. There is no need to iterate through each element.
> > > Using Vectorize to achieve the iteration is no better than using
> > > *apply or a for-loop, and betrays the same basic lack of insight into
> > > basic principles of programming in R.
> > >
> > > And/or, if you want a more practical reason:
> > >
> > > > c1 <- 1:1000000
> > > > len <- 1000000
> > > > system.time( s1 <- log(c1[-1]/c1[-len]))
> > >    user  system elapsed
> > >   0.031   0.004   0.035
> > > > system.time(s2 <- Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
> > >    user  system elapsed
> > >   1.258   0.022   1.282
> > >
> > > Best,
> > > Ista
> > >
> > > >
> > > > On Sun, Sep 23, 2018 at 7:54 AM Ista Zahn <istazahn at gmail.com> wrote:
> > > >>
> > > >> On Sat, Sep 22, 2018 at 9:06 PM Wensui Liu <liuwensui at gmail.com> wrote:
> > > >> >
> > > >> > or this one:
> > > >> >
> > > >> > (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
> > > >>
> > > >> Oh dear god no.
> > > >>
> > > >> >
> > > >> > On Sat, Sep 22, 2018 at 4:16 PM rsherry8 <rsherry8 at comcast.net> wrote:
> > > >> > >
> > > >> > >
> > > >> > > It is my impression that good R programmers make very little use of the
> > > >> > > for statement. Please consider the following
> > > >> > > R statement:
> > > >> > >     for( i in 1:(len-1) ) s[i] = log(c1[i+1]/c1[i], base = exp(1) )
> > > >> > > One problem I have found with this statement is that s must exist before
> > > >> > > the statement is run. Can it be written without using a for
> > > >> > > loop? Would that be better?
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Bob
> > > >> > >
> > > >> > > ______________________________________________
> > > >> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > >> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > >> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > >> > > and provide commented, minimal, self-contained, reproducible code.
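
For readers following the disagreement about which timing column to read, here is a minimal base R sketch of the difference between elapsed (wall-clock) time and CPU time. It is an illustration only and is not taken from any of the posts; the variable names elapsed_seconds and cpu_seconds are made up for this example. Sys.sleep() waits without computing, so elapsed time passes while very little CPU time is charged to the R process. The user.child and sys.child fields report CPU time accumulated in child processes, which is why, in the pvec output quoted above, the child CPU times (9.736 + 8.046) add up to more than the elapsed 9.079 seconds.

```r
# Wall-clock time versus CPU time, using only base R.
timing <- system.time(Sys.sleep(0.2))

# A proc_time object has five named fields:
# user.self, sys.self, elapsed, user.child, sys.child
print(names(timing))

elapsed_seconds <- timing[["elapsed"]]
cpu_seconds <- timing[["user.self"]] + timing[["sys.self"]]

# Sleeping costs the person waiting real time but almost no CPU time.
cat("elapsed:", elapsed_seconds, " cpu:", cpu_seconds, "\n")
```

Which number matters depends on the question being asked: how long the person at the keyboard waits (elapsed), or how much processor work was spent in total (user plus system, across all processes).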
At the risk of asking something fundamental . . . .

does log(c1[-1]/c1[-len]

do the following

(1) use all elements of c and perform the calculation
(2) delete the first element of c and perform the calculation,
(2) delete the first two elements of c and perform the calculation,
. . .
(n) use only the last element of c and perform the calculation.

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

________________________________
From: R-help <r-help-bounces at r-project.org> on behalf of Wensui Liu <liuwensui at gmail.com>
Sent: Sunday, September 23, 2018 2:26 PM
To: Ista Zahn
Cc: r-help at r-project.org
Subject: Re: [R] For Loop
On 23/09/2018 2:36 PM, Sorkin, John wrote:
> At the risk of asking something fundamental . . . .
>
> does log(c1[-1]/c1[-len]
>
> do the following
>
> (1) use all elements of c and perform the calculation
> (2) delete the first element of c and perform the calculation,
> (2) delete the first two elements of c and perform the calculation,
> . . .
> (n) use only the last element of c and perform the calculation.

c1[-1] creates a new vector which is a copy of c1 leaving out element 1,
and c1[-len] creates a new vector which copies everything except element
len. So your (1) is closest to the truth. It is very similar to (but
probably a little faster than)

log(c1[2:len]/c1[1:(len-1)])

There are differences in borderline cases (like length(c1) != len, or
len < 2) that are not relevant in the original context.

Duncan Murdoch
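
The equivalence described above, and one of the borderline cases, can be checked with a small sketch. The data values and the names by_negative, by_range, by_diff, short and n are invented for this illustration; diff(log(c1)) is added as a further base R way to write the same calculation and was not proposed in the thread.

```r
c1 <- c(10, 12, 15, 18)
len <- length(c1)

# Negative indices drop elements; positive ranges select them.
by_negative <- log(c1[-1] / c1[-len])
by_range    <- log(c1[2:len] / c1[1:(len - 1)])

# log(a / b) equals log(a) - log(b), so differencing the logs gives
# the same values up to floating-point rounding.
by_diff <- diff(log(c1))

print(all.equal(by_negative, by_range))
print(all.equal(by_negative, by_diff))

# Borderline case: a vector of length 1.
short <- 5
n <- length(short)
# Negative indexing leaves nothing, so the result is empty.
print(log(short[-1] / short[-n]))
# 2:n counts down (2, 1) and 1:(n - 1) is c(1, 0), so the range
# version indexes past the end and returns two values, one of them NA.
print(log(short[2:n] / short[1:(n - 1)]))
```

The length-1 case shows why the two forms are only "very similar": the negative-index version degrades to an empty result, while the range version silently produces NA.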
Below...

On Sun, 23 Sep 2018, Sorkin, John wrote:

> At the risk of asking something fundamental . . . .
>
> does log(c1[-1]/c1[-len]

You dropped the closing parenthesis.

log( c1[-1] / c1[-len] )

> do the following
>
> (1) use all elements of c and perform the calculation

No. a) "c" is the base "concatenate" function, and b) it is using two
different subsets of the elements in c1.

> (2) delete the first element of c and perform the calculation,

It does not change c1. c1[-1] is an expression that creates an entirely
new (but unnamed) vector that contains everything but the first element
of c1.

> (2) delete the first two elements of c and perform the calculation,

You are wandering into the weeds here...

> . . .
>
> (n) use only the last element of c and perform the calculation.

No, c1[-len] creates a temporary vector that contains all elements except
the one(s) at the position(s) stored in the variable "len". Note that the
more conventional syntax here is c1[ -length(c1) ].

c1 <- 1:3
c1[ -1 ]
#> [1] 2 3
c1[ -length(c1) ]
#> [1] 1 2
c1[ -1 ] / c1[ -length( c1 ) ] # c(2,3)/c(1,2)
#> [1] 2.0 1.5
log( c1[ -1 ] / c1[ -length( c1 ) ] ) # log( c(2, 1.5) )
#> [1] 0.6931472 0.4054651

#' Created on 2018-09-23 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
On Sun, Sep 23, 2018 at 2:26 PM Wensui Liu <liuwensui at gmail.com> wrote:
>
> what you measure is the "elapsed" time in the default setting. you
> might need to take a closer look at the beautiful benchmark() function
> and see what time I am talking about.

I'm pretty sure you do not know what you are talking about.

> I just provided a tentative solution for the person asking for it and
> believe he has enough wisdom to decide what's best. why bother to
> judge others subjectively?

You are giving bad and confused advice. Please stop doing that.

--Ista
On Sun, 23 Sep 2018, Wensui Liu wrote:> what you measures is the "elapsed" time in the default setting. you > might need to take a closer look at the beautiful benchmark() function > and see what time I am talking about.When I am waiting for the answer, elapsed time is what matters to me. Also, since each person usually has different hardware, running benchmark with multiple expressions as Ista did lets you pay attention to relative comparisons. Keep in mind that parallel processing requires extra time just to distribute the calculations to the workers, so it doesn't pay to distribute tiny tasks like calculating the division of two numeric vector elements. That is the essence of vectorizing... bundle your simple calculations together so the processor can focus on getting answers rather than managing processes or even interpreting R for loops.> I just provided tentative solution for the person asking for it and > believe he has enough wisdom to decide what's best. why bother to > judge others subjectively?I would say that Ista has backed up his objections with measurable performance metrics, so while his initial reaction was pretty subjective I think your reaction at this point is really off the mark. One confusing aspect of your response is that Ista reacted to your use of the Vectorize function, but you responded as though he reacted to your use of the pvec function. I mentioned drawbacks of using pvec above, but it really is important to stress that the Vectorize function is a usability facade and is in no way a performance enhancement to be associated with what we refer to as vectorized (lowercase) code. The Vectorize function creates a function that calls lapply, which in turn calls the C function do_lapply, which calls your R function with scalar inputs as many times as desired, storing the results in a list, which Vectorize then gives to mapply which runs another for loop over to create a matrix or vector result. 
This is clearly less efficient than a simple for loop would have been, rather than more efficient as a truly vectorized solution such as log(c1[-1]/c1[-len]) will normally be. Vectorize is syntactic sugar with a performance penalty.

Please pay attention to the comments offered by others on this list... being told your solution is inferior doesn't feel good, but it is a very real opportunity for you to improve.

End comment.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
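The comparison discussed in the message above can be sketched as follows. This is only an illustration under stated assumptions: the vector is shortened to 100000 elements so it runs quickly, and the names s_vec, s_loop, s_Vec, and f are made up for this sketch rather than taken from the thread. It checks that all three approaches give the same numbers and prints their elapsed times, which will differ from machine to machine.

```r
# Illustrative data: smaller than the 1e6 elements used in the thread.
c1 <- 1:100000
len <- length(c1)

# Truly vectorized: one call each to `[`, `/`, and log() over whole vectors.
t_vec <- system.time(s_vec <- log(c1[-1] / c1[-len]))

# Explicit for loop: the result vector has to exist before the loop runs.
t_loop <- system.time({
  s_loop <- numeric(len - 1)
  for (i in seq_len(len - 1)) s_loop[i] <- log(c1[i + 1] / c1[i])
})

# Vectorize(): convenient interface, but one R-level call per element.
f <- Vectorize(function(i) log(c1[i + 1] / c1[i]))
t_Vec <- system.time(s_Vec <- f(seq_len(len - 1)))

# All three approaches compute the same values.
stopifnot(isTRUE(all.equal(s_vec, s_loop)),
          isTRUE(all.equal(s_vec, s_Vec)))

# Elapsed seconds; judge them relative to each other, not in absolute terms.
print(c(vectorized = t_vec[["elapsed"]],
        for_loop   = t_loop[["elapsed"]],
        Vectorize  = t_Vec[["elapsed"]]))
```

Looking at the three elapsed times side by side, rather than at any single number, is the same kind of relative comparison that benchmark() reports in its "relative" column.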
On 23/09/2018 3:31 PM, Jeff Newmiller wrote:

[lots of good stuff deleted]

> Vectorize is
> syntactic sugar with a performance penalty.

[More deletions.]

I would say Vectorize isn't just "syntactic sugar". When I use that term, I mean something that looks nice but is functionally equivalent. However, Vectorize() really does something useful: some functions (e.g. outer()) take other functions as arguments, but they assume the argument is a vectorized function. If it is not, they fail, or generate garbage results. Vectorize() is designed to modify the interface to a function so it acts as if it is vectorized.

The "performance penalty" part of your statement is true. It will generally save some computing cycles to write a new function using a for loop instead of using Vectorize(). But that may waste some programmer time.

Duncan Murdoch
(writing as one of the authors of Vectorize())

P.S. I'd give an example of syntactic sugar, but I don't want to bruise some other author's feelings :-).
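The outer() case mentioned above can be illustrated with a small sketch. The helper name larger and the 3 x 3 inputs are assumptions chosen for this example, not something from the original message. outer() calls its function once with two long vectors and expects a result of the same length, so a function that collapses its inputs to a single value fails until it is wrapped with Vectorize().

```r
# A function that is NOT vectorized: max() collapses its inputs to one value.
larger <- function(x, y) max(x, y)

# outer() passes two length-9 vectors in a single call and expects a
# length-9 result, so the scalar-returning function triggers an error.
res_plain <- tryCatch(outer(1:3, 1:3, larger),
                      error = function(e) "error")

# Wrapping with Vectorize() makes the same function act elementwise,
# so res_vec[i, j] is the larger of i and j.
res_vec <- outer(1:3, 1:3, Vectorize(larger))
print(res_vec)

# When a truly vectorized equivalent exists (here pmax), it gives the
# same answer without the per-element function calls.
res_pmax <- outer(1:3, 1:3, pmax)
stopifnot(all(res_vec == res_pmax))
```

Here Vectorize() is saving programmer time rather than computing time: it repairs the interface without requiring a rewrite, which is useful when no vectorized counterpart like pmax is available.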
>>>>> Wensui Liu
>>>>>     on Sun, 23 Sep 2018 13:26:32 -0500 writes:

> what you measures is the "elapsed" time in the default
> setting. you might need to take a closer look at the
> beautiful benchmark() function and see what time I am
> talking about.

> I just provided tentative solution for the person asking
> for it and believe he has enough wisdom to decide what's
> best. why bother to judge others subjectively?

Well, because Ista Zahn is a much, much, much better R programmer than you, sorry to be blunt!

Martin