Ravi.Vishnu@outokumpu.com
2005-Aug-23 10:03 UTC
[R] priority of operators in the FOR ( ) statement
Dear All,

I spent an entire evening debugging a small, fairly simple program in R, without success. It was my Guru in Bayesian Analysis, Thomas Fridtjof, who was able to diagnose the problem. He said that it took him a long time as well to locate it. This program illustrates in some ways the shortcomings of the error messages that R responds with. In this case, the message was quite misleading and directed attention to a location far removed from the actual problem statement. Without any more introductory comments, let me directly discuss the essential problem. I am enclosing the entire program after a brief discussion.

The problem arises from the following statement (nr is an integer constant):

for ( i in 1:nr-1) {.......}

The unexpected problem (at least for me) is that R reads the above statement as for (i in (1:nr)-1) {.....}. This makes i initially zero, which leads to an error because vector indexing in R starts from 1. The problem is easily fixed by writing the for loop as for (i in 1:(nr-1)) {.......}. This would be an easy problem to fix if R directly indicated what the problem is. Instead, it gives mystifying error messages which are totally misleading. For example, for the program given below, I got the following error message (it points to commands elsewhere in the program):

Error in if ((x >= 0) & (x < s2)) return(x/2) else if ((x >= s2) & (x < :
        missing value where TRUE/FALSE needed

I would like clarifications on the following points:

1. I am just curious to know if the priority of operators in the for statement (the colon before the minus operator, for example) is a deliberate design decision. I have tested Matlab and found that it interprets my original statement correctly without extra parentheses.

2. Faced with a similar problem in the future, what is a smart way of debugging in R to locate a problem? With this problem, I checked and double checked every single statement in the program, except the for statement, because I just did not expect any problem there. I have seen that there is a debug package but I have not used it. Can such tools be used to locate a problem with greater ease? Can somebody give a concrete example (for the following program, for example) of a debugging routine?

*************************************************************************
# Bayesian Data Analysis
## source("M:/programming/Rfolder/Assignments/fortest.txt")

# #Remove all objects from the workspace
rm(list=ls())
# #We will also try to note the time that the program takes
# #We will start the clock at starttime
starttime <- proc.time()[3];

my.function<-function(x) {
  s2<-sqrt(2);
  if ((x>=0) & (x<s2)) return(x/2)
  else
  if ((x>=s2) & (x<1+s2)) return(0.2)
  else
  if ((x>=1+s2) & (x<1.5+s2)) return(0.6)
  else
  if ((x>1.5+s2) | (x<0)) return(0)
}

alphayx<-function(y,x) {
  fy<-my.function(y)
  fx<-my.function(x)
  fyx<-fy/fx
  # to account for 0/0 division
  if (is.na(fyx)) fyx<-0
  #fyx<-ifelse(is.na(fyx),0,fyx);
  alpha<-min(1,fyx)
  return(alpha)
}

sigma<-0.5;
#nr is the number of iterations
nr<-20
x<-numeric(nr);
x[1]<-1;
t<-1:nr;

for (i in 1:nr-1) {
  xi<-x[i];
  yi<-rnorm(1,mean=xi,sd=sigma);
  ui<-runif(1,0,1);
  ualphai<-alphayx(yi,xi);
  xn<-ifelse(ui<=ualphai,yi,xi);
  x[i+1]<-xn;
}

plot(t,x,type="p")

endtime<-proc.time()[3];
elapsedTime<-endtime-starttime;
cat("Elapsed time is", elapsedTime, "seconds", "\n")
*****************************************************************************
Ravi.Vishnu at outokumpu.com wrote:
> Dear All,
> I spent an entire evening in debugging a small, fairly simple program in R
> - without success. It was my Guru in Bayesian Analysis, Thomas Fridtjof,
> who was able to diagnose the problem. He said that it took a long time
> for him also to locate the problem.
> This program illustrates in some ways the shortcomings of the error
> messages that R responds with.

To summarize: you assumed that 1:nr-1 was equivalent to 1:(nr-1), rather than (1:nr)-1 (as documented). This led to indexing by 0, which (as is documented) gives a zero length vector. R responded with the error message

> missing value where TRUE/FALSE needed

when you used this in a test. That seems like an appropriate error message to me. I don't know any system that would respond better to user errors in operator priority: those almost always lead to obscure errors, because the expression you write is often syntactically correct but logically wrong.

> 2. Faced with a similar problem in the future, what is a smart way of
> debugging in R to locate a problem.

Use traceback() to isolate the location of the error, then debug() to single step through the function until you get to the error location. At that point, examine the values of the expressions involved in the calculation, and make sure they are as expected.

And in general: if you aren't sure of the relative priority of two operators, use parentheses. 1:(nr-1) would work regardless of whether : or - had higher priority. Or, in extreme cases, read the documentation.

Duncan Murdoch
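The summary above can be tried out in a few lines. This is only a sketch: the variable names follow the original program, nr is shortened to 5, and the intermediate step through rnorm() is my own reading of how the zero-length vector turns into a missing value, not something stated in the thread.

```r
nr <- 5

# ':' binds tighter than binary minus, so 1:nr-1 is (1:nr)-1
idx_actual   <- 1:nr - 1      # 0 1 2 3 4
idx_intended <- 1:(nr - 1)    # 1 2 3 4

x <- numeric(nr)
x[1] <- 1

# Indexing by 0 selects nothing: a zero-length vector, not an error
xi <- x[0]
length(xi)                    # 0

# Assumed intermediate step: a zero-length mean makes rnorm() return NA
# (normally with a warning, suppressed here)
yi <- suppressWarnings(rnorm(1, mean = xi, sd = 0.5))
is.na(yi)                     # TRUE

# An NA condition is what if() then rejects with
# "missing value where TRUE/FALSE needed"
res <- try(if ((yi >= 0) & (yi < sqrt(2))) "inside" else "outside",
           silent = TRUE)
inherits(res, "try-error")    # TRUE
```

In an interactive session, calling traceback() right after the error in the full program should list the calls that led to it (here my.function() called from alphayx()), and debug(alphayx) before re-running the loop lets you step through that function and print y and x at the first iteration.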
Since there is nothing wrong with

   for(i in 1:nr - 1)

R can't really do much more than point to where your code fails due to your incorrect assumption about operator precedence.

You're certainly not the first to fall into this trap. But it's not that hard to diagnose. Anytime I have problems with a loop, I do three simple things:

1. for(i in whatever) print(i)
2. look at what traceback() says
3. step through the loop "by hand".

The first test would have told you (in much less than an "entire evening") what the problem was.

Peter Ehlers
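Peter Ehlers's first check, printing the loop variable, might look like the sketch below. The value nr <- 5 is chosen only to keep the output short; the range() calls are an extra shortcut I added, not part of his list.

```r
nr <- 5

# Check 1: print the loop variable before trusting the loop
for (i in 1:nr - 1) print(i)      # the first value printed is 0

# The same check without a loop: inspect the index vector directly
range(1:nr - 1)                   # 0 4
range(1:(nr - 1))                 # 1 4
```

Seeing 0 as the first index is enough to point at the for statement rather than at my.function().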
On 23-Aug-05 Duncan Murdoch wrote:
> [...]
> ... in extreme cases, read the documentation.

One for "fortunes"?

Ted.

E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Date: 23-Aug-05  Time: 12:05:42
The command that I think is most useful in this situation is 'browser()'. Even a couple decades of programming in the S language hasn't yet solved the problem of my fingers typing code that doesn't match what I want to happen. I quite consistently have a browser() call in functions that I write to make sure that what I am assuming is the same as what R assumes.

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
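One way to use browser() as an assumption check is sketched below. The helper get_state() is hypothetical (it is not in the original program), and guarding the call with interactive() is my own choice so that a script run in batch mode is not held up.

```r
# Hypothetical helper, not from the original program:
# fetch the current state x[i] for one step of the loop
get_state <- function(x, i) {
  # Pause for inspection only when the assumption about i is violated,
  # and only in an interactive session
  if (interactive() && (i < 1 || i > length(x))) browser()
  x[i]
}

x <- c(1, 0, 0)
get_state(x, 2)   # valid index: returns x[2] without stopping
```

Called interactively with i = 0, this would stop at the Browse prompt, where typing i or x prints their values, c continues, and Q quits.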
Gabor Grothendieck
2005-Aug-23 11:28 UTC
[R] priority of operators in the FOR ( ) statement
On 8/23/05, Ravi.Vishnu at outokumpu.com <Ravi.Vishnu at outokumpu.com> wrote:
> [...]
> I would like clarifications on the following points :
> 1. I am just curious to know if the priority of operators in the for
> statement ( the colon before the minus operator, for example) is a
> deliberate design decision. I have tested Matlab and found that it
> interprets my original statement correctly without an extra parenthesis.

?Syntax gives the operator precedence.

Also, note that : is probably best not used in functions since it does not handle boundary conditions properly. If n were 0 then 1:n results in two iterations corresponding to 1 and 0 but what you really wanted was likely no iterations at all. To do that you need seq(length = n) rather than ":".

Also I have found expressions like 0:10/10 handy to generate 0, .1, .2, ..., 1 and that works with the current precedence.
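The boundary case and the precedence of ":" over "/" can be checked with the short sketch below. The seq_len() line is an addition of mine; as far as I know it is the more common spelling in current R, but it is not mentioned in the thread.

```r
n <- 0

# 1:n counts down when n is 0, so a loop over it would run twice
length(1:n)                 # 2  (the values 1 and 0)

# seq(length = n) gives an empty sequence, so the loop body never runs
length(seq(length = n))     # 0

# Assumed alternative, not from the thread
length(seq_len(n))          # 0

# ':' is evaluated before '/', so these are (0:10)/10 and (0:1)/10
0:10 / 10                   # 0.0 0.1 0.2 ... 1.0
0:1 / 10                    # 0.0 0.1 only
```

The last two lines show why the upper limit matters: it is 0:10/10, not 0:1/10, that yields the eleven values from 0 to 1.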