Hello,

This one has been bugging me for a long time and I have never found a solution. I am using R version 2.15.1 but it has come up in older versions of R I have used over the past 2-3 years.

Q: Am I wrong to expect that R should handle hundreds of iterations of the base model or statistical functions, embedded within for loops, in one script run? I have found that when I write scripts that do this, sometimes they have a tendency to crash, seemingly unpredictably.

For example, one problem script of mine employs glm and gls about a hundred different times, and output files are being written at the end of each iteration. I have used my output files to determine that the crash cause is not consistent (R never fails at the same iteration). Note that the data are fixed here (no data generation or randomization steps, so that is not the issue). But it is clear that scripts with larger numbers of iterations are more likely to produce a crash.

And a year or two ago, I had a seemingly stable R script again with for looped model fits, but discovered this script was prone to crashing when I ran it on a newer PC. Because the new PC also seemed to be blazing through R code absurdly fast, I tried adding a short "fluff" procedure at the end of each iteration that required a few seconds of processing time. Low and behold, when I added that, the script stopped crashing (and each iteration of course took longer). I still don't understand why that fixed things.

What is going on? Solutions? Thanks.---steve

--
Steve Powers
powers_s at nd.edu
University of Notre Dame
Environmental Change Initiative
website (http://www.nd.edu/~spowers2/index.htm)
You are not wrong to expect R to not crash. However, R (as most people use it) is not monolithic, and you have provided neither reproducible code nor sessionInfo() with the relevant packages loaded to help anyone interested in investigating the problem. You are the most likely person to be able to generate sample code that reproduces your problem, even if imperfectly.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
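One way to collect the information asked for above is to record sessionInfo() near the top of the script, after the relevant packages are loaded. The following is only a sketch: loading nlme is an assumption based on the original post mentioning gls(), and the file name and location are made up for illustration.

```r
# Load the packages the script relies on BEFORE calling sessionInfo(),
# so that their versions appear in the report. nlme is assumed here only
# because the original post mentions gls(); adjust to your own script.
if (requireNamespace("nlme", quietly = TRUE)) {
  library(nlme)
}

info <- sessionInfo()

# Hypothetical output location; a temporary directory keeps the sketch tidy.
info_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(capture.output(print(info)), info_file)

# Also show the report in the console, ready to paste into a post.
print(info)
```

Saving the report next to the per-iteration output files means that each crashed run comes with a record of the R version, platform and package versions that produced it.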
Sometimes these intermittent "crashes" come from memory misuse, e.g., not allocating enough scratch space. You can sometimes make those coding errors cause more consistent problems by calling gctorture(TRUE) before running your code.

Here is an example in which it looks like package:gam's lo() is misusing memory when its degree argument is 2 - sometimes it can do 10 iterations, sometimes 3, sometimes 1:

  > library(gam)
  Loading required package: splines
  Loaded gam 1.06.2
  > v <- lapply(1:10,function(i){cat(i,""); gam(mpg ~ lo(hp,degree=2), data=mtcars)})
  1 2 3 4 5 6 7 8 9 10 >
  > v <- lapply(1:10,function(i){cat(i,""); gam(mpg ~ lo(hp,degree=2), data=mtcars)})
  1 2 3 Error in sys.call() : invalid 'which' argument
  > v <- lapply(1:10,function(i){cat(i,""); gam(mpg ~ lo(hp,degree=2), data=mtcars)})
  1 Error in sys.call() : invalid 'which' argument

If I call gctorture(TRUE) before calling lo(degree=2) then it hangs on the first call. One could attach a debugger at this point to get a clue about where it is failing.

  > library(gam)
  Loading required package: splines
  Loaded gam 1.06.2
  >
  > gctorture(TRUE)
  > v <- lapply(1:10,function(i){cat(i,""); gam(mpg ~ lo(hp,degree=2), data=mtcars)})
  1

Using valgrind is helpful also.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Steve Powers
> Sent: Thursday, December 27, 2012 7:01 PM
> To: r-help at r-project.org
> Subject: [R] R crashing inconsistently within for loops
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
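A small wrapper can make the gctorture(TRUE) suggestion easier to apply to a single suspect call without leaving the whole session in torture mode. This is a minimal sketch, not anything from the thread: with_gctorture is a hypothetical helper name, and the computation it wraps is a trivial stand-in.

```r
# Hypothetical helper (not part of R): evaluate an expression with GC torture
# switched on, and restore the previous setting even if the expression fails.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)               # returns the previous setting
  on.exit(gctorture(old), add = TRUE)  # always switch back on exit
  force(expr)                          # the lazy argument is evaluated here
}

# Tiny stand-in computation. Code under gctorture runs very slowly, so it
# pays to start with the smallest input that still shows the problem.
res <- with_gctorture(sum(sqrt(1:10)))
print(res)
```

In a real session the stand-in would be replaced by the suspect call, for example the gam(mpg ~ lo(hp,degree=2), data=mtcars) call from the transcript above, with the expectation that it takes far longer than usual.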
Steve Powers <powers_s <at> nd.edu> writes:

> Q: Am I wrong to expect that R should handle hundreds of iterations of the
> base model or statistical functions, embedded within for loops, in one
> script run?

All of the advice given so far is useful (I think), but I thought I would chime in and answer one of your questions, which is that this is indeed surprising, especially if you're using only base packages. I and many other people routinely run thousands of iterations of these types of analyses with no problem.
If (as Brian Ripley suggested) "crash" just means that some of your scripts stop with errors in some cases, then that's *not* surprising -- there are lots of ways to get glm() and gls() to give errors with slightly weird data sets. However, having R actually crash (i.e. the whole R session 'terminates abnormally', in the words of the posting guide) is quite a bit more unusual, and (if you are only using base R, not any contributed packages that may call compiled code in bad ways [or calling your own compiled code]) *always* constitutes a bug. Nondeterministic behavior in a deterministic function (i.e. no random number generation) is also unusual/surprising.

The key here is finding a reproducible example, which can be tough for these kinds of problems. But it sounds like you have one, so if you can trim it down to a manageable size (see http://tinyurl.com/reproducible-000 for tips on creating reproducible examples), and give full information about your system (as suggested by another poster), this would be of great interest, especially to the people who hang out on r-devel at r-project.org.

My prior probabilities are fairly strongly on the problem being something flaky about your system, but helping to identify these kinds of bugs in R is a community service ...
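To tell the two cases above apart in a looped script, one option is to catch ordinary R errors per iteration and to write a progress line before and after each fit. This is a rough sketch under stated assumptions: the model, the mtcars data, the iteration count and the log file are all stand-ins, not the original poster's script.

```r
# Hypothetical log file; cat(..., append = TRUE) opens and closes the file on
# every call, so each line is on disk before the next fit starts.
log_file <- tempfile(fileext = ".log")
n_iter <- 20
results <- vector("list", n_iter)

for (i in seq_len(n_iter)) {
  cat("start", i, "\n", file = log_file, append = TRUE)

  # An ordinary R error is captured and stored instead of stopping the loop.
  results[[i]] <- tryCatch(
    glm(mpg ~ hp + wt, data = mtcars),  # stand-in for the real model fit
    error = function(e) e
  )

  status <- if (inherits(results[[i]], "error")) {
    conditionMessage(results[[i]])
  } else {
    "ok"
  }
  cat("done", i, status, "\n", file = log_file, append = TRUE)
}

# Iterations that ended with an R error (as opposed to a session crash).
failed <- which(vapply(results, inherits, logical(1), what = "error"))
print(failed)
```

If the session really terminates abnormally, the log ends with a "start" line that has no matching "done" line, which identifies the iteration to trim down into a reproducible example. If instead the log shows error messages, the problem is an R-level error in the fit and not a crash.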
On Thu, 27 Dec 2012 22:01:22 -0500
Steve Powers <powers_s at nd.edu> wrote:

Two points:

1) You don't define "crash." Did the script simply hang, did R abruptly cease to run and exit to the OS, did the display freeze, did the OS and machine stop working? "Crashing" is not explanatory, nor is it descriptive of your problem.

2) The phrase is "lo and behold" - no "w" in the "lo." It's also redundant since "lo" means "behold."

Happy New Year.

JWDougherty.