David L. Van Brunt, Ph.D.
2005-Nov-07 11:16 UTC
[R] R seems to "stall" after several hours on a long series of analyses... where to start?
Not sure where to even start on this.... I'm hoping there's some debugging I can do...

I have a loop that cycles through several different data sets (same structure, different info), performing randomForest growth and predictions, and saving out the predictions for later study.

I get about 5 hours in (9% of the planned iterations... yikes!) and R just freezes.

This happens in both interactive and batch execution (I can see from the ".Rout" file that it gets about 9% through in batch mode, and about 6% in interactive mode). Does that suggest memory problems?

I'm thinking of re-executing this same code on a different platform to see if that's the issue (currently using OS X). Any other suggestions on where to look, or what to try, to get more information?

Sorry to be so vague... it's a LOT of code, and it runs without error for many iterations, so I didn't think the problem was syntax.

--
---------------------------------------
David L. Van Brunt, Ph.D.
mailto:dlvanbrunt@gmail.com
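Since the predictions are being saved out anyway, one pattern that limits the damage from a freeze like this is to checkpoint each iteration to its own file, so a restarted run skips work that is already done. A minimal sketch, with n.datasets, get.dataset() and fit.and.predict() as hypothetical stand-ins for the real setup and randomForest code:

    ## Checkpointing sketch (hypothetical names throughout): each iteration
    ## writes its predictions to its own file, so a crashed or frozen run
    ## can be restarted without redoing finished work.
    for (i in 1:n.datasets) {
        out.file <- file.path("preds", paste("pred_", i, ".RData", sep = ""))
        if (file.exists(out.file)) next   # finished in an earlier run; skip
        d <- get.dataset(i)               # stand-in for the real setup code
        preds <- fit.and.predict(d)       # stand-in for randomForest + predict
        save(preds, file = out.file)
        rm(d, preds); gc()                # release memory before the next pass
    }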
Duncan Murdoch
2005-Nov-07 12:00 UTC
[R] R seems to "stall" after several hours on a long series of analyses... where to start?
David L. Van Brunt, Ph.D. wrote:
> Not sure where to even start on this.... I'm hoping there's some
> debugging I can do...
> [rest of original post snipped]

You could try running an external debugger to see whether R appears to be stuck in a loop. I don't know what the OS X debuggers are like, but on Windows you can see routine names even without debugging information. Recompiling R with debugging info will make the results a lot easier to interpret.

Duncan Murdoch
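On OS X, one way to act on this suggestion is to note R's process id at the start of the run and then, once R freezes, attach a sampler or debugger from another terminal. A minimal sketch; sample and gdb are standard OS X command-line tools, though availability depends on the developer tools installed:

    Sys.getpid()   # print R's process id at the start of the run

Then, from a shell, while R appears frozen (assuming the pid printed above was 12345):

    sample 12345 10   # ten-second sample of where the process spends its time
    gdb -p 12345      # or attach gdb and type 'bt' for a C-level backtrace

If the same routine names dominate sample after sample, R is likely spinning in a loop rather than waiting on memory.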
David L. Van Brunt, Ph.D.
2005-Nov-07 15:09 UTC
[R] R seems to "stall" after several hours on a long series of analyses... where to start?
Great suggestions, all.

I do have a timer in there, and it looks like the time to complete a loop is not increasing as it goes. From your comments, I take it that suggests there is not a memory leak. I could try scripting the loop from the shell, rather than R, to see if that works, but will do that as a last resort, as it will require a good deal of re-writing (the loop follows some "setup" code that builds a pretty large data set... the loop then slaps several new columns on a copy of that data set, and analyses that...)

I'll still try the other platform as well, to see if the same problem occurs there.

On 11/7/05, jim holtman <jholtman@gmail.com> wrote:
> Here is some code that I use to track the progress of my scripts. It
> will print out the total CPU time and the memory that is being used.
> You call it with my.stats("message") to print out "message" on the
> console.
>
> Also, have you profiled your code to see where the time is being spent?
> Can you break it up into multiple runs so that you can start with a
> "fresh" version of memory?
>
> ====== script ======
> "my.stats" <- local({
>     # local variables holding the elapsed and CPU times
>     # as of the last reset
>     lastTime <- lastCPU <- 0
>     function(text = "stats", reset = FALSE)
>     {
>         procTime <- proc.time()[1:3]   # get current metrics
>         if (reset) {                   # mark timing from this point
>             lastTime <<- procTime[3]
>             lastCPU  <<- procTime[1] + procTime[2]
>         } else {
>             cat(text, "-", sys.call(sys.parent())[[1]], ": <",
>                 round((procTime[1] + procTime[2]) - lastCPU, 1),
>                 round(procTime[3] - lastTime, 1), ">", procTime,
>                 " : ", round(memory.size()/2^20, 1), "MB\n")
>             invisible(flush.console()) # force a write to the console
>         }
>     }
> })
>
> ====== here is some sample output ======
> > my.stats(reset=TRUE)   # reset counters
> > x <- runif(1e6)        # generate 1M random numbers
> > my.stats('random')
> random - my.stats : < 0.3 31.8 > 96.17 11.7 230474.9 : 69.5 MB
> > y <- x*x + sqrt(x)     # just some calculation
> > my.stats('calc')
> calc - my.stats : < 0.7 71.2 > 96.52 11.74 230514.3 : 92.4 MB
>
> You can see that memory is growing. The first number is the CPU time
> and the second (in <>) is the elapsed time.
>
> HTH
>
> On 11/7/05, David L. Van Brunt, Ph.D. <dlvanbrunt@gmail.com> wrote:
> > [original post snipped]
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 247 0281
>
> What is the problem you are trying to solve?

--
---------------------------------------
David L. Van Brunt, Ph.D.
mailto:dlvanbrunt@gmail.com
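One caveat about the function above for this particular case: memory.size() exists only on Windows, so it will not run on the OS X machine where the freeze occurs. A rough cross-platform variant (a sketch, not jim's code) can report what gc() knows about R's own allocations instead:

    ## Cross-platform progress line: gc() works on every platform,
    ## unlike the Windows-only memory.size(). Calling gc() also triggers
    ## a garbage collection, which is harmless here.
    my.stats2 <- function(text = "stats") {
        pt <- proc.time()[1:3]    # user CPU, system CPU, elapsed seconds
        mb <- sum(gc()[, 2])      # column 2 of gc()'s result is "(Mb)" used
        cat(text, ": CPU", round(pt[1] + pt[2], 1), "s, elapsed",
            round(pt[3], 1), "s,", round(mb, 1), "MB used\n")
        flush.console()
    }

Calling my.stats2("iteration 42") at the top of each loop pass would show whether R's own allocations creep upward even while per-loop times stay flat.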
David L. Van Brunt, Ph.D.
2005-Nov-07 20:49 UTC
[R] R seems to "stall" after several hours on a long series of analyses... where to start?
I'll try the memory stats function first and see what I get... I do have "top" on OS X, so I'll try watching that more closely as well. Great suggestions here.

On 11/7/05, Paul Gilbert <pgilbert@bank-banque-canada.ca> wrote:
> David L. Van Brunt, Ph.D. wrote:
> > Great suggestions, all.
> >
> > I do have a timer in there, and it looks like the time to complete a
> > loop is not increasing as it goes. [snip] ...the loop follows some
> > "setup" code that builds a pretty large data set... the loop then
> > slaps several new columns on a copy of that data set, and analyses
> > that...
>
> You may find it is better to make a new array for these columns. R
> tends to make copies when you do this sort of thing, with the result
> that you have multiple copies of your original data set. Also, define
> the array to be the final size, with NA values, rather than appending
> rows or columns.
>
> > I'll still try the other platform as well, see if the same problem
> > occurs there.
>
> I'm curious to hear what you find. I doubt you will find a big
> difference, but you will find a big difference on a machine with more
> physical memory.
>
> Paul
> [remainder of quoted thread snipped]

--
---------------------------------------
David L. Van Brunt, Ph.D.
mailto:dlvanbrunt@gmail.com
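To make Paul's advice concrete: growing a data set column by column copies the whole object on every pass, while filling a preallocated block of NA columns does not. A minimal sketch, with big.data, k and new.col() as hypothetical names:

    n <- nrow(big.data)                # the large "setup" data set

    ## Slow: each cbind() re-copies all of big.data
    ## for (j in 1:k) big.data <- cbind(big.data, new.col(j))

    ## Better: preallocate the k new columns at their final size, filled
    ## with NA, then assign into them; big.data is never copied in the loop.
    extra <- matrix(NA, nrow = n, ncol = k,
                    dimnames = list(NULL, paste("new", 1:k, sep = "")))
    for (j in 1:k) extra[, j] <- new.col(j)
    result <- cbind(big.data, extra)   # a single copy, made once at the end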
Sixten Borg
2005-Nov-09 08:27 UTC
[R] R seems to "stall" after several hours on a long series of analyses... where to start?
Hi,

I saw something similar when I had R look in a file every half minute to see whether there was a request to do something and, if so, do it and empty the file. (This was my way of testing whether I could make an interactive web page: somehow I managed to get the web page to write the requests to the file that R would look in, and R would update a graph that was visible on that same web page.)

Anyway, this ran smoothly for a while (40 minutes, I think), then it just stopped. When I examined the situation, R suddenly woke up and continued its task as if nothing had happened (which was quite correct). My amateur interpretation was that the system had put R to sleep, since R appeared to the system to be inactive. When I switched to R, it became interactive and was given CPU time again.

Maybe this gives some inspiration for solving the problem. The system was Windows NT, R version 1.8, I think.

Kind regards,
Sixten

>>> "David L. Van Brunt, Ph.D." <dlvanbrunt at gmail.com> 2005-11-07 16:09 >>>
Great suggestions, all.

I do have a timer in there, and it looks like the time to complete a loop is not increasing as it goes. [remainder of quoted message and earlier replies snipped]
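For what it's worth, the polling setup Sixten describes boils down to a loop like the following sketch (request.txt and handle.request() are hypothetical names):

    ## Poll a file every half minute for requests written by the web page.
    repeat {
        if (file.exists("request.txt") && file.info("request.txt")$size > 0) {
            req <- readLines("request.txt")
            handle.request(req)            # hypothetical: redraw the graph, etc.
            cat("", file = "request.txt")  # empty the file
        }
        Sys.sleep(30)                      # wait half a minute before checking again
    }

A long stretch of Sys.sleep() calls is exactly the kind of apparent inactivity that could tempt an operating system to deprioritize the process, which fits Sixten's interpretation.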
Apparently Analogous Threads
- R seems to "stall" after several hours on a long series of analyses... where to start?
- Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
- to print system.time always
- Creating new columns inside a loop