I am submitting this problem to the R forum rather than the Bioconductor forum because it is closer to a question of programming style than to any bioinformatics content.

I have implemented an R script that extracts many strings by querying three bioinformatics databases within the same loop iteration. Ideally, the script should run as many iterations as necessary to extract all the available data of interest. Inevitably, it triggers a BioMart exception after running many iterations in a row. The exception seems to be independent of the script's instructions: if I restart the script from the point where it was interrupted, it runs for another while and extracts the data at which the exception occurred with no problem at all.

Sometimes, though, the script stops responding and hangs even though no exception has apparently occurred, and the only way to regain control is to kill the R process. I then lose track of how many items have been processed and stored to disk files (unless I count them manually, and there are thousands). If I restart the script, it starts processing the data strings from scratch. I guess it may be a memory problem, as the Task Manager (Windows XP) shows that the hung R process is using more than 70% of the available RAM.

I wonder whether there is any system command that would make the script aware of its own memory requirements and running time. Ideally, the script should be able to trap the exception, be sensitive to its current RAM and CPU time requirements, and exit by itself when it freezes after saving the current program state, so that when rerun it picks up where it left off rather than restarting from scratch. Maybe this is asking too much of a non-compiled language?

Thank you in advance,
Maura
You can use 'try' to catch errors and take corrective action. 'memory.size' and 'proc.time' will give you information on the memory usage of your application and the CPU time that has been used.

On Sun, Aug 2, 2009 at 2:02 PM, <mauede at alice.it> wrote:
> I wonder whether there is any system command to make the script self-aware of its memory requirements and running time.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
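For what it is worth, here is a minimal sketch of the 'try' and 'proc.time' suggestion above. It is only an illustration under stated assumptions: make_flaky_query and with_retry are hypothetical names invented here, and the flaky function merely stands in for a real BioMart query. Since memory.size is Windows-only, the sketch reports memory through gc() instead. Note also that try only traps R errors; it does nothing for a call that blocks forever.

```r
# Hypothetical stand-in for one database query: every third call fails,
# mimicking a transient server-side exception.
make_flaky_query <- function() {
  n_calls <- 0
  function(id) {
    n_calls <<- n_calls + 1
    if (n_calls %% 3 == 0) stop("transient BioMart error")
    paste0("sequence_for_", id)
  }
}

# Call fun(...) up to max_tries times; return NULL if every attempt fails.
with_retry <- function(fun, ..., max_tries = 5, wait = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- try(fun(...), silent = TRUE)
    if (!inherits(result, "try-error")) return(result)
    Sys.sleep(wait)  # pause before the next attempt
  }
  NULL
}

flaky_query <- make_flaky_query()
ids <- c("id1", "id2", "id3")

start <- proc.time()
results <- lapply(ids, function(id) with_retry(flaky_query, id))
elapsed <- (proc.time() - start)[["elapsed"]]

# Second column of gc() is the memory currently in use, in Mb
# (one row for Ncells, one for Vcells).
mem_mb <- sum(gc()[, 2])
cat("elapsed:", elapsed, "s; memory in use:", mem_mb, "Mb\n")
```

In a real loop, a non-zero wait between attempts is probably kinder to the remote server, and a NULL result can be logged so that the failed item is retried in a later run.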
Hello,

One thing you can do is save your strings to an external text file (using cat, for instance). That way, you would not need much memory while extracting your data. Once you have extracted it, you can always look at the external file to see whether it is too big, decide what to do with it, and so on. You can even consider saving your data to a database if need be.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Sun, 2009-08-02 at 20:02 +0200, mauede at alice.it wrote:
> I guess it may be a memory problem as the task manager (Windows/XP) shows that the hung-up R script is taking more than 70% of the available RAM.
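A rough sketch of the idea above, writing each result to a text file with cat as soon as it is available, so that the file itself records how far the script got. Everything named here is an assumption made for illustration: fetch_string is a hypothetical stand-in for the real database query, process_ids is an invented helper, and the tab-separated layout of the output file is just one possible choice.

```r
# Hypothetical stand-in for the real query against the databases.
fetch_string <- function(id) paste0("string_for_", id)

# Process only the ids that are not yet in out_file, appending one
# "id<TAB>value" line per item as soon as it has been extracted.
process_ids <- function(ids, out_file) {
  done <- character(0)
  if (file.exists(out_file)) {
    # The first tab-separated field of each line is an id already stored.
    done <- sub("\t.*$", "", readLines(out_file))
  }
  todo <- setdiff(ids, done)
  for (id in todo) {
    value <- try(fetch_string(id), silent = TRUE)
    if (inherits(value, "try-error")) next  # skip; a later run retries it
    cat(id, "\t", value, "\n", file = out_file, sep = "", append = TRUE)
  }
  invisible(todo)
}

out_file <- file.path(tempdir(), "results.tsv")  # hypothetical output path
process_ids(c("id1", "id2"), out_file)           # first run
process_ids(c("id1", "id2", "id3"), out_file)    # rerun: only id3 is fetched
```

Because each result is on disk before the next query starts, killing a hung R process loses at most the item in progress, and nothing has to accumulate in memory. If the process is killed in the middle of a write, the last line of the file may be incomplete and would need checking by hand.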