Dear R-devel,

Apologies for bothering y'all with this seemingly perennial question. A user reported a problem with my most recent version of randomForest (4.4-1), and I was able to reproduce it with his data with R-2.0.0 patched (2004-10-24) on WinXP Pro. The problem is that it crashes R on Windows. However, it does not happen on Linux (tried SUSE ES8 on our Opterons and Quantian on my laptop). On Linux, the memory usage for the R process goes up to about 130MB and stays there. On Windows, the memory usage increases as more trees are grown (which already seems strange, as no more memory allocation is being done as the trees are grown), reaches about 130MB, then starts to decline, and eventually the Rgui (or Rterm) process crashes.

My biggest problem is that I have not been able to get gdb to work under WinXPPro, thus I've been relying on Linux for debugging. This time I'm really baffled, as the problem does not appear on Linux. Can anyone provide any hint/pointer? I (and my hair) will be very, very grateful!

Best,
Andy

Andy Liaw, PhD
Biometrics Research
PO Box 2000, RY33-300
Merck Research Labs
Rahway, NJ 07065
mailto:andy_liaw@merck.com
732-594-0820
On Mon, 25 Oct 2004 22:26:35 -0400, "Liaw, Andy" <andy_liaw@merck.com> wrote:

>My biggest problem is that I have not been able to get gdb to work under
>WinXPPro, thus I've been relying on Linux for debugging. This time I'm
>really baffled, as the problem does not appear on Linux. Can anyone provide
>any hint/pointer? I (and my hair) will be very, very grateful!

No idea what's going wrong in randomForest, but I've got some hints for debugging on

http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR

I've found that both the MinGW and Cygwin versions of gdb work, and I use them under Insight. It's rather painful compared to other Windows debuggers, but it does the job. I've never been able to get ddd going.

Duncan Murdoch
Thanks to Duncan and Brian for the pointers. I was able to run gdb under XP, but it didn't help much. What eventually helped was valgrind (on Linux on ia32)! It indicated memory leaks, and here's how the leaks occurred:

The main C function called from R has a loop over trees, and calls a function that grows regression trees. That function, in turn, calls a function that loops over variables and finds the best one to split on. I Calloc()'ed arrays at the beginning of those functions and Free()'d them at the end. What I forgot is that there were conditional returns in the middle of those functions, bypassing the Free().

Best,
Andy
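For readers following the thread, here is a minimal sketch of the leak pattern described in the follow-up: an array allocated at the top of a function, freed at the bottom, and a conditional return in between that skips the free. This is not the randomForest source. The function names (`find_best_split_leaky`, `find_best_split_fixed`, `blocks_leaked`) and the allocation counter are hypothetical, and plain `calloc()`/`free()` stand in for R's `Calloc()`/`Free()` so the example compiles without R headers.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of blocks currently allocated; a leak shows up as a nonzero
   value. Hypothetical bookkeeping, for illustration only. */
static int live_blocks = 0;

static void *tracked_calloc(size_t n, size_t size) {
    void *p = calloc(n, size);
    if (p != NULL) live_blocks++;
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) {
        assert(live_blocks > 0);
        live_blocks--;
        free(p);
    }
}

/* Leaky version: the early return skips the free at the end. */
static int find_best_split_leaky(const double *x, int n) {
    double *work = tracked_calloc((size_t) n, sizeof(double));
    if (work == NULL) return -1;

    if (n < 2) return -1;   /* BUG: work is never freed on this path */

    int best = 0;
    for (int i = 0; i < n; i++) {
        work[i] = x[i];
        if (work[i] > work[best]) best = i;
    }
    tracked_free(work);
    return best;
}

/* Fixed version: every path leaves through one cleanup point. */
static int find_best_split_fixed(const double *x, int n) {
    int best = -1;
    double *work = tracked_calloc((size_t) n, sizeof(double));
    if (work == NULL) return -1;

    if (n < 2) goto cleanup;   /* early exit still releases work */

    best = 0;
    for (int i = 0; i < n; i++) {
        work[i] = x[i];
        if (work[i] > work[best]) best = i;
    }

cleanup:
    tracked_free(work);
    return best;
}

/* Mimics the outer loop over trees: calls split_fn once per tree on a
   node of size n (at most 4) and reports how many blocks remain
   allocated afterwards. */
static int blocks_leaked(int (*split_fn)(const double *, int),
                         int ntree, int n) {
    const double x[4] = {0.3, 1.7, 0.9, 1.2};
    int before = live_blocks;
    for (int t = 0; t < ntree; t++) (void) split_fn(x, n);
    return live_blocks - before;
}
```

Calling `blocks_leaked(find_best_split_leaky, 10, 1)` from a small `main()` leaves one block behind per tree, because every call takes the early return, while the fixed version leaves none. A single exit point is only one way to close the hole; freeing before each early return works too. In real package code, memory obtained with `R_alloc()` is reclaimed by R when the `.C`/`.Call` invocation returns, which avoids this class of bug for scratch arrays.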