(Sorry for the cross-post--- I wasn't sure which list is more
appropriate...)
Hi everyone,
I've run into segfaults when using my randomForest package on a large dataset
(e.g., 100 x 15200) with a large number of trees (e.g., ntree=7000 and
mtry=3000). I'm wondering if anyone can give me some hints on where to look
for the problem.
The randomForest package mainly consists of two things: rf.c contains rf(),
a C wrapper function that calls the Fortran subroutines in rfsub.f that do
most of the work (slightly altered from Breiman's original code). All
memory allocations are done in rf.c, using S_alloc(). When I run random
forest with the data and settings mentioned above, it finishes growing
the 7000 trees, but segfaults when returning from rf() to R. GDB
gave the following (gdb prompts removed):
do_dotCode (call=0x873aff4, op=0x8a5f620, args=0x8a5d010, env=0x86fd0a4)
at dotcode.c:1413
1413 break;
1845 PROTECT(ans = allocVector(VECSXP, nargs));
1846 havenames = 0;
1847 if (dup) {
1849 info.cargs = cargs;
1850 info.allArgs = args;
1851 info.nargs = nargs;
1852 info.functionName = buf;
1853 nargs = 0;
1854 for (pargs = args ; pargs != R_NilValue ; pargs = CDR(pargs)) {
1855 if(argConverters[nargs]) {
1864 PROTECT(s = CPtrToRObj(cargs[nargs], CAR(pargs), which));
Program received signal SIGSEGV, Segmentation fault.
0x080ddc6a in RunGenCollect (size_needed=1515400) at memory.c:1133
1133 SEXP next = NEXT_NODE(s);
This was obtained on Linux (Mandrake 8.2 w/enterprise kernel 2.4.8) running
on a dual P3-866 Xeon with 2GB RAM, using R-1.5.0 compiled from source.
Any help/hints/comments are greatly appreciated!
Regards,
Andy
Andy I. Liaw, PhD
Biometrics Research Phone: (732) 594-0820
Merck & Co., Inc. Fax: (732) 594-1565
P.O. Box 2000, RY70-38 Rahway, NJ 07065
mailto:andy_liaw@merck.com
------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments, contains information
of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that may be
confidential, proprietary copyrighted and/or legally privileged, and is intended
solely for the use of the individual or entity named on this message. If you are
not the intended recipient, and have received this message in error, please
immediately return this by e-mail and then delete it.
=============================================================================
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
(Confined to R-devel.)

This almost always means that R's memory system (or malloc's) has been
corrupted by array overruns. Sometimes gctorture(TRUE) helps. However, in
your case it's more likely those S_alloc calls, so try (temporarily)
replacing them by calls to Calloc and then use something like Purify or
Electric Fence to test for overruns.

On Wed, 12 Jun 2002, Liaw, Andy wrote:

> (Sorry for the cross-post--- I wasn't sure which list is more
> appropriate...)

Only a few people read R-devel and not R-help.

> [...]
> memory allocations are done in rf.c, using S_alloc(). When I run random
> forest with the data and setting as mentioned above, it was able to finish
> growing the 7000 trees, but segfault when returning from rf() to R. GDB
> gave the following (gdb prompts removed):

This is just saying it can't allocate the copies for the returned values of
the .C arguments. I think you might want to consider .Call given that you
are probably using quite large structures.

> [...]

--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
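To make the "test for overruns" suggestion concrete, here is a minimal sketch of the idea behind guard-byte checking, in plain C with calloc rather than the R allocation API. It is not how Purify or Electric Fence work internally, and the names guarded_alloc, guard_intact and guard_assert are made up for illustration; nothing here comes from rf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  16     /* guard bytes placed after the payload */
#define GUARD_BYTE 0xA5   /* arbitrary fill pattern (assumption) */

/* Hypothetical helper: allocate n zeroed doubles followed by a guard region. */
double *guarded_alloc(size_t n)
{
    if (n > (SIZE_MAX - GUARD_LEN) / sizeof(double))
        return NULL;                      /* size computation would overflow */
    unsigned char *raw = calloc(n * sizeof(double) + GUARD_LEN, 1);
    if (raw == NULL)
        return NULL;
    memset(raw + n * sizeof(double), GUARD_BYTE, GUARD_LEN);
    return (double *) raw;
}

/* Returns 1 if the guard region after the n doubles is untouched, else 0. */
int guard_intact(const double *p, size_t n)
{
    const unsigned char *g = (const unsigned char *) p + n * sizeof(double);
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (g[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Abort right at the check site if something wrote past the end. */
void guard_assert(const double *p, size_t n)
{
    assert(guard_intact(p, n));
}
```

The thought is to allocate each work array this way in place of S_alloc, call the Fortran routine, and check every guard before returning, so that an overrun is reported next to the array it belongs to instead of at some later garbage collection. Unlike S_alloc memory, these blocks have to be released with free(). This only catches writes just past the end of a block; a dedicated tool covers far more cases.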
These symptoms suggest that your code may be writing outside of the data it
allocates, which would trash internal data structures of the R heap and
result in a segfault at a GC. I would try to find a malloc debugging
library, use malloc in place of S_alloc, and see if the malloc debugging
tools show any malloc heap corruption. The standard malloc in Mac OS X has
very good debugging support if you have access to that.

luke

On Wed, Jun 12, 2002 at 09:26:07AM -0400, Liaw, Andy wrote:
> [...]
> memory allocations are done in rf.c, using S_alloc(). When I run random
> forest with the data and setting as mentioned above, it was able to finish
> growing the 7000 trees, but segfault when returning from rf() to R. GDB
> gave the following (gdb prompts removed):
> [...]

--
Luke Tierney
University of Minnesota Phone: 612-625-7843
School of Statistics Fax: 612-624-8868
313 Ford Hall, 224 Church St. S.E. email: luke@stat.umn.edu
Minneapolis, MN 55455 USA WWW: http://www.stat.umn.edu
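Because a stray write only shows up later, at a GC, it can help to move the failure to the write itself. The sketch below routes stores through a bounds-checked accessor that uses 1-based indices, on the assumption (not confirmed anywhere in this thread) that an index translated between the Fortran and C sides might be off by one. The type checked_vec and the cv_* functions are hypothetical names, not part of randomForest or R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: a buffer that remembers its own length. */
typedef struct {
    double *data;
    size_t  len;
} checked_vec;

/* Allocate len zeroed doubles (len must be > 0). Returns 1 on success. */
int cv_init(checked_vec *v, size_t len)
{
    v->data = (len > 0) ? calloc(len, sizeof(double)) : NULL;
    v->len  = (v->data != NULL) ? len : 0;
    return v->data != NULL;
}

/* Store x at 1-based index i. Returns 0 on success, -1 if i is out of range. */
int cv_set1(checked_vec *v, size_t i, double x)
{
    assert(v != NULL && v->data != NULL);   /* programmer error otherwise */
    if (i < 1 || i > v->len) {
        fprintf(stderr, "cv_set1: index %zu outside 1..%zu\n", i, v->len);
        return -1;
    }
    v->data[i - 1] = x;
    return 0;
}

void cv_free(checked_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len  = 0;
}
```

A rejected store is reported with the offending index at the point where it happens, which is usually much easier to trace back than a crash inside RunGenCollect. This only helps for writes made from the C side; writes made inside the Fortran subroutines would still need a malloc debugging tool or compiler bounds checking.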
Have you tried the dmalloc library (http://dmalloc.com)? I found it great
for tracing memory problems, but I was only using C rather than a mixture
of C and Fortran.

Simon

______________________________________________________________________
> Simon Wood snw at st-and.ac.uk http://www.ruwpa.st-and.ac.uk/simon.html
> The Mathematical Institute, North Haugh, St. Andrews, Fife KY16 9SS UK
> Direct telephone: (0)1334 463799 Indirect fax: (0)1334 463748

> I've run into segfaults when using my randomForest package on large dataset
> (e.g., 100 x 15200) and large number of trees (e.g., ntree=7000 and
> mtry=3000). I'm wondering if anyone can give me some hints on where to look
> for the problem.
> [...]