Dear R-devel,

Can anyone give me some hints on how to go about debugging a strange
segfault in my randomForest package? Here's the scoop:

A user reported a segfault when running predict() in the randomForest
package. I asked for the data and code. The combination runs fine under
WinXPPro, but segfaults on one of our Linux boxes running R (1.7.0 through
R-devel_2004-01-08) on Mandrake 9.0.

The predict.randomForest() function calls a C function "runforest" via
.C(..., DUP=FALSE, ...), which in turn calls a Fortran subroutine
"testreebag" within a for loop. The segfault seems to occur right after
runforest() finishes in C and control returns to R. I inserted the line:

Rprintf("Done!\n");

as the last line of the runforest() function and got the following output:

> library(randomForest, lib.loc="~/rlibs")
> arabid <- read.table('arabidopsis.out', sep=' ', header=T)
> arabid <- arabid[,-which(names(arabid) == "X0")]
> set.seed(1)
> fit <- randomForest(arabid[,-1], arabid[,1], ntree=100)
> predict(fit, arabid[,-1])
Done!

Program received signal SIGSEGV, Segmentation fault.
0x40152a48 in malloc () from /lib/libc.so.6

[If I change the DUP=FALSE in the .C() call to TRUE, I get the following:
Program received signal SIGSEGV, Segmentation fault.
0x080b412b in Rf_duplicate (s=0x1) at duplicate.c:75
75 switch (TYPEOF(s)) {
]

At this point I'm clueless as to what to do next, and would very much
appreciate any help!

Best,
Andy

Andy Liaw, PhD
Biometrics Research      PO Box 2000, RY33-300
Merck Research Labs      Rahway, NJ 07065
mailto:andy_liaw@merck.com      732-594-0820

------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments,...{{dropped}}
The symptom is that of the compiled code overrunning the storage it was
allocated. I tracked down an instance (in SuppDists) yesterday and have
seen quite a few in my time.

I would expect you to find a write to memory off one or other end of an
array, and compiling with bounds checking in place *may* help. (g77 is
less useful than some commercial Fortran compilers at this, and some
people write code that cannot be checked.)

Increasing some storage areas for output arrays will often make the
problem go away, but it is not actually a solution since it may just
relocate an illegal write to a non-fatal place.

Hope that helps enough,

Brian

On Thu, 8 Jan 2004, Liaw, Andy wrote:

> Dear R-devel,
>
> Can anyone give me some hints on how to go about debugging a strange
> segfault in my randomForest package?
> [...]
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
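One way to look for the kind of overrun described above, when compiler bounds checking is not available, is to surround the output array with guard zones and inspect them after the call. The following is only a minimal sketch in C under stated assumptions: `fill_output`, `check_overrun`, `GUARD` and `PATTERN` are hypothetical names invented for illustration, not part of randomForest or R, and the overrun is kept inside the allocated block so the example itself stays well defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD   8      /* guard elements on each side (arbitrary choice) */
#define PATTERN 0xAB   /* byte pattern used to fill the whole block */

/* Hypothetical stand-in for compiled code that fills an output array.
   If nwrite exceeds the caller's array length, it writes past the end. */
static void fill_output(double *out, int nwrite)
{
    for (int i = 0; i < nwrite; i++)
        out[i] = (double) i;
}

/* Runs fill_output on an n-element array placed between two guard zones.
   Returns 1 if a guard zone was modified, 0 if not, -1 if malloc fails. */
int check_overrun(int n, int nwrite)
{
    /* Keep any overrun inside the allocated block for this demonstration. */
    assert(n >= 0 && nwrite >= 0 && nwrite <= n + GUARD);

    size_t total = (size_t) n + 2 * GUARD;
    double *block = malloc(total * sizeof *block);
    if (block == NULL)
        return -1;
    memset(block, PATTERN, total * sizeof *block);

    /* The routine sees only the middle n elements. */
    fill_output(block + GUARD, nwrite);

    const unsigned char *bytes = (const unsigned char *) block;
    size_t lo_end   = (size_t) GUARD * sizeof(double);
    size_t hi_start = ((size_t) GUARD + (size_t) n) * sizeof(double);
    size_t all      = total * sizeof(double);
    int damaged = 0;

    for (size_t i = 0; i < lo_end; i++)        /* guard below the array */
        if (bytes[i] != PATTERN) damaged = 1;
    for (size_t i = hi_start; i < all; i++)    /* guard above the array */
        if (bytes[i] != PATTERN) damaged = 1;

    free(block);
    return damaged;
}
```

Calling `check_overrun(4, 4)` leaves both guards intact, while `check_overrun(4, 5)` reports damage because the fifth write lands in the upper guard. This also illustrates why merely enlarging an output array can hide a bug: the stray write still happens, it just lands somewhere harmless.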
Thanks to DTL, BDR, KH and RG, I've found and fixed the bug. The problem
was that I was offsetting an array being passed from C to Fortran when I
shouldn't have. Valgrind pinpointed the line that caused the trouble.
Having found the bug, I wonder why the code ever worked...

[I've just submitted the patched version to CRAN.]

Best,
Andy

> From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]
>
> The symptom is that of the compiled code overrunning the storage it was
> allocated.
> [...]
> I would expect you to find a write to memory off one or other end of an
> array, and compiling with bounds checking in place *may* help.
> [...]
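The thread does not show the actual fix, so the following is only a sketch of the general failure mode, written in C under stated assumptions. A Fortran routine treats the address it receives as element 1 of its dummy array, so any extra offset applied on the C side shifts every write. All names here (`fortran_style_fill`, `slice_fits`, `run_trees`, and the `extra` parameter) are hypothetical and not taken from randomForest; the Fortran routine is simulated in C, and out-of-range calls are counted and skipped rather than executed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a Fortran subroutine with dummy argument a(n): it treats
   the address it receives as a(1) and writes a(1)..a(n). */
static void fortran_style_fill(double *a, int n, double value)
{
    for (int i = 1; i <= n; i++)
        a[i - 1] = value;
}

/* 1 if elements [offset, offset + n) lie inside a buffer of len elements. */
static int slice_fits(long len, long offset, long n)
{
    return offset >= 0 && n >= 0 && offset + n <= len;
}

/* Calls the routine once per tree on slice t of out (ntree * nrow doubles).
   `extra` models an unwanted additional offset on the C side. Calls that
   would write outside the buffer are skipped and counted instead. */
int run_trees(double *out, int ntree, int nrow, int extra)
{
    assert(out != NULL && ntree >= 0 && nrow >= 0);

    long len = (long) ntree * nrow;
    int bad = 0;

    for (int t = 0; t < ntree; t++) {
        long offset = (long) t * nrow + extra;
        if (!slice_fits(len, offset, nrow)) {
            bad++;              /* this call would have overrun the buffer */
            continue;
        }
        fortran_style_fill(out + offset, nrow, (double) (t + 1));
    }
    return bad;
}
```

With `extra` set to 0 every slice fits. With `extra` set to 1 the last tree's slice runs one element past the end, and with -1 the first slice starts before the buffer. In unchecked code such a write may corrupt nearby memory without failing immediately, which would be consistent with a crash that only surfaces later in malloc() or Rf_duplicate().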