Dear R-devel,

A user reported a strange problem with predict.randomForest in the
randomForest package yesterday, and I'm baffled by it. The code at the
end of this message produces the error.

In predict.randomForest there is a .Fortran call to "runforest". One of
the arguments passed in is "countts", which is a vector of doubles. The
error occurred because, when the .Fortran call returned, that component
of the output had mysteriously turned into numeric(0)! I checked that
vector inside the Fortran code (where it is dimensioned as a matrix),
and it looked fine.

Can anyone provide some hint as to what the problem could be?

While I'm at it, can someone provide some tips on debugging Fortran code
with GDB? The gdb manual has very little info on this topic. For
example, how do I examine (print) arrays and the values of arguments
being passed in?

Any help much appreciated!

Regards,
Andy

Andy I. Liaw, PhD
Biometrics Research
Phone: (732) 594-0820
Merck & Co., Inc.
Fax: (732) 594-1565
P.O. Box 2000, RY84-16
Rahway, NJ 07065
mailto:andy_liaw@merck.com

=========================
library(randomForest)
library(mlbench)
data(Soybean)
Soybean <- Soybean[complete.cases(Soybean),]
## Drop empty levels:
Soybean <- lapply(Soybean, function(x) factor(as.character(x)))
nreps <- 100
rf.err <- numeric(nreps)
for (i in 1:nreps) {
  test <- sample(nrow(Soybean), 150, replace=FALSE)
  sb.rf <- randomForest(Class~., data=Soybean, subset=-test)
  sb.rf.pred <- predict(sb.rf, Soybean[test,])
  sb.rf.table <- table(sb.rf.pred, Soybean$Class[test])
  rf.err[i] <- sum(diag(sb.rf.table))
  print(1-sum(diag(sb.rf.table))/length(test))
}