anthony at resolution.com
2009-Jan-10 23:45 UTC
[R] Rserve/RandomForest does not work with a CSV?
Hi all,

We're using Rserve and RandomForest to do classification from within a Java program. The total is about 4 lines of R code:

    library('randomForest')
    x y future
    fit<-randomForest(x,y,no.action=na.roughfix,importance=T,proximity=T)
    p<-predict(fit, future)

What is very frustrating is that we have tried this two different ways:

1. Load x, y, and future from a CSV. If I do this, Rserve throws an error when randomForest() is called.

2. Load x, y, and future by building them manually from arrays. If I do this, randomForest() works fine.

Both approaches work great when run directly inside of R. Rserve is running as root, and our Java application is running inside of Tomcat, which is also running as root.

The actual code looks something like:

    RConnection conn = new RConnection("127.0.0.1");
    conn.voidEval("library('randomForest')");
    conn.voidEval("train<-read.csv(\"" + (outfile.getAbsolutePath()) + "\",header=FALSE)");
    conn.voidEval("x<-train[1:" + totalTrainData + ",1:11]");
    conn.voidEval("y<-as.factor(train[1:" + totalTrainData + ",12])");
    conn.voidEval("future<-train[" + (totalTrainData + 1) + ":" + (totalTrainData + totalPredictions) + ",1:11]");
    conn.voidEval("fit<-randomForest(x,y,no.action=na.roughfix,importance=T,proximity=T)");
    conn.voidEval("p<-predict(fit, future)");
    conn.voidEval("write.csv(p, file=\"" + (filename.getAbsolutePath()) + "\")");

Every time we use this, it errors on the randomForest() call. (If I run the same commands in R, they work perfectly fine.)

Any ideas why I cannot call randomForest() this way, when it works fine if the x / y / future values are instead built using the array command?

As a secondary question, is it faster or slower to do it this way? It certainly is pretty convenient to use the CSVs.

This one is driving us bonkers!

-- Anthony
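One way to narrow down a failure like the one described above is to get the actual R error text back instead of only the exception from voidEval(). The following is a minimal sketch, not part of the original code: the class RCommandBuilder and its methods are hypothetical helpers that only build R command strings. It assumes each command is wrapped in R's try(..., silent=TRUE), so that a failure comes back as a value of class "try-error", and it escapes backslashes and quotes in file paths before they are embedded in an R string literal.

```java
// Hypothetical helper (not from the original post): builds R command
// strings only; it does not talk to Rserve itself.
final class RCommandBuilder {
    private RCommandBuilder() {}

    /** Quote a Java string as an R string literal, escaping backslashes and double quotes. */
    static String rString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Wrap an R expression in try() so an error is returned as a "try-error" value. */
    static String wrapInTry(String expr) {
        return "try({" + expr + "}, silent=TRUE)";
    }

    /** Build the read.csv command from the post, with the path safely quoted. */
    static String readCsv(String var, String path) {
        return var + "<-read.csv(" + rString(path) + ",header=FALSE)";
    }
}
```

The resulting string could then be evaluated with a call that returns a value rather than with voidEval(), for example (assuming the Rserve Java client's parseAndEval and REXP.inherits are available in the version in use) `REXP r = conn.parseAndEval(RCommandBuilder.wrapInTry(cmd));` followed by a check of `r.inherits("try-error")` and printing `r.asString()` to see the message R produced.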