James Erickson
2013-Jan-11 02:54 UTC
[R] Error with looping through a list of strings as variables
Dear R users:

I have been trying, with little success, to use string variables in a for loop to run multiple random forests. The current code returns the following error:

Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo, :
  data class character is not supported
In addition: Warning message:
In storage.mode(RET@predict_trafo) <- "double" :
  NAs introduced by coercion

The code runs fine on the data until I add the for (h in varlist){ loop. The i and k loops work without issue as long as I manually enter the response variable in place of h in the code below. I am using R 2.15.0 (64-bit) with cforest from the "party" package. Any thoughts would be of great help.

Cheers

Note: The data used in the script below are not the actual data but a substitute set that produces the same start-up errors. Once these start-up errors are resolved, a "(data, ...) : fraction of 0.000000 is too small" error should appear; it is simply due to the small substitute data set and is of no concern.
rm(list=ls())
library(party)
library(reshape)

puthere <- c("TEST_RESULTS.csv")
hsb2 <- read.csv("http://www.ats.ucla.edu/stat/data/hsb2.csv")
names(hsb2)

set.seed(8296)
ctrl <- cforest_unbiased(ntree=500, mtry=2)
varlist <- names(hsb2)[3:4]

for (h in varlist){
  for (k in c(1,0)){
    for (i in c(1,2)){

      ## Data subset
      filtered <- subset(hsb2, schtyp == i & female == k, select = c(id:socst))
      rank.cf <- cforest(h ~ write + math + science + socst, data = filtered, control = ctrl)
      print(rank.cf)

      ## Standard importance values __________________________
      imp=varimp(rank.cf, conditional = TRUE)
      print(imp)

      ## predict variables _________________________________________
      predicted=predict(rank.cf,OOB = TRUE)
      residual=filtered$h-predicted
      mse=mean(residual^2)
      rsq=1-mse/var(filtered$h)

      ##Correlation between fitted values and original values: ____
      correl <- paste(cor(filtered$h,predicted))
      Correlation <-paste( "MSE:",mse, "Rsq:",rsq, "Correlation between fitted values and original values:",correl)
      print(Correlation)

      ## combine results for output _______________________________
      TestVar <- paste("Dependent =",h, sep=" ")
      namCL <- paste("schtyp =",i, sep=" ")
      namSE <- paste("female =",k, sep=" ")
      assign(namCL, 1:i)
      assign(namSE, 1:k)
      results <- rbind(TestVar, namCL, namSE, mse, rsq, correl)

      ## Writing data to csv file _________________________________
      write.table(results, file = puthere, append = TRUE, quote = FALSE, sep = " ", col.names = TRUE, row.names = TRUE,)
      write.table(imp, file = puthere, append = TRUE, quote = FALSE, sep = " ", eol = "\r", na = "N/A", row.names = TRUE, col.names = TRUE, qmethod = "double")
    }
  }
}
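A likely cause of the error, offered as a sketch and not a confirmed diagnosis: in h ~ write + math + science + socst the name h is taken literally, so the model frame picks up the character string held in h instead of the column it names, and filtered$h looks for a column called "h" (which does not exist). One common workaround is to build the formula from the string with reformulate() and to index columns with [[h]]. The example below uses the built-in mtcars data and lm() purely as stand-ins, since neither is part of the original script; the resulting formula object could presumably be passed to cforest in the same way.

```r
# Stand-in data: the built-in mtcars set, NOT the hsb2 data from the post.
dat <- mtcars
varlist <- c("mpg", "qsec")            # response names held as strings
predictors <- c("wt", "hp", "disp")    # hypothetical predictor set for illustration

fits <- list()
for (h in varlist) {
  # Build the formula from the string; writing `h ~ ...` would refer to a
  # variable literally named h, not to the column whose name is stored in h.
  fml <- reformulate(predictors, response = h)

  # lm() is only a stand-in here; the post fits cforest() from "party".
  fit <- lm(fml, data = dat)

  # Use dat[[h]] to look up the column named by h; dat$h would return NULL.
  residual <- dat[[h]] - fitted(fit)
  mse <- mean(residual^2)
  rsq <- 1 - mse / var(dat[[h]])

  fits[[h]] <- list(formula = fml, mse = mse, rsq = rsq)
}

print(fits$mpg$formula)
```

In the original loop the equivalent changes would be replacing h ~ ... with a formula built this way and replacing each filtered$h with filtered[[h]].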