James Erickson
2013-Jan-11 02:54 UTC
[R] Error with looping through a list of strings as variables
Dear R users:
I have been trying, with little success, to use variable names stored as
strings in a for loop to run multiple random forests. The current code
returns the following error:
Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo =
factor_trafo, :
  data class character is not supported
In addition: Warning message:
In storage.mode(RET@predict_trafo) <- "double" :
  NAs introduced by coercion
The code runs fine on the data until I add the outer loop,
for (h in varlist){.
The i and k loops work without issue as long as I manually enter
the response variable in place of h in the code below.
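
As an illustration of the symptom described above, here is a minimal sketch on a made-up data frame `d` (not the hsb2 data). It suggests one likely cause: inside a formula, `h` stays the literal symbol `h` and is not replaced by the string it holds, and `d$h` looks for a column literally named "h".

```r
# Toy data frame, made up for illustration only (not the hsb2 data)
d <- data.frame(read = c(57, 68, 44, 63), write = c(52, 59, 33, 44))
h <- "read"

# The formula keeps the symbol `h`; it does not become `read ~ write`
f <- h ~ write
print(all.vars(f))

# `$` looks for a column literally called "h", which does not exist
print(is.null(d$h))

# `[[` uses the value stored in h, so this picks the "read" column
print(identical(d[[h]], d$read))
```

So a modelling function given `h ~ ...` would most likely see the character value of `h` as the response, which is consistent with the "data class character is not supported" message.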
I am using R 2.15.0 (64-bit), with cforest from the "party" package.
Any thoughts would be of great help.
Cheers
Note: The data used in the script below is not the actual data but
a substitute set that produces the same start-up errors. Once these
start-up errors are resolved, a "(data, ...) : fraction of
0.000000 is too small" error will appear; this is simply due to the
small substitute data set and is of no concern.
rm(list=ls())
library(party)
library(reshape)
puthere <- c("TEST_RESULTS.csv")
hsb2 <- read.csv("http://www.ats.ucla.edu/stat/data/hsb2.csv")
names(hsb2)
set.seed(8296)
ctrl <- cforest_unbiased(ntree=500, mtry=2)
varlist <- names(hsb2)[3:4]
for (h in varlist){
  for (k in c(1,0)){
    for (i in c(1,2)){ ## Data subset
      filtered <- subset(hsb2,
                         schtyp == i
                         & female == k,
                         select = c(id:socst))
      rank.cf <- cforest(h ~ write + math + science + socst,
                         data = filtered,
                         control = ctrl)
      print(rank.cf)
      ## Standard importance values __________________________
      imp=varimp(rank.cf,
                 conditional = TRUE)
      print(imp)
      ## predict variables _________________________________________
      predicted=predict(rank.cf,OOB = TRUE)
      residual=filtered$h-predicted
      mse=mean(residual^2)
      rsq=1-mse/var(filtered$h)
      ##Correlation between fitted values and original values: ____
      correl <- paste(cor(filtered$h,predicted))
      Correlation <-paste(
        "MSE:",mse,
        "Rsq:",rsq,
        "Correlation between fitted values and original values:",correl)
      print(Correlation)
      ## combine results for output _______________________________
      TestVar <- paste("Dependent =",h, sep=" ")
      namCL <- paste("schtyp =",i, sep=" ")
      namSE <- paste("female =",k, sep=" ")
      assign(namCL, 1:i)
      assign(namSE, 1:k)
      results <- rbind(TestVar, namCL, namSE, mse, rsq, correl)
      ## Writing data to csv file _________________________________
      write.table(results, file = puthere,
                  append = TRUE,
                  quote = FALSE,
                  sep = " ",
                  col.names = TRUE,
                  row.names = TRUE,)
      write.table(imp, file = puthere,
                  append = TRUE,
                  quote = FALSE,
                  sep = " ",
                  eol = "\r",
                  na = "N/A",
                  row.names = TRUE,
                  col.names = TRUE,
                  qmethod = "double")
    }
  }
}
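
One possible way to loop over response names held as strings is sketched below. The idea is to build the formula from the string with as.formula(paste(...)) and to extract the response column with `[[h]]` rather than `$h`. This is only a sketch under stated assumptions: the data frame `filtered` here is randomly generated stand-in data (not hsb2), the list `fits` is a hypothetical container for the results, and lm() stands in for cforest() so that the example runs without the party package.

```r
# Stand-in for the filtered hsb2 subset; values are random and made up
set.seed(1)
n <- 30
filtered <- data.frame(
  race    = sample(1:4, n, replace = TRUE),
  ses     = sample(1:3, n, replace = TRUE),
  write   = rnorm(n, mean = 52, sd = 10),
  math    = rnorm(n, mean = 52, sd = 10),
  science = rnorm(n, mean = 52, sd = 10),
  socst   = rnorm(n, mean = 52, sd = 10)
)
varlist <- c("race", "ses")

fits <- list()  # hypothetical container for per-response results
for (h in varlist) {
  # Turn the string in h into a real formula, e.g. race ~ write + math + ...
  fml <- as.formula(paste(h, "~ write + math + science + socst"))

  # lm() is only a stand-in; in the original script this line would be
  # cforest(fml, data = filtered, control = ctrl)
  fit <- lm(fml, data = filtered)

  predicted <- fitted(fit)
  observed  <- filtered[[h]]   # filtered$h would return NULL here
  mse <- mean((observed - predicted)^2)
  rsq <- 1 - mse / var(observed)

  fits[[h]] <- list(formula = fml, mse = mse, rsq = rsq)
}
print(names(fits))
```

In the script above, the same two changes would apply: pass the constructed formula object to cforest() in place of `h ~ write + math + science + socst`, and replace each `filtered$h` with `filtered[[h]]`. This assumes cforest() accepts a formula object built this way, as R modelling functions generally do.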