When I try to train a randomForest model on my dataset, I get the
following error:
Error in randomForest.default(datatrain, classtrain) :
length of response must be the same as predictors
My data looks like:
A,B,C,D,Class
1,2,1,2,cl1
1,2,1,2,cl1
3,2,1,2,cl2
3,2,1,2,cl2
3,2,1,2,cl2
3,2,1,2,cl2
3,2,1,2,cl2
3,2,1,2,cl2
3,2,1,2,cl2
3,2,12,3,cl2
3,2,1,2,cl2
The actual dataset has around 4000 features and two classes. The number of
instances is also around 4000.
The steps followed are:
trainfile <- read.csv("TrainFile",head=TRUE)
datatrain <- subset(trainfile,select=c(-Class))
classtrain <- (subset(trainfile,select=Class))
rf <- randomForest(datatrain, classtrain)
Error in randomForest.default(datatrain, classtrain) :
length of response must be the same as predictors
In addition: Warning message:
In randomForest.default(datatrain, classtrain) :
The response has five or fewer unique values. Are you sure you want to do
regression?
Where am I going wrong?
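A likely cause, sketched on toy data: subset(trainfile, select=Class) returns a one-column data frame rather than a vector, so its length() is 1 (the column count) and not the number of rows, which is what the length check compares against. A data frame is also not a factor, which would explain the regression warning. The sketch below assumes the label column is named Class as in the sample above. The toy values are made up for illustration, and the randomForest call runs only if the package is installed.

```r
# Toy data mirroring the layout in the post (values are illustrative only).
trainfile <- data.frame(
  A = c(1, 1, 3, 3, 3, 3),
  B = c(2, 2, 2, 2, 2, 2),
  C = c(1, 1, 1, 1, 12, 1),
  D = c(2, 2, 2, 2, 3, 2),
  Class = c("cl1", "cl1", "cl2", "cl2", "cl2", "cl2")
)

datatrain <- subset(trainfile, select = -Class)

# subset() keeps the data frame wrapper, so length() counts columns, not rows.
class_df <- subset(trainfile, select = Class)
length(class_df)

# Extracting the column gives a vector; factor() marks it as a class label,
# which is what makes randomForest choose classification over regression.
classtrain <- factor(trainfile$Class)
length(classtrain)

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(x = datatrain, y = classtrain)
  print(rf)
}
```

With the real file, the same idea would be classtrain <- factor(trainfile$Class) after read.csv.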
If I follow the example provided in the documentation (Classification and
Regression with Random Forest):
rf <- randomForest(classtrain, data=datatrain)
I don't get a random forest of type classification.
I get:
Call:
randomForest(x = classtrain, data = datatrain)
Type of random forest: unsupervised
Number of trees: 500
No. of variables tried at each split: 1
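The unsupervised result is probably because the first unnamed argument is taken as x and no response is supplied. The documentation example uses a formula instead (Species ~ ., data=iris). Below is a minimal sketch of that pattern, assuming the label column is named Class and is converted to a factor first. The toy values are made up, and the fit runs only if randomForest is installed.

```r
# Toy data in the same layout as the post (values are illustrative only).
trainfile <- data.frame(
  A = c(1, 1, 3, 3, 3, 3),
  B = c(2, 2, 2, 2, 2, 2),
  C = c(1, 1, 1, 1, 12, 1),
  D = c(2, 2, 2, 2, 3, 2),
  Class = c("cl1", "cl1", "cl2", "cl2", "cl2", "cl2")
)

# The response must be a factor for a classification forest.
trainfile$Class <- factor(trainfile$Class)

# Formula interface: response on the left, "." means all other columns.
fml <- Class ~ .

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(fml, data = trainfile)
  print(rf$type)
}
```

With around 4000 features, the x/y form of the call may be the better choice, since the formula interface adds overhead when there are many variables.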
Any suggestion would be appreciated.
Thanks
Hi,
I want to export a table using write.table, and I want it in this format
(this table was exported from S-PLUS):
Q01
row.names  Num    Perc   meab    stdev  min    P5      P10     P25     P50     P75     P90     P95     max
A          10237  47.88  183.48  38.84  86.98  126.52  138.13  157.82  182.41  210.17  238.94  254.13  354.49
B          10243  47.91  186.91  36.55  86.98  128.18  139.96  159.27  182.42  208.75  233.2   249.07  336.17
           762    3.56   178.73  36.37  90.19  114.27  127.16  144.88  166.59  193.56  220.37  234.42  307.87
*          137    0.64   150.77  32.88  96.42  112.72  120.59  139.84  159.36  181.25  206.01  216.33  254.58
tot        21379  100    182.48  37.77  86.98  126.52  138.22  157.85  181.75  208.82  235.51  251.35  354.49
There are many tables. In S-PLUS I was using:
for (i in 1:length(nrotulos)) {
  write.table(nomequest[i],
              "Y:\\questgeral.txt", sep="\t", append=T)
  write.table(questgeral[[i]],
              "Y:\\questgeral.txt", sep="\t", dimnames.write=T, append=T)
}
Now I am trying to do the same thing in R, but I get a lot of warnings
and the result is:
x
1 Q05
Num Perc media stdev min P5 P10 P25 P50 P75 P90 P95 max
1 12418 58 183.71 37.28 86.98 126.11 138.11 157.58 180.95 207.55 233.76 249.24 354.49
2 4898 22.88 188.45 38.79 86.98 128.89 140.62 160.69 185.38 214.12 241.36 256.39 344.28
3 2161 10.09 188.22 39.38 87.13 126.97 138.63 159.67 186.76 212.59 241.15 256.44 352.71
4 1934 9.03 175.7 34.59 86.98 122.76 133 152.29 172.4 198.13 220.71 237.78 317.62
tot 21411 100 184.53 37.77 86.98 126.52 138.22 157.85 181.75 208.82 235.51 251.35 354.49
using this code:
for (i in 1:length(nrotulos04)) {
  write.table(nomequest[i],
              "Y:/questimp1104m.txt", dec=".", sep=";", append=T, quote=F)
  write.table(questimp[[i]],
              "Y:/questimp1104m.txt", dec=".", sep=";", append=T, quote=F)
}
How can I put the row.names label before the Num column in R? And how can I
remove the x in the first line, and the number 1 and the ^t before Q05 in the
second line?
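One possible approach, sketched under assumptions: the "x" and the "1" appear because write.table treats the title as a one-column table with its own header and row name, and the warnings most likely come from appending column names to an existing file. Writing the title and the header line with writeLines() on an open connection, and only the table body with write.table(), avoids both. Here nomequest and questgeral are small hypothetical stand-ins for the real objects, and a temporary file stands in for the real path.

```r
# Hypothetical stand-ins for the objects named in the post.
nomequest <- c("Q01", "Q02")
tab <- data.frame(Num = c(10237, 10243), Perc = c(47.88, 47.91),
                  row.names = c("A", "B"))
questgeral <- list(tab, tab)

outfile <- tempfile(fileext = ".txt")  # stands in for "Y:/questgeral.txt"
con <- file(outfile, open = "w")

for (i in seq_along(nomequest)) {
  # Title line: plain text, so no "x" header and no row number.
  writeLines(nomequest[i], con)
  # Header line with an explicit label for the row-name column.
  writeLines(paste(c("row.names", colnames(questgeral[[i]])), collapse = "\t"), con)
  # Body only: the header is already written, so col.names = FALSE.
  write.table(questgeral[[i]], con, sep = "\t", quote = FALSE,
              col.names = FALSE, row.names = TRUE)
}

close(con)
out <- readLines(outfile)
print(out)
```

Because the connection stays open across iterations, no append argument is needed, and each table follows the previous one in the same file.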
Thanks in advance!
Leandro Marino