thr3ads.net - R help - [R] problem with rfImpute (package randomForest) [Mar 2009]

lionel fugon

2009-Mar-11 14:53 UTC

[R] problem with rfImpute (package randomForest)

Hello everybody,

This is my first question about R, so I apologize if I have sent it to the
wrong list or if I am not very clear.

My problem concerns rfImpute from the randomForest package. I am
interested in imputing missing values, and I read that randomForest can
do this. So I wrote the following code:

set.seed(100)
library(mlbench)
library(randomForest)
data(BreastCancer)
summary(BreastCancer)
data=BreastCancer[,-1]
data=data[!is.na(data[,"Bare.nuclei"]),]
summary(data)


is.factor(data$Cl.thickness)  # TRUE, as expected


########## select rows to set to missing ##########
x=1:nrow(data)
sample1=sample(x,70)
sample3=sample(x,70)
sample5=sample(x,70)


########## replace the selected values with NA ##########
data_missing=data
data_missing[sample1,1]=NA
data_missing[sample3,3]=NA
data_missing[sample5,5]=NA
summary(data_missing)


is.factor(data_missing$Cl.thickness)  # TRUE, still a factor


########## imputation by random forest ##########
data_imputed <- rfImpute(Class ~ .,data_missing,iter=5,ntree=1000)


is.factor(data_imputed$Cl.thickness)  # FALSE: no longer a factor



As you can see, rfImpute changes the type of one explanatory variable. Before
imputation, Cl.thickness was a factor; after imputation, it is a quantitative
(numeric) variable. I don't understand why this happens. Maybe I need to pass
an additional option to rfImpute?
I would be grateful if someone could help me understand.
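
One way to narrow this down is to compare the class of every column before and after imputation, rather than checking a single variable. The sketch below is only illustrative: compare_classes is a hypothetical helper (it is not part of randomForest), and the toy data frame stands in for the real data so that the example runs with base R alone. In mlbench, Cl.thickness is stored as an ordered factor, which may be relevant here; whether that is the cause of the conversion is an assumption to check, not a confirmed explanation.

```r
# Hypothetical helper (not part of randomForest): report the class of each
# column shared by two data frames, matching columns by name so that a
# different column order in the two data frames does not matter.
compare_classes <- function(before, after) {
  common <- intersect(names(before), names(after))
  class_label <- function(col) paste(class(col), collapse = "/")
  data.frame(
    column = common,
    before = sapply(before[common], class_label),
    after  = sapply(after[common], class_label),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy illustration with base R only: an ordered factor converted to numbers,
# next to an unordered factor that is left alone.
toy_before <- data.frame(
  grade = factor(c("1", "2", "3"), levels = c("1", "2", "3"), ordered = TRUE),
  group = factor(c("a", "b", "a"))
)
toy_after <- toy_before
toy_after$grade <- as.numeric(toy_after$grade)

print(compare_classes(toy_before, toy_after))

# With the objects from the script above, the same check would be:
# compare_classes(data_missing, data_imputed)
# is.ordered(data_missing$Cl.thickness)
```

If only the ordered columns change class in the real data, that would point toward ordered factors being handled differently from unordered ones.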

Thank you very much




