Rob James
2012-May-09 21:50 UTC
[R] Failed Convergence when using mi to generate synthetic data
I was hoping to use mi to generate a synthetic version of a database. The strategy (see code below) was simple: use the diamonds dataset from ggplot2, subset it to roughly 3K diamonds of a single color, create a blank record for every "real" record, and throw the combined dataset at mi to see whether it would populate the blank records. I kept getting failed convergence. I think I have simplified the dataset to the point where either I am doing it wrong or something is conceptually wrong with what I am doing. I would welcome suggestions:

library(ggplot2)
library(mi)
data(diamonds)

# use only 2800 or so observations!
diamonds1 <- subset(diamonds, color == "J")
rm(diamonds)

# simplify the data structure
diamonds1 <- subset(diamonds1, select = -c(x, z, y, cut, clarity, depth, table))
str(diamonds1)

# generate a blank table
emptydiamonds1 <- diamonds1
for (j in 1:dim(diamonds1)[2]) {
  emptydiamonds1[, j] <- NA
}

# add a dummy variable flagging the records to impute
diamonds1$impute <- 0
emptydiamonds1$impute <- 1

# package the two into one dataset
d2 <- rbind(diamonds1, emptydiamonds1)
str(d2)

# run mi.info
miinfo <- mi.info(d2)

# pre-process
mi_pre <- mi.preprocess(d2)

# impute
Imp1 <- mi(mi_pre, n.iter = 49)