Hello,
I'm a graduate student in Genetics who has just started working with R. I
have been trying to do a k-means clustering of an expression data
compilation, which has lots of NA values in it. As suggested in a couple of
earlier posts, I tried using na.omit() and the MICE imputation algorithm to
take care of the NAs, but neither seems to work that well. na.omit() deletes
every entry (row) that contains an NA, which affects the final results
considerably, so I am wary about using it.
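To illustrate what I mean about na.omit(), here is a minimal sketch on a
made-up toy matrix (the values and the names m, complete_only, and
partial_kept are only for illustration; this is not my real data). It
compares na.omit() with dropping only the rows that are entirely NA:

```r
# Toy matrix with made-up values (an assumption for illustration only)
m <- matrix(c( 0.14,  0.07, -0.58,
                 NA,    NA,    NA,
               0.00,    NA, -0.01,
              -0.58, -1.22, -0.43),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("gene", 1:4), c("V2", "V3", "V4")))

# na.omit() drops every row that contains at least one NA
complete_only <- na.omit(m)
rownames(complete_only)

# Dropping only the rows that are entirely NA keeps partially observed genes
all_na <- rowSums(is.na(m)) == ncol(m)
partial_kept <- m[!all_na, , drop = FALSE]
rownames(partial_kept)
```

On this toy matrix, na.omit() keeps only gene1 and gene4, while the second
approach also keeps gene3, which still has NAs that would need handling.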
I am not sure whether I have been using MICE properly. Here is an example of
the data and my commands:
> y<-read.table("test.txt",header=FALSE,skip=1,row.names=1)
          V2    V3    V4    V5    V6    V7    V8    V9   V10   V11   V12   V13
gene1   0.14  0.07 -0.58 -0.56 -0.25 -0.17  1.02  0.98  0.18  0.28  0.23  0.37
gene2     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
gene3   0.00  0.28 -0.01  0.29  0.14    NA  0.23    NA  0.08  0.00 -0.47 -0.57
gene4  -0.58 -1.22 -0.43 -0.23    NA -0.36  0.30  0.28  0.30  0.41  0.33 -0.08
gene5  -1.51 -1.36 -1.64 -1.89 -1.32 -0.38 -0.14 -0.32  0.39  0.58  0.19 -0.40
gene6  -0.50 -0.60 -0.42  0.41  0.32    NA    NA    NA -0.69  0.29  0.12  0.11
> md.pattern(y)
    V2  V3  V4  V5 V10 V11 V12 V13 V14 V15 V16  V6  V8  V7  V9
 2   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   0
 1   1   1   1   1   1   1   1   1   1   1   1   0   1   1   1   1
 1   1   1   1   1   1   1   1   1   1   1   1   1   1   0   0   2
 1   1   1   1   1   1   1   1   1   1   1   1   1   0   0   0   3
 1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  15
     1   1   1   1   1   1   1   1   1   1   1   2   2   3   3  21
> imp <-mice(y)
The message I get is:
iter imp variable
1 1 V2
Error in solve.default(t(xobs) %*% xobs) :
system is computationally singular: reciprocal condition number = 1.0438e-19
> imp
Error: object "imp" not found
I also tried using the different methods mentioned in the manual, but I get
the same error every time. Any suggestions on what could be wrong, and what
needs to be done? I'd prefer to use MICE, but if there are any better
methods, please let me know.
Thanks,
Gaurav