Hi there,
I have a question about the following dataset:
> rbind(t2[which(t4>0.3),][1:3,], t2[1:3,]) # don't worry about what this line means
          [,1]      [,2]       [,3]       [,4]       [,5]
[1,] 34.216166 96.928587 330.125990 330.183222 330.201215
[2,]  2.819183  8.134491   8.275841   8.525256   8.828448
[3,]  2.819183  7.541680   7.550333   8.374636   8.690998
[4,]  4.672551  5.036353   5.072710   5.152218   5.223204
[5,]  5.470131  5.500513   5.674139   5.689151   5.770423
[6,]  4.480287  4.628300   4.797686   4.814106   4.823345
I want to separate the first 3 rows from the rest. The criterion is
that I am looking for a "gap" within a row.
My current approach is to compute std(row)/median(row) for each row and
apply a threshold, which is very naive, but fast and good enough. I would
like something better and more "academic". Please advise. I think
clustering might help, but it needs to be quick since t2 has 30000 rows.
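
To make the question concrete, here is a minimal sketch of the kind of row
statistic I mean, run on the six rows printed above. The names x, cv_like,
rel_gap and threshold are made up for this illustration and are not the
objects from my real code, and the cutoff of 0.3 is only a placeholder. The
second statistic (largest jump between sorted neighbours, divided by the row
median) is just one possible way to express a "gap"; I am not claiming it is
the right one.

```r
# Six sample rows copied from the output above; for the real data, use t2.
x <- matrix(c(
  34.216166, 96.928587, 330.125990, 330.183222, 330.201215,
   2.819183,  8.134491,   8.275841,   8.525256,   8.828448,
   2.819183,  7.541680,   7.550333,   8.374636,   8.690998,
   4.672551,  5.036353,   5.072710,   5.152218,   5.223204,
   5.470131,  5.500513,   5.674139,   5.689151,   5.770423,
   4.480287,  4.628300,   4.797686,   4.814106,   4.823345
), ncol = 5, byrow = TRUE)

# Naive statistic: row standard deviation divided by row median.
row_med <- apply(x, 1, median)
cv_like <- apply(x, 1, sd) / row_med

# Hypothetical alternative: sort each row, take the differences between
# neighbouring values, and keep the largest one relative to the row median.
xs <- t(apply(x, 1, sort))
gaps <- xs[, -1, drop = FALSE] - xs[, -ncol(xs), drop = FALSE]
rel_gap <- apply(gaps, 1, max) / row_med

# Placeholder cutoff (an assumption, would need tuning on the full data).
threshold <- 0.3
has_gap <- rel_gap > threshold

print(round(cbind(cv_like, rel_gap), 3))
print(has_gap)
```

Both statistics need only one pass over the rows, so they should stay fast
for 30000 rows; what I am unsure about is how to choose the cutoff in a
principled way.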
Thanks,
Weiwei
--
Weiwei Shi, Ph.D
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III