J.Brian.Adams
2000-Dec-30 20:19 UTC
[R] Extracting repeated observations from a large data set
I have a dataset containing over 750,000 observations, which I have read into an n x 6 matrix. If possible, I would like to prune it by extracting only those observations in which a specific characteristic, contained in column j, appears at least k times. I have used the following, where k = 3 and the fifth column contains the test data:

    ObsMatrix[as.numeric(table(ObsMatrix[,5])) > 3,]

but it does not seem to work. It returns certain rows from the matrix, but not necessarily those with more than three repeats, and it only returns one row for each match. I need to be able to keep all of the duplicate records in the data. Is there a way to do this without using several nested for loops?
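
One likely reason the attempt above misbehaves: table() returns one count per distinct value in column 5, so the logical index has one element per distinct value, not one per row, and does not line up with the rows of the matrix. A minimal loop-free sketch of one possible fix follows. It maps the counts back onto the rows by value. The toy matrix and the names counts, keep_values and pruned are made up for illustration. It uses >= k to match "at least k times"; use > k if "more than k" is what is wanted.

```r
# Toy stand-in for the real n x 6 matrix (values invented for illustration).
ObsMatrix <- cbind(1:10, 0, 0, 0,
                   c(7, 7, 7, 7, 8, 8, 9, 9, 9, 5),
                   0)
k <- 3  # minimum number of occurrences required to keep a value

# One count per distinct value in column 5.
counts <- table(ObsMatrix[, 5])

# Distinct values occurring at least k times (table names are character strings).
keep_values <- names(counts)[counts >= k]

# One TRUE/FALSE per row: keep every row whose column-5 value qualifies,
# so all duplicate records are retained.
pruned <- ObsMatrix[as.character(ObsMatrix[, 5]) %in% keep_values, , drop = FALSE]
```

With the toy data, pruned holds the rows whose fifth column is 7 or 9 and drops those with 8 or 5. The same idea should carry over to the full matrix, since table() and %in% are both vectorized.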