Hi,

I'm trying to assign a score to each row that allows me to identify which rows differ. In the example file below, I've used "," to indicate column separators. In this example, I'd like to identify that row 1 and row 5 are the same, and row 2 and row 4 are the same.

Any help much appreciated. Also, any comments on what the command lines do would be fantastic.

Thanks!!

example file:
0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0

example request output:
1
2
3
2
1
4

--
View this message in context: http://r.789695.n4.nabble.com/How-to-assign-scores-to-rows-based-on-column-values-tp2064018p2064018.html
Sent from the R help mailing list archive at Nabble.com.
David Winsemius
2010-Apr-25 14:27 UTC
[R] How to assign scores to rows based on column values
On Apr 25, 2010, at 1:08 AM, burgundy wrote:

> Hi,
>
> I'm trying to assign a score to each row that allows me to identify which
> rows differ. In the example file below, I've used "," to indicate column
> separators. In this example, I'd like to identify that row 1 and row 5 are
> the same, and row 2 and row 4 are the same.
> Any help much appreciated. Also, any comments on what the command lines do
> would be fantastic.
> Thanks!!
>
> example file:
> 0,0,1,0,1,0,0
> 0,1,0,0,0,0,1
> 0,0,0,0,0,0,0
> 0,1,0,0,0,0,1
> 0,0,1,0,1,0,0
> 0,0,0,1,0,0,0
>
> example request output:
> 1
> 2
> 3
> 2
> 1
> 4

If you use apply by rows with paste and a collapse argument, you get a text column holding one string per row. Calling factor on that text column with levels=unique(fac) numbers the distinct rows in order of first appearance, and as.numeric then extracts those codes. On a dataframe, rrr, with those elements and such a factor, fac:

> as.numeric(factor(rrr$fac, levels=unique(rrr$fac)))
[1] 1 2 3 2 1 4

One needs to use factor a second time because the levels after the first call were set to an alpha-sorted version of fac.

--
David Winsemius, MD
West Hartford, CT
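
Since the original poster asked for comments on what the commands do, here is a minimal, commented sketch of the steps described above. It is only one way to put the pieces together: the inline read.csv(text = ...) call and the names key and score are assumptions made for illustration, and the reply does not show how rrr was actually built.

```r
# Example data from the question, read from inline text.
# Assumption: the file has no header row.
rrr <- read.csv(text = "0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0", header = FALSE)

# apply(..., 1, ...) walks over the rows; paste with collapse glues each
# row's values into a single string such as "0,0,1,0,1,0,0".
# The "," separator keeps multi-digit values from running together.
key <- apply(rrr, 1, paste, collapse = ",")

# factor() alone would sort its levels alphabetically, so the levels are
# given explicitly as unique(key), i.e. in order of first appearance.
# as.numeric() then returns the integer code of each row's level.
score <- as.numeric(factor(key, levels = unique(key)))
score
# [1] 1 2 3 2 1 4
```

Rows with identical contents produce identical strings and therefore the same score. For this data, match(key, unique(key)) returns the same codes without going through a factor.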