francesca casalino
2012-Mar-05 09:39 UTC
[R] Order a data frame based on the order of another data frame
Hi,

I am trying to match the order of the rownames of a dataframe with the rownames of another dataframe (I can't simply sort both sets because I would have to change the order of many other connected datasets if I did that). Also, the second dataset (snp.matrix$fam) is a snp matrix slot.

So for example:

data_one:
               x           y             z
sample_1110001 -0.3352623  -1.141462     -0.4032494
sample_1110005  0.1862424   0.015944      0.1329059
sample_1110420  0.1309120   0.004005596   0.06117253
sample_2220017  0.1145205  -0.125090054   0.04957881

rownames(snp.matrix$fam)
[1] "sample_2220017" "sample_1110420" "sample_1110001"
[4] "sample_1110005"

I would like my data_one to look like this:
               x           y             z
sample_2220017  0.1145205  -0.125090054   0.04957881
sample_1110420  0.1309120   0.004005596   0.06117253
sample_1110001 -0.3352623  -1.141462     -0.4032494
sample_1110005  0.1862424   0.015944      0.1329059

I have tried these but it doesn't work:

    data_one[order(rownames(snp.matrix$fam)),]
    data_one[rownames(data_oen)[order(rownames(snp.matrix$fam))],]

Thank you for your help!
Leandro Marino
2012-Mar-05 12:13 UTC
[R] Order a data frame based on the order of another data frame
Hi,

I think you are making a mistake. order(rownames(snp.matrix$fam)) returns the positions that would sort those row names alphabetically; it does not tell you where each of those names sits in data_one, so indexing data_one with it does not give the order you want.

If rownames() are defined correctly in data_one, you only have to do:

    data_one[rownames(snp.matrix$fam),]

Note that it's good to check whether all elements of rownames(snp.matrix$fam) are in rownames(data_one), and the other way round. If the names match, both calls return character(0):

    setdiff(rownames(snp.matrix$fam), rownames(data_one))
    setdiff(rownames(data_one), rownames(snp.matrix$fam))

Regards,
Leandro Marino
http://www.leandromarino.com.br (Photographer)
http://est.leandromarino.com.br/Blog (Statistician)
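For readers who want to try this, here is a minimal, self-contained sketch of the approach in the reply. It rebuilds data_one from the numbers in the question; the vector `target` is a stand-in (an assumption for illustration) for rownames(snp.matrix$fam), since the actual snp matrix object is not available here. It also shows match() as an equivalent positional form, which is not mentioned in the thread.

```r
# Example data copied from the question.
data_one <- data.frame(
  x = c(-0.3352623, 0.1862424, 0.1309120, 0.1145205),
  y = c(-1.141462, 0.015944, 0.004005596, -0.125090054),
  z = c(-0.4032494, 0.1329059, 0.06117253, 0.04957881),
  row.names = c("sample_1110001", "sample_1110005",
                "sample_1110420", "sample_2220017")
)

# Hypothetical stand-in for rownames(snp.matrix$fam).
target <- c("sample_2220017", "sample_1110420",
            "sample_1110001", "sample_1110005")

# Check that both sets of names agree before reordering.
stopifnot(length(setdiff(target, rownames(data_one))) == 0,
          length(setdiff(rownames(data_one), target)) == 0)

# Index by name: rows come back in the order given by `target`.
by_name <- data_one[target, ]

# Equivalent positional form: match() returns, for each name in `target`,
# its row position in data_one.
by_match <- data_one[match(target, rownames(data_one)), ]

# The attempt from the question: order() only returns the permutation that
# sorts `target` alphabetically, so the result is not in the target order.
by_order <- data_one[order(target), ]
```

Indexing by name is the shortest form, but it silently produces NA rows if a name in `target` is missing from data_one, which is why the setdiff() check comes first.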