Hi there,

I'm sure there's an easy answer to this, and I can't wait to see it.

The question: is there an easy way to sort a data frame by its row names?

My dilemma:

I've had to pull apart a data frame, run it through a loop to do some calculations and generate new variables, and then reconstruct the chunks back into a data frame at the end.

Doing this preserves the row names from the original data frame (which are informative), so I thought this would make it easy to bring it back to its original order.

However, to do this, I've had to include the row names as a column and then sort by that column at the end, which seems terribly redundant and makes me think there's an easier way to do this.

See some example code below:

###########################
# build a data frame

set.seed(12345)
a <- sample(1:50, 15)
b <- 15:1
c <- rep(1:3, 5)

ex.dat <- data.frame(a, b, c)

# pull it apart; in my case, each chunk is going through a loop and
# being spit out in a list

sub3 <- subset(ex.dat, c == "3")
sub2 <- subset(ex.dat, c == "2")
sub1 <- subset(ex.dat, c == "1")

# put it back together; in my case, pull out the parts of the list
# generated by the loop that hold the data

newdat <- rbind(sub3, sub2, sub1)

# rebuild it such that it can be re-organized into its original order

rn <- as.numeric(row.names(newdat))

new2 <- data.frame(newdat, rn)

new3 = new2[do.call(order, new2["rn"]), ]

######################

It's those last three lines of code that I'm sure are unnecessary; why include a column of information for something that's already there? However, most of the posted solutions for sorting data frames deal with sorting by particular variables, not by the row names (which are typically just 1:n and rarely informative). So this is the solution I came up with, based on what I can find out there currently.

Looking forward to any thoughts or suggestions.

--
Michael D. Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452 Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie
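The pull-apart, loop, and rebuild workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are a deterministic stand-in for the sampled example, and the per-chunk calculation (the `a.centred` column) is hypothetical, not the poster's actual computation.

```r
# Deterministic stand-in for the example data (assumption: no random sampling)
ex.dat <- data.frame(a = 101:115, b = 15:1, c = rep(1:3, 5))

# Pull the data frame apart by the grouping column;
# each chunk keeps the row names it had in ex.dat
chunks <- split(ex.dat, ex.dat$c)

# Hypothetical per-chunk calculation standing in for the loop
chunks <- lapply(chunks, function(d) {
  d$a.centred <- d$a - mean(d$a)
  d
})

# Put the chunks back together in the order 3, 2, 1.
# unname() drops the list names so rbind() does not prefix
# the row names with the group labels.
newdat <- do.call(rbind, unname(rev(chunks)))

row.names(newdat)
```

The rebuilt data frame is in chunk order rather than the original order, but it still carries the original row names, which is what makes sorting by row name possible.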
Can't you just do

    newdat <- newdat[order(row.names(newdat)),]

Or am I missing something?

cheers,

Rolf Turner

On 9/07/2008, at 2:58 PM, Michael Rennie wrote:

> The question: is there an easy way to sort a data frame by its row
> names?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
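One caveat about sorting on `row.names()` directly: row names come back as character strings, so `order()` compares them as text, and "10" sorts before "2". When the row names are whole numbers stored as text, as in the example, converting them with `as.numeric()` inside `order()` restores the numeric order without adding a helper column. A minimal sketch, using deterministic stand-in data rather than the sampled values from the original post:

```r
# Deterministic stand-in for the example data (assumption: no random sampling)
ex.dat <- data.frame(a = 101:115, b = 15:1, c = rep(1:3, 5))

# Rebuild in chunk order 3, 2, 1, as in the original post
newdat <- rbind(subset(ex.dat, c == 3),
                subset(ex.dat, c == 2),
                subset(ex.dat, c == 1))

# Ordering on the row names as text: "10" comes before "2"
by.char <- newdat[order(row.names(newdat)), ]

# Ordering on the row names converted to numbers: 1, 2, ..., 15
by.num <- newdat[order(as.numeric(row.names(newdat))), ]

row.names(by.char)
row.names(by.num)
```

If the row names are genuinely non-numeric labels, the plain character ordering is the appropriate one, and `as.numeric()` would only produce `NA` values with a warning.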
Dear Michael,

Is this what you are looking for?

    ex.dat$rn = as.numeric(rownames(ex.dat))
    ex.dat

    # Are new3 and ex.dat equal?
    all.equal(new3, ex.dat)
    [1] TRUE

HTH,

Jorge

On Tue, Jul 8, 2008 at 10:58 PM, Michael Rennie wrote:

> The question: is there an easy way to sort a data frame by its row names?
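When the original data frame is still available, another option is to skip sorting altogether and look up each original row name in the rebuilt data frame with `match()`. This also works when the row names are labels with no natural sort order. A minimal sketch; the `site*` row names and the small data set are made up for illustration:

```r
# Hypothetical data with informative, non-numeric row names
ex.dat <- data.frame(a = 101:106, b = 6:1, c = rep(1:3, 2),
                     row.names = c("siteF", "siteB", "siteD",
                                   "siteA", "siteE", "siteC"))

# Rebuild in chunk order 3, 2, 1; the row names travel with the rows
newdat <- rbind(subset(ex.dat, c == 3),
                subset(ex.dat, c == 2),
                subset(ex.dat, c == 1))

# Position of each original row name within the rebuilt data frame
idx <- match(row.names(ex.dat), row.names(newdat))

# Reindex so the rows appear in the original order
restored <- newdat[idx, ]

identical(row.names(restored), row.names(ex.dat))
```

This relies on the row names being unique and all present in the rebuilt data frame; a missing name would show up as `NA` in `idx`.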