wangwallace
2011-Jan-11 17:19 UTC
[R] how to sort new data frame based on the original data frame
I have a really simple question.

I have a data frame of 8 variables, plus a first column holding the subjects' ID:

    SubID  G1  G2  G3  G4  W1  W2  W3  W4
      1     6   5   6   2   6   2   2   4
      2     6   4   7   2   6   6   2   3
      3     5   5   5   5   5   5   4   5
      4     5   4   3   4   4   4   5   2
      5     5   6   7   5   6   4   4   1
      6     5   4   3   6   4   3   7   3
      7     3   6   6   3   6   5   2   1
      8     3   6   6   3   6   5   4   7

This data frame has two sets of variables. Each set represents one scale. As shown above, the first scale, say G, consists of four items: G1, G2, G3, and G4, whereas the second scale, say W, also has four items: W1, W2, W3, and W4. The leftmost column lists the subjects' ID.

I drew 100 new random samples based on the data frame. Here is the structure of each new random sample:

    var  var  var  var
     g    g    g    g
     g    g    g    g
     w    w    w    w
     w    w    w    w
     w    w    w    w
     w    w    w    w
     w    w    w    w
     w    w    w    w

Each random sample satisfies the following rules:

### The top two rows have to be filled with 2 random rows of the 8 rows of G numbers. The rest should be filled with 6 random rows of the 8 rows of W numbers. At the same time, the SubIDs of all eight rows should be different from each other.

Here is the syntax I've used:

> fff<-function(dat,g=2,w=6){
+ sel1<-sample(1:8,g)
+ sel2<-sample((1:8)[-sel1],w)
+ M=dat[sel1,2:5]
+ N=dat[sel2,6:9]
+ colnames(N)<-colnames(M)
+ rbind(M,N)
+}
> result<-vector("list",100)
> for(i in 1:100)result[[i]]<-fff(data,2,6)
> result

Here is the first random sample:

> result[[1]]
  G1 G2 G3 G4
3  5  5  5  5
6  5  4  3  6
4  4  4  5  2
1  6  2  2  4
7  6  5  2  1
2  6  6  2  3
5  6  4  4  1
8  6  5  4  7

I am wondering how I can sort the rows of each new random sample so that they follow the order of the SubID in the original data. Specifically, what should I add to the syntax above to make a random sample, say the first one, look like this:

> result[[1]]
  G1 G2 G3 G4
1  6  2  2  4
2  6  6  2  3
3  5  5  5  5
4  4  4  5  2
5  6  4  4  1
6  5  4  3  6
7  6  5  2  1
8  6  5  4  7

Many thanks!!
Joshua Wiley
2011-Jan-11 19:20 UTC
[R] how to sort new data frame based on the original data frame
Hi,

On Tue, Jan 11, 2011 at 9:19 AM, wangwallace <talenttree at gmail.com> wrote:
>
> I have a really simple question
>
> I have a data frame of 8 variables (the first column is the subjects' id):
>
>    SubID  G1  G2  G3  G4  W1  W2  W3  W4
>      1     6   5   6   2   6   2   2   4
>      2     6   4   7   2   6   6   2   3
>      3     5   5   5   5   5   5   4   5
>      4     5   4   3   4   4   4   5   2
>      5     5   6   7   5   6   4   4   1
>      6     5   4   3   6   4   3   7   3
>      7     3   6   6   3   6   5   2   1
>      8     3   6   6   3   6   5   4   7
>
> this data frame have two sets of variables. each set simply represent one
> scale. as shown above, the first scale, say G, consists of four items: G1,
> G2, G3, and G4, whereas the second scale, say W, also has four items: W1,
> W2, W3, W4.
> the leftmost column lists the subjects' ID.
>
> I drew 100 new random samples based on the data frame. here is the structure
> of each new random sample:
>    var  var  var  var
>     g    g    g    g
>     g    g    g    g
>     w    w    w    w
>     w    w    w    w
>     w    w    w    w
>     w    w    w    w
>     w    w    w    w
>     w    w    w    w
>
> each random sample satisfies the following rules:
>
> ###the top two rows have to be filled with 2 random rows of the 8 rows of G
> numbers. the rest should be filled with 6 random rows of the 8 rows of W
> numbers. At the same time, the SubIDs of all eight rows should be different
> among each other.
>
> here below is the syntax I've used:
>
>> fff<-function(dat,g=2,w=6){
> + sel1<-sample(1:8,g)
> + sel2<-sample((1:8)[-sel1],w)
> + M=dat[sel1,2:5]
> + N=dat[sel2,6:9]
> + colnames(N)<-colnames(M)

## just add the order to rbind()

> + rbind(M, N)[order(c(sel1, sel2)), ]
> +}
>> result<-vector("list",100)
>> for(i in 1:100)result[[i]]<-fff(data,2,6)
>> result

For convenience and speed, consider building this (the for loop) into your function. The only part that you actually need looped is the sample(), so you could get some performance gains, if you are interested/that is an issue.

HTH,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
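One possible reading of the suggestion to move the loop into the function is sketched below. It is only an illustration, not code from the thread: the names draw_samples and n are hypothetical, the data frame is rebuilt by hand from the table in the original post, and the sketch simply wraps the sampling in lapply() without attempting any further speed-up. The ordering step is the order(c(sel1, sel2)) idea from the reply above; it relies on the row index being the same as SubID, which holds for this example data.

```r
# Example data copied by hand from the table in the original post.
dat <- data.frame(
  SubID = 1:8,
  G1 = c(6, 6, 5, 5, 5, 5, 3, 3),
  G2 = c(5, 4, 5, 4, 6, 4, 6, 6),
  G3 = c(6, 7, 5, 3, 7, 3, 6, 6),
  G4 = c(2, 2, 5, 4, 5, 6, 3, 3),
  W1 = c(6, 6, 5, 4, 6, 4, 6, 6),
  W2 = c(2, 6, 5, 4, 4, 3, 5, 5),
  W3 = c(2, 2, 4, 5, 4, 7, 2, 4),
  W4 = c(4, 3, 5, 2, 1, 3, 1, 7)
)

# Hypothetical helper (not from the thread): draws n samples in one call.
draw_samples <- function(dat, n = 100, g = 2, w = 6) {
  rows <- seq_len(nrow(dat))
  lapply(seq_len(n), function(i) {
    sel1 <- sample(rows, g)         # rows that contribute their G items
    sel2 <- sample(rows[-sel1], w)  # different rows that contribute their W items
    M <- dat[sel1, 2:5]
    N <- dat[sel2, 6:9]
    colnames(N) <- colnames(M)
    # Row indices equal SubID in this data, so ordering by the selected
    # indices puts the combined rows back in SubID order.
    rbind(M, N)[order(c(sel1, sel2)), ]
  })
}

set.seed(42)  # only so the illustration is repeatable
result <- draw_samples(dat, n = 100)
rownames(result[[1]])
```

Because g + w equals the number of subjects here, every sample contains each subject exactly once, so the row names of each element of result run from 1 to 8 in order. If SubID did not coincide with the row index, ordering by dat$SubID[c(sel1, sel2)] would be the safer variant.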