I couldn't find any reference to this in the FAQ, but is it possible to sort a dataframe by multiple columns?

I've created some code, similar to the following:

nspr.code <- sp.results$sp.code[order( sp.results$sp.code )]
nspr.tpa <- sp.results$tpa[order( sp.results$sp.code )]

nspr.code <- as.character( levels( nspr.code ) )[nspr.code]
nspr.tpa <- as.numeric( levels( nspr.tpa ) )[nspr.tpa]

hope <- as.data.frame( cbind( nspr.code, as.numeric(nspr.tpa) ) )

and it seems to work, but I have dataframes that I would like to sort on multiple columns (numeric and character). Something like:

newframe <- sort( data=frame, list=c(plot,plant,sp) )

Or am I just barking up the wrong tree?

Jeff.

---
Jeff D. Hamann
Forest Informatics, Inc.
PO Box 1421
Corvallis, Oregon USA 97339-1421
541-754-1428
jeff.hamann at forestinformatics.com
www.forestinformatics.com
?order

Jeff D. Hamann wrote:

>I couldn't find any reference to this in the FAQ, but is it possible to sort
>a dataframe by multiple columns?
>
>[...]
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On Fri 26 Mar 2004, Jeff D. Hamann wrote:

>I couldn't find any reference to this in the FAQ, but is it
>possible to sort a dataframe by multiple columns?

>I've created some code, similar to the following:

>nspr.code <- sp.results$sp.code[order( sp.results$sp.code )]
>nspr.tpa <- sp.results$tpa[order( sp.results$sp.code )]

>nspr.code <- as.character( levels( nspr.code ) )[nspr.code]
>nspr.tpa <- as.numeric( levels( nspr.tpa ) )[nspr.tpa]

>hope <- as.data.frame( cbind( nspr.code, as.numeric(nspr.tpa) ) )

A simple way to sort on multiple columns is to paste them together and sort the resulting character vector. That way you only have to do one sort. This is a very old method taught to me in the first computer course I ever took (date censored); the instructor attributed the method to Von Neumann but I have no reference. You have to be careful choosing the sep character in paste. Here is an example:

> set.seed(78)
> foo = data.frame(x= sample(letters[1:3],5,replace=TRUE),y= sample(1:5,5,replace=TRUE))
> foo
  x y
1 c 3
2 c 2
3 b 2
4 c 1
5 c 3

Sorting on y and then by x:

> my.order=order(paste(foo[,2],foo[,1],sep="/"))
> my.order
[1] 4 3 2 1 5

Sorting on x and then by y:

> my.order=order(paste(foo[,1],foo[,2],sep="/"))
> foo[my.order,]
  x y
3 b 2
4 c 1
2 c 2
1 c 3
5 c 3
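A minimal sketch of the paste-key approach described above, with one caveat made explicit: the pasted key is a character vector, so numeric columns are compared as text ("10" sorts before "2") unless they are padded to a fixed width. The data frame and its `id` column are hypothetical, chosen only to expose that case. The separator matters for a similar reason: if it can occur inside the data, two different rows may produce the same key.

```r
# Hypothetical data: 'id' contains a two-digit value to expose the pitfall.
foo <- data.frame(x  = c("c", "c", "b", "c", "c"),
                  id = c(10L, 2L, 2L, 1L, 3L),
                  stringsAsFactors = FALSE)

# Naive key: numbers are compared as text, so "c/10" sorts before "c/2".
naive_order <- order(paste(foo$x, foo$id, sep = "/"))

# Padded key: sprintf() zero-pads to three digits (an assumed maximum width)
# so that text order agrees with numeric order.
padded_order <- order(paste(foo$x, sprintf("%03d", foo$id), sep = "/"))

# Reference: order() with several keys compares each column by its own type.
direct_order <- order(foo$x, foo$id)

foo[padded_order, ]
```

With single-digit values, as in the example in the message, the naive key and the padded key give the same ordering; the difference only appears once the numeric column spans more than one digit.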
Dear list,

I'm migrating to and slowly learning R. I want to extend this multicolumn sorting subject to counting the frequencies of duplicated rows. The motivation is to count the frequencies of individuals with the same haplotype in a population genetic study. A sample of the table (ex.dta) is as follows:

IDNUM DYS19 DYS388 DYS390 DYS393 DYS394 DYS395
TG002   200    129    203    133    251    119
TG053   200    129    203    133    251    119
TG020   200    129    207    133    251    127
TG066    NA     NA     NA     NA     NA     NA
TG104   200    129    203    133    251    119
TG018    NA     NA    199    133     NA    119
TG060   200    129    203    133    251    119
TG058    NA     NA     NA    133     NA     NA
TG009   200    129    203    133    251    119
TG106   200    129    211    137    251    123

I did this:

> ex <- read.table( "ex.dta" , header=T, row.names=1 )
> one <- rep( 1,10 )
> aggregate( one , by=ex , sum )
  DYS19 DYS388 DYS390 DYS393 DYS394 DYS395 x
1   200    129    203    133    251    119 5
2   200    129    211    137    251    123 1
3   200    129    207    133    251    127 1

and got exactly what I wanted. However, as the table grows larger, the script takes longer to complete. For a 300x6 table, after about 10 minutes Windows complained that virtual memory was low and increased the paging file while denying requests from other applications. Eventually R crashed, leaving Windows crippled.

Did I miss something? Are there any ways other than the two-line script above?

Context:
R 1.8.1 on WinXP Pro
Rgui.exe --max-mem-size=400M
Celeron 1GHz, 256 MB RAM, free hard disk space 3.3 GB

All best,

Bambang Suryobroto, D.Sc
Head, Laboratory of Zoology
Department of Biology
Faculty of Mathematics and Natural Sciences
Bogor Agricultural University
Jalan Pajajaran, Bogor 16143
INDONESIA
Tel: +62-251-328391
Fax: +62-251-345011
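One possible alternative, sketched here under assumptions rather than offered as a diagnosis: grouping on all six columns at once may force a cell for every combination of marker values, which grows very quickly. Pasting each row into a single key and tabulating the keys avoids that. The small data frame below is a hypothetical stand-in for `ex` (three of the markers, five individuals); rows containing `NA` are dropped first, which matches the output shown above.

```r
# Hypothetical stand-in for the haplotype table (fewer columns and rows).
ex <- data.frame(
  DYS19  = c(200, 200, 200, NA, 200),
  DYS388 = c(129, 129, 129, NA, 129),
  DYS390 = c(203, 203, 207, NA, 203),
  row.names = c("TG002", "TG053", "TG020", "TG066", "TG104")
)

# Keep only fully typed individuals.
complete <- ex[complete.cases(ex), ]

# One string per row; identical haplotypes give identical keys.
key <- do.call(paste, c(complete, sep = "/"))

# Frequency of each distinct haplotype.
counts <- table(key)

# One representative row per haplotype, with its frequency attached.
first <- !duplicated(key)
haplotypes <- complete[first, ]
haplotypes$n <- as.vector(counts[key[first]])

haplotypes
```

The work here is proportional to the number of rows, not to the number of possible marker combinations, so a 300x6 table should be unproblematic.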
Dear Jeff,

I believe this works:

data.frame <- data.frame[order(data.frame$var.1, data.frame$var.2),]

HTH,
Adrian

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel./Fax: +40 (21) 312.66.18\
          +40 (21) 312.02.10/ int.101
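A short sketch of the `order()` idiom above applied to the kind of frame in the original question, mixing numeric and character keys. The column names `plot`, `plant` and `sp` come from the question; the values, including the species codes, are hypothetical. The object is called `frame` rather than `data.frame` to avoid masking the function of that name.

```r
# Hypothetical data using the column names from the original question.
frame <- data.frame(plot  = c(2, 1, 2, 1, 1),
                    plant = c(1, 2, 1, 1, 2),
                    sp    = c("PSME", "TSHE", "ABGR", "PSME", "ABGR"),
                    stringsAsFactors = FALSE)

# Sort by plot, then plant, then sp; later keys only break ties in earlier ones.
newframe <- frame[order(frame$plot, frame$plant, frame$sp), ]

# Descending on a numeric key: negate it.
by_plot_desc <- frame[order(-frame$plot, frame$plant, frame$sp), ]

newframe
```

The trailing comma inside the square brackets is what selects whole rows; without it, the index would be applied to columns.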