Hi all,

I have some email addresses that I would like to sort in reverse lexicographic order so that addresses from the same domain will be grouped together. How might that be done?

Murray

--
Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz Fax 7 838 4155
Phone +64 7 838 4773 wk +64 7 849 6486 home Mobile 021 1395 862
Murray -

If you could guarantee that all of the email addresses have exactly one occurrence of the "@" character in them, then something like

    spit <- do.call("rbind", strsplit(addresses, "@", FALSE))

will produce a character matrix with two columns, in which the user name (column 1) is separate from the domain name (column 2). Now use

    spit <- spit[ order(spit[ ,2], spit[ ,1]), ]

to re-sort the rows by user name within domain name, and paste() to put the columns back together again.

- tom blackwell - u michigan medical school - ann arbor -

On Mon, 15 Dec 2003, Murray Jorgensen wrote:

> Hi all,
>
> I have some email addresses that I would like to sort in reverse
> lexicographic order so that addresses from the same domain will be
> grouped together. How might that be done?
>
> Murray
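The split-and-order approach in this reply can be sketched end to end as follows. This is only an illustration under the same assumption (exactly one "@" per address); the sample addresses and the names `parts` and `sorted` are made up for the example and do not come from the thread.

```r
# Hypothetical sample data, not from the original thread
addresses <- c("zed@b.org", "amy@b.org", "bob@a.com", "cat@a.com")

# Assumes every address contains exactly one "@";
# the result is a character matrix: column 1 = user, column 2 = domain
parts <- do.call("rbind", strsplit(addresses, "@", fixed = TRUE))

# Sort rows by domain first, then by user name within each domain
parts <- parts[order(parts[, 2], parts[, 1]), , drop = FALSE]

# Glue the two columns back together with "@"
sorted <- paste(parts[, 1], parts[, 2], sep = "@")
print(sorted)
```

Note that this groups addresses only by identical domain strings; it does not place related subdomains next to each other.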
On Mon, 15 Dec 2003 10:08:31 +1300, you wrote:

>Hi all,
>
>I have some email addresses that I would like to sort in reverse
>lexicographic order so that addresses from the same domain will be
>grouped together. How might that be done?

I'm not sure this is what you want, but this function sorts a character vector by last letters, then 2nd last, 3rd last, and so on:

    revsort <- function(x, ...) {
      x[order(unlist(lapply(strsplit(x, ''),
                            function(x) paste(rev(x), collapse=''))), ...)]
    }

    > revsort(as.character(1:20))
     [1] "10" "20" "1"  "11" "2"  "12" "3"  "13" "4"  "14" "5"  "15" "6"  "16" "7"
    [16] "17" "8"  "18" "9"  "19"

The ... args are given to order(), so na.last=FALSE and decreasing=TRUE are possibilities.

Duncan Murdoch
Dr Murray Jorgensen (maj at waikato.ac.nz) wrote:

> I have some email addresses that I would like to sort in reverse
> lexicographic order so that addresses from the same domain will be
> grouped together. How might that be done?

Because he wants addresses from the same domain to be grouped together (so that foo.bar.ick.ac should be in the same group as zoo.sno.ick.ac), it is not sufficient to split at the at-sign.

The obvious method is

(1) reverse the strings
(2) sort the reversed strings
(3) reverse the sorted reversed strings

All of this is obvious except reversing the strings. There's ?rev, which reverses vectors, but no strrev. Here's a strrev() I knocked together quickly:

    strrev <- function (s)
        paste(rev(strsplit(s, character(0))[[1]]), collapse="")

If anyone can tell me how to vectorise this, I would be glad of the lesson.
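The three steps above can be sketched as follows. The helper `strrev_all` and the sample addresses are hypothetical names chosen for this illustration; the helper still loops over the strings internally (via vapply), so it is not the "true" vectorisation asked about, just a way to apply the reversal to a whole vector.

```r
# Reverse every string in a character vector.
# Note: vapply is a loop under the hood, not a vectorised primitive.
strrev_all <- function(s) {
  vapply(strsplit(s, ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

# Hypothetical sample data, not from the original thread
addresses <- c("ann@zoo.sno.ick.ac", "bob@example.com", "cy@foo.bar.ick.ac")

reversed <- strrev_all(addresses)  # (1) reverse the strings
reversed <- sort(reversed)         # (2) sort the reversed strings
result   <- strrev_all(reversed)   # (3) reverse the sorted reversed strings
print(result)
```

With this sample, the two ick.ac addresses end up adjacent even though their full domain names differ, which is the grouping a split at the at-sign alone would not give.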
I wrote:

> If anyone can tell me how to vectorise this, I would be glad of the lesson.

where "this" was

> strrev <-
> function (s) paste(rev(strsplit(s, character(0))[[1]]), collapse="")

Thomas Lumley <tlumley at u.washington.edu> suggested

    strrev <- function(ss) {
        sapply(lapply(strsplit(ss, character(0)), rev), paste, collapse="")
    }

Unfortunately, I failed to explain myself clearly, so this doesn't actually answer the question I _meant_ to ask. For me, sticking in some variant of 'apply' means you have _failed_ to vectorise. The string reversal code in ?rev doesn't count for the same reason.

There is no reason why a built-in strrev() couldn't be as vectorised as most built-ins; it's just not common enough to deserve a lot of effort.
Here is a way to do it without using apply. sep must be set to a character that does not occur in any of the strings. Below we show that it's much faster than using apply, yet gives the same answer.

    strRev <- function(x, sep = "\10") {
        # append sep to each string, then split everything into single characters
        z <- unlist( strsplit( paste( x, sep, sep="" ), "" ) )
        # reverse the whole character stream and split it again at sep
        z <- unlist( strsplit( paste( rev( z ), collapse="" ), sep ) )
        # drop the leading empty piece and restore the original element order
        rev( z[-1] )
    }

    # Following taken from examples in ?strsplit
    strReverse <- function(x)
        sapply(lapply(strsplit(x,NULL), rev), paste, collapse="")

    > data(state)
    > system.time(for(i in 1:100)strRev(state.name))
    [1] 0.22 0.01 0.23   NA   NA
    > system.time(for(i in 1:100)strReverse(state.name))
    [1] 1.07 0.00 1.83   NA   NA
    > all.equal(strRev(state.name),strReverse(state.name))
    [1] TRUE

---
Date: Tue, 16 Dec 2003 17:41:17 +1300 (NZDT)
From: Richard A. O'Keefe <ok at cs.otago.ac.nz>
To: <r-help at stat.math.ethz.ch>
Subject: Re: [R] reverse lexicographic order

> If anyone can tell me how to vectorise this, I would be glad of the lesson.

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
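The strRev() trick above depends on the separator never occurring in the input, and, because strsplit() drops a trailing empty piece, it also seems to lose an empty string at the start of the vector. A guarded variant might look like the sketch below; the name `strRev_checked` and the choice to stop with an error are assumptions made for this illustration, not part of the original post.

```r
# Guarded variant of the separator trick (hypothetical wrapper name).
# "\10" is an octal escape for the backspace character.
strRev_checked <- function(x, sep = "\10") {
  # The trick breaks if sep already occurs in any input string
  if (any(grepl(sep, x, fixed = TRUE))) {
    stop("separator occurs in input; choose a different 'sep'")
  }
  # Empty strings can be silently dropped by strsplit(), so refuse them
  if (any(!nzchar(x))) {
    stop("empty strings are not supported by this approach")
  }
  # Append sep to each string and split everything into single characters
  z <- unlist(strsplit(paste(x, sep, sep = ""), ""))
  # Reverse the whole stream, then split it back into pieces at sep
  z <- unlist(strsplit(paste(rev(z), collapse = ""), sep, fixed = TRUE))
  # Drop the leading empty piece and restore the original element order
  rev(z[-1])
}

print(strRev_checked(c("abc", "de")))
```

The checks cost an extra pass over the input, so they trade away a little of the speed advantage for safer behaviour.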