Dear list,

I'm having second thoughts after solving a very trivial problem: I want to extend the relevel() function to reorder an arbitrary number of levels of a factor in one go. I could not find a trivial way of using the code obtained by getS3method("relevel", "factor"). Instead, I thought of solving the problem in a recursive manner (possibly after reading Paul Graham essays on Lisp too recently). Here is my attempt:

    order.factor <- function (x, ref)
    {
        last.index <- length(ref) # convenience for matlab's end keyword
        if (last.index == 1) return(relevel(x, ref)) # end case, normal case of relevel
        # creating a list with updated parameters,
        # going through the list in reverse order
        my.new.list <- list(x = relevel(x, ref[last.index]),
                            ref = ref[-last.index]) # chop the vector from its last level
        return(do.call(order.factor, my.new.list)) # recursive call
    }

    ff <- factor(c("a", "b", "c", "d"))
    ff
    relevel(ff, levels(ff)[1])
    relevel(ff, levels(ff)[2]) # that's the usual case: you want to put a level first

    order.factor(x = ff, ref = c("a", "b"))
    order.factor(x = ff, ref = c("c"))
    order.factor(x = ff, ref = c("c", "d")) # that's my wish: put c and d in that order as the first two levels

I'm hoping this can be improved in several aspects:

- there is probably already a better function I missed or overlooked (I'd still be curious about the following points, though)

- after reading a few threads, it appears that some recursive functions are fragile in some sense, and I'm not sure what this means in practice. (Should I use Recall, somehow?)

- it's probably quite slow for large data.frames

- I could not think of a good name, this one might clash with some S3 method perhaps?

- any other thoughts welcome!

Best wishes,

Baptiste

_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
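On the Recall() question raised above: one practical sense in which a recursive function that calls itself by name is fragile is that the inner call looks the name up at run time, so the recursion breaks if the function is copied under a new name and the original name is removed or reassigned. Below is a minimal sketch of the same recursion written with Recall(), which refers to the function currently executing. The name relevel_many is purely illustrative and is not a function from base R or any package.

```r
# Hypothetical helper (illustrative name, not part of base R).
# Same recursion as order.factor, but Recall() refers to the function
# that is currently running rather than to a name looked up at run time.
relevel_many <- function(x, ref) {
  n <- length(ref)
  if (n == 1) return(relevel(x, ref))   # base case: plain relevel()
  # move the last requested level to the front, then recurse on the rest
  Recall(relevel(x, ref[n]), ref[-n])
}

ff <- factor(c("a", "b", "c", "d"))
levels(relevel_many(ff, c("c", "d")))   # "c" "d" "a" "b"

# The recursion survives copying the function and removing the original name.
moved <- relevel_many
rm(relevel_many)
levels(moved(ff, c("d", "b")))          # "d" "b" "a" "c"
```

Like the original, this sketch assumes ref has at least one element; an empty ref is not handled.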
I think that you can still use the core of stats:::relevel.factor; the only thing that needs to be changed is the checks for bad values of the 'ref' argument, i.e.,

    relevelNew <- function (x, ref, ...)
    {
        lev <- levels(x)
        # translate level names into positions
        if (is.character(ref))
            ref <- match(ref, lev)
        if (any(is.na(ref)))
            stop("'ref' must be an existing level")
        nlev <- length(lev)
        # 'ref' may now have several elements, so report the valid range
        # rather than formatting a single value with %d
        if (any(ref < 1 | ref > nlev))
            stop(gettextf("all values of 'ref' must be in 1:%d", nlev), domain = NA)
        # requested levels first, then the remaining ones in their original order
        factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
    }

    ff <- factor(c("a", "b", "c", "d"))
    ff
    relevelNew(ff, "c")
    relevelNew(ff, c("c", "d"))

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
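As a hedged illustration of the non-recursive approach in the reply above, here is a small self-contained variant. The name relevel_multi is hypothetical, and the two extra choices are assumptions of this sketch rather than part of the suggested code: it rejects duplicated entries in ref (which would otherwise reach factor() as repeated levels), and it uses setdiff() for the remaining positions so that an empty ref leaves the levels unchanged.

```r
# Hypothetical variant (illustrative name, not from base R or any package).
relevel_multi <- function(x, ref) {
  stopifnot(is.factor(x))
  lev <- levels(x)
  if (is.character(ref)) ref <- match(ref, lev)   # level names -> positions
  if (anyNA(ref) || any(ref < 1 | ref > length(lev)))
    stop("'ref' must contain existing levels")
  if (anyDuplicated(ref))
    stop("'ref' must not contain duplicates")
  # requested levels first, remaining levels in their original order;
  # setdiff() also copes with an empty 'ref'
  factor(x, levels = lev[c(ref, setdiff(seq_along(lev), ref))])
}

ff <- factor(c("a", "b", "c", "d"))
levels(relevel_multi(ff, c("c", "d")))   # "c" "d" "a" "b"
levels(relevel_multi(ff, c(4, 2)))       # "d" "b" "a" "c"
```

Only the levels attribute is reordered; the values of the factor stay the same.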
Dear Baptiste,

You can avoid the recursive stuff. And it will run about twice as fast.

> order.factor <- function (x, ref)
+ {
+   last.index <- length(ref) # convenience for matlab's end keyword
+   if(last.index == 1) return(relevel(x, ref)) # end case, normal case
+   my.new.list <- list(x=relevel(x, ref[last.index]), ref=ref[-last.index])
+   return(do.call(order.factor, my.new.list)) # recursive call
+ }
>
> order.factor2 <- function(x, ref){
+   factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
+ }
> order.factor3 <- function(x, ref){
+   factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])),
+          labels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
+ }
>
> x <- factor(sample(LETTERS[1:5], 10000000, replace = TRUE))
> y <- factor(sample(LETTERS[1:20], 10000000, replace = TRUE))
> system.time(order.factor(x, c("D", "B")))
   user  system elapsed
   5.69    0.38    6.09
> system.time(order.factor2(x, c("D", "B")))
   user  system elapsed
   3.90    0.20    4.12
> system.time(order.factor3(x, c("D", "B")))
   user  system elapsed
   3.26    0.19    3.46
> system.time(order.factor(y, c("D", "B")))
   user  system elapsed
  17.43    0.39   17.84
> system.time(order.factor3(y, c("D", "B")))
   user  system elapsed
   8.25    0.17    8.46

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of baptiste auguie
Sent: Friday 9 January 2009 15:11
To: R R-help
Subject: [R] recursive relevel

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
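One point worth checking when comparing the sort-based functions with relevel(): sort() puts the remaining levels in alphabetical order, whereas relevel() leaves them in their existing order. The two agree whenever the original levels were already sorted, as in the LETTERS benchmark, but can differ otherwise. The sketch below shows the difference on a factor with deliberately non-alphabetical levels; keep_order is a hypothetical name introduced only for this illustration.

```r
# A factor whose levels are deliberately not in alphabetical order.
x <- factor(c("lo", "mid", "hi"), levels = c("lo", "mid", "hi"))

# Sort-based version, as posted in the reply above.
order.factor2 <- function(x, ref) {
  factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
}

# Hypothetical order-preserving variant (illustrative name):
# setdiff() keeps the remaining levels in their existing order.
keep_order <- function(x, ref) {
  factor(x, levels = c(ref, setdiff(levels(x), ref)))
}

levels(order.factor2(x, "mid"))   # "mid" "hi" "lo"  (rest is re-sorted)
levels(keep_order(x, "mid"))      # "mid" "lo" "hi"  (rest keeps its order)
levels(relevel(x, "mid"))         # "mid" "lo" "hi"
```

Which behaviour is preferable depends on whether the existing level order carries meaning, for example for ordinal categories.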