Kaspar Pflugshaupt
2001-Feb-06 12:34 UTC
AW: [Rd] Proposal: Generalizing unique() and duplicated()
On Tuesday 06 February 2001 12:36, Dr. Jens Oehlschlägel wrote:
> I like the idea. Why don't you call duplicated.matrix() directly in
> unique.matrix() and duplicated.data.frame() in unique.data.frame() ?
>
> Jens Oehlschlägel

Good point. I guess I got carried away with using methods (having just gotten
the hang of the concept). :-)

Anyway, here's a corrected version:

----------------------------------------------------

"unique.default" <- get("unique", pos="package:base") # old version becomes
                                                      # default behaviour
"unique" <- function(object, ...)
{
    if (data.class(object)=="matrix")
        return(unique.matrix(object, ...))
    else
        UseMethod("unique")   # doesn't seem to work for matrices, hence
}                             # the condition


"duplicated.default" <- get("duplicated", pos="package:base")

"duplicated" <- function(object, ...)
{
    if (data.class(object)=="matrix")
        return(duplicated.matrix(object, ...))
    else
        UseMethod("duplicated")
}


"duplicated.matrix" <-
function(mat, MARGIN=1)    # defaulting to work on rows
{
    strvect <- drop(apply(mat, MARGIN, function(x) paste(x, collapse = "\r")))
    return(duplicated(strvect))
}


"unique.matrix" <-
function(mat, MARGIN=1)    # defaulting to work on rows
{
    dup <- duplicated.matrix(mat, MARGIN)
    return(if (MARGIN==1) mat[!dup,] else mat[,!dup])
}


"duplicated.data.frame" <-
function(df, MARGIN=1)
{
    strvect <- drop(apply(as.matrix(df), MARGIN, function(x) paste(x, collapse = "\r")))
    duplicated(strvect)
}


"unique.data.frame" <-
function(df, MARGIN=1)
{
    dup <- duplicated.data.frame(df, MARGIN)
    return(if (MARGIN==1) df[!dup,] else df[,!dup])
}

----------------------------------------------------

Cheers

Kaspar Pflugshaupt
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
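For readers who want to try the row-key idea without masking anything in base, here is a minimal sketch. The names dup_by_margin() and unique_by_margin() are hypothetical and not part of the proposal; the drop = FALSE arguments are also an addition, so that a single surviving row or column stays a matrix.

```r
# Hypothetical helper names; they do not replace base::unique or base::duplicated.
dup_by_margin <- function(mat, MARGIN = 1) {
  # Collapse each row (MARGIN = 1) or column (MARGIN = 2) into one string key.
  keys <- apply(mat, MARGIN, function(x) paste(x, collapse = "\r"))
  # Flag every key that has already appeared earlier in the vector.
  duplicated(keys)
}

unique_by_margin <- function(mat, MARGIN = 1) {
  dup <- dup_by_margin(mat, MARGIN)
  # drop = FALSE (an assumption, not in the posted code) keeps the result a matrix.
  if (MARGIN == 1) mat[!dup, , drop = FALSE] else mat[, !dup, drop = FALSE]
}

m <- rbind(c(1, 2), c(3, 4), c(1, 2))
dup_by_margin(m)     # third row repeats the first
unique_by_margin(m)  # 2 x 2 matrix holding the two distinct rows
```

Because the comparison is done on pasted strings, two rows count as equal whenever their character representations match, which is worth keeping in mind for numeric data.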
Prof Brian Ripley
2001-Feb-06 12:49 UTC
AW: [Rd] Proposal: Generalizing unique() and duplicated()
Method dispatch is far from free (it is quite slow). Do we want to encumber
unique() (a fast internal function) in this way?

There are better ways to do this if one is going to use C code: converting to
character and comparing long strings are both expensive.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
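One rough way to see where the time goes in the string-key approach is to time the two steps separately. This is only a sketch: the matrix size and the variable names are arbitrary choices made for illustration, and the timings will vary by machine and R version, so no figures are quoted here.

```r
set.seed(1)
# An arbitrary test matrix: 10000 rows, 3 columns, many repeated rows.
m <- matrix(sample(1:3, 3e4, replace = TRUE), ncol = 3)

# Step 1: convert every row to a single character key.
key_time <- system.time(
  keys <- apply(m, 1, function(x) paste(x, collapse = "\r"))
)

# Step 2: compare the resulting strings.
dup_time <- system.time(
  dup <- duplicated(keys)
)

print(key_time)
print(dup_time)
```

Running this with wider matrices (longer keys) gives a feel for how the conversion and comparison costs grow with row length.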