Suppose I've got p vectors x1, ..., xp each of length n merged in a (n x p)-matrix X. Now I want to calculate the number of distinct values of row-vectors (x_i1, ..., x_ip). Is there a way to do this in R?

Thanks in advance,
Jan.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Mon, 5 Feb 2001, Jan Seidel wrote:

> Suppose I've got p vectors x1, ..., xp each of length n merged in a (n x
> p)-matrix X.
> Now I want to calculate the number of distinct values of row-vectors
> (x_i1, ..., x_ip).
> Is there a way to do this in R? Thanks in advance,

Completely distinct row vectors? Take a look at the code of merge.data.frame. Something like

    # a is the matrix (X in the question)
    bx <- matrix(as.character(a), nrow(a))
    bx <- drop(apply(bx, 1, function(x) paste(x, collapse = "\r")))
    length(unique(bx))

This turns each row into a single character string, and counts the unique ones.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
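To make the row-pasting idea above concrete, here is a minimal sketch on a small made-up matrix. The matrix contents and the names X, row_keys and n_distinct are illustrative assumptions, not part of the original reply. Rows are compared through their character representation, so the sketch also assumes that this conversion distinguishes all values of interest.

```r
# Hypothetical example data; the name X follows the question.
X <- matrix(c(1, 2,
              1, 2,
              3, 4,
              1, 5), ncol = 2, byrow = TRUE)

# Collapse each row into a single string, using "\r" as the separator
# as in the reply above (paste converts the numbers to character).
row_keys <- apply(X, 1, function(x) paste(x, collapse = "\r"))

# Count the distinct row strings.
n_distinct <- length(unique(row_keys))
n_distinct   # 3: rows 1 and 2 are identical
```

The separator should be a character that cannot occur inside the values themselves; otherwise two different rows could collapse to the same string.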
Kaspar Pflugshaupt
2001-Feb-06 10:25 UTC
[Rd] Proposal: Generalizing unique() and duplicated()
Prof. Ripley wrote on r-help:

> Completely distinct row vectors? Take a look at the code of
> merge.data.frame. Something like
>
> bx <- matrix(as.character(a), nrow(a))
> bx <- drop(apply(bx, 1, function(x) paste(x, collapse = "\r")))
> length(unique(bx))
>
> This turns each row into a single character string, and counts the unique
> ones.

Hmmm... couldn't one build on this in order to generalize the unique() function? I'm asking because when I once tried to use unique() on a matrix (to collapse duplicate rows), I found that it and duplicated() work only on vectors. I think a generalization, at least for matrices and simple data.frames, would be useful. I tried my hand at it and came up with this:

----------------------------------------------------
"unique.default" <- get("unique", pos="package:base")  # old version becomes
                                                       # default behaviour

"unique" <- function(object, ...)
{
    if (data.class(object)=="matrix")
        return(unique.matrix(object, ...))
    else
        UseMethod("unique")  # doesn't seem to work for matrices, hence
}                            # the condition

"duplicated.default" <- get("duplicated", pos="package:base")

"duplicated" <- function(object, ...)
{
    if (data.class(object)=="matrix")
        return(duplicated.matrix(object, ...))
    else
        UseMethod("duplicated")
}

"duplicated.matrix" <- function(mat, MARGIN=1)  # defaulting to work on rows
{
    strvect <- drop(apply(mat, MARGIN,
                          function(x) paste(x, collapse = "\r")))
    return(duplicated(strvect))
}

"unique.matrix" <- function(mat, MARGIN=1)  # defaulting to work on rows
{
    dup <- duplicated(mat, MARGIN)
    # drop=FALSE keeps a matrix even if only one row or column survives
    return(if (MARGIN==1) mat[!dup, , drop=FALSE] else mat[, !dup, drop=FALSE])
}

"duplicated.data.frame" <- function(df, MARGIN=1)
{
    strvect <- drop(apply(as.matrix(df), MARGIN,
                          function(x) paste(x, collapse = "\r")))
    duplicated(strvect)
}

"unique.data.frame" <- function(df, MARGIN=1)
{
    dup <- duplicated(df, MARGIN)
    # drop=FALSE keeps a data frame even if only one column survives
    return(if (MARGIN==1) df[!dup, , drop=FALSE] else df[, !dup, drop=FALSE])
}
----------------------------------------------------

I couldn't figure out how to generalize to more than two dimensions (more accurately, how to subset in the dimension given by the variable MARGIN).

Does anybody else consider this useful?

Cheers

Kaspar Pflugshaupt

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
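On the open question of subsetting along the dimension given by MARGIN, one possible approach is to build the index list programmatically and pass it to "[" through do.call(). The following is only a sketch under stated assumptions: subset_margin and unique_slices are hypothetical helper names that do not appear in the proposal above, MARGIN is assumed to be a single dimension number, and slices are compared with the same paste-to-string idea used in duplicated.matrix.

```r
# Hypothetical helper (not part of the proposal above): keep only the
# slices of array `arr` along dimension `MARGIN` selected by `keep`.
subset_margin <- function(arr, MARGIN, keep)
{
    # One index per dimension; TRUE selects everything along it.
    idx <- rep(list(TRUE), length(dim(arr)))
    idx[[MARGIN]] <- keep
    # Equivalent to arr[TRUE, ..., keep, ..., TRUE, drop = FALSE].
    do.call("[", c(list(arr), idx, list(drop = FALSE)))
}

# Hypothetical helper: drop duplicated slices along a single MARGIN.
unique_slices <- function(arr, MARGIN = 1)
{
    # Each slice is collapsed into one string, as in duplicated.matrix.
    keys <- apply(arr, MARGIN, function(x) paste(x, collapse = "\r"))
    subset_margin(arr, MARGIN, !duplicated(keys))
}

# Made-up 2 x 2 x 3 array in which the third slice repeats the first.
arr <- array(c(1:4, 5:8, 1:4), dim = c(2, 2, 3))
res <- unique_slices(arr, MARGIN = 3)
dim(res)   # 2 2 2
```

Passing drop = FALSE keeps all dimensions of the result even when only one slice survives, so the same code path serves matrices and higher-dimensional arrays.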