Hello,

What is the best way to get ranks for a vector of values, limit the range
of rank values and create equal count in each group? I call this uniform
ranking...uniform count/number in each group.

Here is an example using three groups:

Say I have values:
x = c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
names(x) = letters[1:10]
> x
 a  b  c  d  e  f  g  h  i  j
 3  2 -3  1  0  5 10 30 -1  4
I would like:
a b c d e f g h i j
2 2 1 2 1 3 3 3 1 3

Same thing as above, maybe easier to see:
 c  i  e  d  b  a  j  f  g  h
-3 -1  0  1  2  3  4  5 10 30
I would get:
c i e d b a j f g h
1 1 1 2 2 2 3 3 3 3

Note that there are 4 values with a rank of 3 because I can't get even
numbers (10/3 = 3.333).

Been to ?sort, ?order, ?quantile, ?cut, and ?split.

Thanks,

Ben
Hi,

On Wed, Feb 22, 2012 at 4:01 PM, Ben quant <ccquant at gmail.com> wrote:
> Hello,
>
> What is the best way to get ranks for a vector of values, limit the range
> of rank values and create equal count in each group? I call this uniform
> ranking...uniform count/number in each group.
>
> Here is an example using three groups:
>
> Say I have values:
> x = c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
> names(x) = letters[1:10]
>> x
>  a  b  c  d  e  f  g  h  i  j
>  3  2 -3  1  0  5 10 30 -1  4
> I would like:
> a b c d e f g h i j
> 2 2 1 2 1 3 3 3 1 3
>
> Same thing as above, maybe easier to see:
>  c  i  e  d  b  a  j  f  g  h
> -3 -1  0  1  2  3  4  5 10 30
> I would get:
> c i e d b a j f g h
> 1 1 1 2 2 2 3 3 3 3

Thanks for the clear example.

> Note that there are 4 values with a rank of 3 because I can't get even
> numbers (10/3 = 3.333).
>
> Been to ?sort, ?order, ?quantile, ?cut, and ?split.

What's wrong with:

as.numeric(cut(x, c(min(x)-1, quantile(x, .33), quantile(x, .66), max(x) + 1)))
 [1] 1 1 1 2 2 2 3 3 3 3

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org
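
Editorial sketch of the cut-on-quantiles idea above, run on x in its original order. The output printed in the reply lines up with the values in increasing order, so it appears to have been produced from a sorted copy of x. The variable names `breaks` and `grp` are only illustrative, and `setNames()` is added here so the letters stay attached to the group codes.

```r
x <- c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
names(x) <- letters[1:10]

# Break points: just below the minimum, the 33% and 66% quantiles,
# and just above the maximum, so every value falls inside an interval
breaks <- c(min(x) - 1, quantile(x, 0.33), quantile(x, 0.66), max(x) + 1)

# Group codes in the original order of x, with the names restored
grp <- setNames(as.integer(cut(x, breaks)), names(x))
grp

# The same codes arranged by increasing value of x
grp[order(x)]
```

With this particular x the first result matches the "I would like" row of the question and the second matches the "I would get" row. The 0.33 and 0.66 cut points are specific to three groups and would need changing for any other group count.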
On Feb 22, 2012, at 4:01 PM, Ben quant wrote:
> Hello,
>
> What is the best way to get ranks for a vector of values, limit the range
> of rank values and create equal count in each group? I call this uniform
> ranking...uniform count/number in each group.
>
> Here is an example using three groups:
>
> Say I have values:
> x = c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
> names(x) = letters[1:10]
>> x
>  a  b  c  d  e  f  g  h  i  j
>  3  2 -3  1  0  5 10 30 -1  4
> I would like:
> a b c d e f g h i j
> 2 2 1 2 1 3 3 3 1 3
>
> Same thing as above, maybe easier to see:
>  c  i  e  d  b  a  j  f  g  h
> -3 -1  0  1  2  3  4  5 10 30
> I would get:
> c i e d b a j f g h
> 1 1 1 2 2 2 3 3 3 3
>
> Note that there are 4 values with a rank of 3 because I can't get even
> numbers (10/3 = 3.333).
>
> Been to ?sort, ?order, ?quantile, ?cut, and ?split.

You may need to look more carefully at the definitions and adjustments to
`cut` and `quantile`, but this does roughly what you asked:

n=3
as.numeric( cut(x, breaks=quantile(x, prob=(0:n)/n) , include.lowest=TRUE) )
 [1] 1 1 1 1 2 2 2 3 3 3

It's a fairly common task, and Harrell's cut2 function has a g= parameter
(for number of groups) that I generally use:

library(Hmisc)
> cut2(x, g=3)
 [1] [-3, 2) [-3, 2) [-3, 2) [-3, 2) [ 2, 5) [ 2, 5) [ 2, 5) [ 5,30] [ 5,30] [ 5,30]
Levels: [-3, 2) [ 2, 5) [ 5,30]
> as.numeric( cut2(x, g=3))
 [1] 1 1 1 1 2 2 2 3 3 3

>
> Thanks,
>
> Ben
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
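
Editorial sketch: the quantile-based results above put the extra value in the first group (4/3/3), whereas the question asked for it in the last group (3/3/4). One alternative, which is not what either reply proposed, is to work from ranks directly and scale them into 1..n. The helper name `uniform_rank` is hypothetical, and breaking ties by position with `ties.method = "first"` is an assumption about how equal values should be handled.

```r
# Hypothetical helper: assign each value to one of n groups by its rank.
uniform_rank <- function(x, n) {
  # Ranks 1..length(x); equal values are ranked in order of appearance
  r <- rank(x, ties.method = "first")
  # Scale ranks into (0, n] and round up to get group codes 1..n
  g <- as.integer(ceiling(r * n / length(x)))
  setNames(g, names(x))
}

x <- c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
names(x) <- letters[1:10]

res <- uniform_rank(x, 3)
res          # group codes in the original order of x
table(res)   # number of values per group
```

For this x the codes match the "I would like" row of the question, with group sizes 3, 3 and 4. Because the grouping depends only on ranks, tied values can end up in different groups, which may or may not be acceptable for a given use.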