In some contexts, I find the current behavior of rank() very `suboptimal'.

We have the argument na.last = {TRUE | FALSE | NA }, where the first two cases treat NAs (almost) as if they were == +Inf or == -Inf, whereas the 3rd case just drops NAs.
For the typical ``Rank Transformation'' that is recommended in EDA in several contexts, I would however want something else, namely keep the NAs !

An example -- including the new option as I'm proposing it --- makes things more clear :

> y <- c(2:1,NA,0)
> rank(y, na.last = TRUE)## ==== rank(y)
[1] 3 2 4 1
> rank(y, na.last = FALSE)
[1] 4 3 1 2
> rank(y, na.last = NA)
[1] 3 2 1
> rank(y, na.last = "keep") ### <<<<< NEW >>
[1] 3 2 NA 1
>
---

As an alternative to extending the possible values of `na.last', I first thought of a new (boolean) argument, but found the current solution less ugly.

Feedback welcome!

Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><

PS: Stumbled over this while implementing cor.test()'s method = c("pearson", "spearman", "kendall") for cor() itself.
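For readers who want to see the proposed semantics spelled out, here is a minimal user-level sketch of what na.last = "keep" is meant to return. The helper name rank_keep_na is hypothetical and not part of the proposal; it simply ranks the non-missing values and leaves NA in the original positions.

```r
# Hypothetical helper (name is an assumption, not from the post):
# rank the non-NA entries only and keep NA where the input had NA.
rank_keep_na <- function(x) {
  r <- rep(NA_real_, length(x))   # result starts out all NA
  ok <- !is.na(x)                 # positions of the observed values
  r[ok] <- rank(x[ok])            # ranks among the observed values only
  names(r) <- names(x)            # carry names over, if any
  r
}

y <- c(2:1, NA, 0)
rank_keep_na(y)
```

With the example vector from the post, this should reproduce the output shown for the proposed option, with the NA staying in third position.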
Gabor Grothendieck
2003-Sep-11 16:09 UTC
[Rd] rank(*) with NAs -- new option "keep" desired
I could have used this also. Currently I do this:

   z <- ifelse(is.na(y),NA,rank(y))
   names(z) <- names(y)

The following also works but is less transparent:

   z <- y
   z[z==z] <- rank(y)

Another extension of rank that could be useful would be to have an argument that causes it NOT to do tie averaging. This is useful when you are using rank(x) in the sense of an inverse permutation. Currently I do this:

   z <- order(order(y))
   names(z) <- names(y)
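To illustrate the inverse-permutation use of ranks mentioned above, here is a small sketch wrapping the order(order(y)) idiom. The helper name rank_no_ties and the example vector are assumptions made for illustration. Note that order() places NAs last by default, so with this idiom missing values receive the highest ranks rather than being kept as NA.

```r
# Hypothetical helper (name is an assumption): ranks without tie averaging,
# i.e. the inverse of the sorting permutation. Ties are broken by position.
rank_no_ties <- function(x) {
  z <- order(order(x))   # inverse permutation of order(x)
  names(z) <- names(x)   # order() drops names, so restore them
  z
}

x <- c(b = 10, a = 10, c = 5)   # the two 10s are tied
rank(x)                         # tie averaging: both 10s share one rank
rank_no_ties(x)                 # distinct integer ranks, first 10 ranked lower
```

Because the result is a permutation of 1..n, sort(x)[rank_no_ties(x)] gives back the values of x in their original order, which averaged ranks cannot do when ties are present.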