> observation at 0% and the largest at 100% and the value at observation
> point i being (i-1)*100%/(N-1). This is not quite the definition you
> have and often not what people expect...
I recall (and have looked up) that SAS offers five ways to compute
percentiles, which are in short, 1. weighted average, 2. closest observation
number, 3. empirical distribution function, 4. "aimed" weighted
average, and
5. empirical distribution function with averaging. The last is the default.
We once choose a different option, for a certain question. Here is a quick
translation into R, could this be interesting for anyone?
#################################################
# percentile computation functions, adapted from
# SAS Procedures Guide, Version 6, 3rd ed., 625f.
#
perc <- function (x, p = c(5,10,25,50,75,95), pctldef = 5) {
#
# initialization lacking error checking
#
x <- na.omit (as.vector (x)) # transform input into non-empty vector
p <- na.omit (as.vector (p)) # wanted percentiles into non-empty vector
x <- sort (x) # sort it in ascending order
n <- length (x) # length of non-empty vector
#
if (pctldef == 4) n <- n + 1 # increase n by one in case pctldef == 4
#
j <- trunc (n * p / 100) # j is the integer part of the product n * p
g <- - j + (n * p / 100) # g is the fractional part of the product n *
p
#
# the different computational procedures follow
#
if (pctldef == 1) # weighted average at x\sub{np}
# x\sub{0} is taken to be x\sub{1}, cf. above
perc <- (1 - g) * x [j] + g * x [j + 1]
#
if (pctldef == 2) { # observation number closest to np
i <- ifelse (g == 0.5,
ifelse (trunc (j / 2) == j / 2, j, j + 1),
trunc (n * p / 100 + 0.5))
perc <- x [i]}
#
if (pctldef == 3) # empirical distribution function
perc <- x [ifelse (g == 0, j, j + 1)]
#
if (pctldef == 4) # weighted average aimed at x\sub{p*(n+1)}
# x\sub{n+1} is taken to be x\sub{n}, cf.
above
perc <- (1 - g) * x [j] + g * x [j + 1]
#
if (pctldef == 5) # empirical distribution function with
averaging
perc <- ifelse (g == 0,
(x [j] + x [j + 1]) / 2,
x [j + 1])
#
names (perc) <- p
return (perc)
}
# Ralf Herold, ralf herold at charite.de, 17.10.2000 18:42:04
############################################################################
###
# for testing:
test.x <- c (1,2,3,4,5,6,7,8,9,4,2,3,6,7,8,2,5,2)
test.p <- c (26, 51)
for (i in 1:5) print (perc (test.x, test.p, i))
# end.
---- Dr. med. Ralf Herold
| Koordinationszentrale Kompetenznetz P?diatrische Onkologie und H?matologie
| Charit? Campus Virchow-Klinikum Medizinische Fakult?t Humboldt-Universit?t
| D-13353 Berlin, Augustenburger Platz 1, Raum 4.3412 4. Etage Mittelallee 8
| Tel. +49(30)450-66834 Fax -66906 Sprach-/Faxbox +49(180)505254-873936
| mailto:ralf.herold at charite.de http://www.knm-poh.charite.de/
> -----Original Message-----
> From: owner-r-help at stat.math.ethz.ch
> [mailto:owner-r-help at stat.math.ethz.ch]On Behalf Of Peter Dalgaard BSA
> Sent: Tuesday, October 17, 2000 2:40 PM
> To: salwin at mitretek.org
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Percentile function
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._