thr3ads.net - R help - [R] Percentiles/Quantiles with Weighting [Feb 2009]

If this information is useful, please help other people find it:
Share via:

Brigid Mooney

2009-Feb-17 16:57 UTC

[R] Percentiles/Quantiles with Weighting

Hi All,

I am looking at applications of percentiles to time sequenced data.  I had
just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh recent
data more heavily.

I wrote the following function, but it seems quite inefficient, and not
really very flexible in its applications - so if anyone has any suggestions
on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with the most
recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
  Xprime <- NA

  for(i in 1:length(X))
  {
    Xprime <- c(Xprime, rep(X[i], times=i))
  }

  print("Percentiles:")
  print(quantile(X, pctile))
  print("Weighted:")
  print(Xprime)
  print("Weighted Percentiles:")
  print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

	[[alternative HTML version deleted]]

David Winsemius

2009-Feb-17 17:22 UTC

head link

[R] Percentiles/Quantiles with Weighting

I do know that Harrell's Quantile function in the Hmisc package will  
allow quantile estimates from models. Whether it is general enough to  
extend to time series, I have no experience and cannot say.

-- 
David Winsemius


On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote:
> Hi All,
>
> I am looking at applications of percentiles to time sequenced data.   
> I had
> just been using the quantile function to get percentiles over various
> periods, but am more interested in if there is an accepted (and/or
> R-implemented) method to apply weighting to the data so as to weigh  
> recent
> data more heavily.
>
> I wrote the following function, but it seems quite inefficient, and  
> not
> really very flexible in its applications - so if anyone has any  
> suggestions
> on how to look at quantiles/percentiles within R while also using a
> weighting schema, I would be very interested.
>
> Note - this function supposes the data in X is time-sequenced, with  
> the most
> recent (and thus heaviest weighted) data at the end of the vector
>
> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
> {
>  Xprime <- NA
>
>  for(i in 1:length(X))
>  {
>    Xprime <- c(Xprime, rep(X[i], times=i))
>  }
>
>  print("Percentiles:")
>  print(quantile(X, pctile))
>  print("Weighted:")
>  print(Xprime)
>  print("Weighted Percentiles:")
>  print(quantile(Xprime, pctile, na.rm=TRUE))
> }
>
> WtPercentile(1:10)
> WtPercentile(rnorm(10))
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Brigid Mooney

2009-Feb-17 19:58 UTC

head link

[R] Percentiles/Quantiles with Weighting

Thanks for pointing me to the quantreg package as a resource.  I was hoping
to ask be able to address one quick follow-up question...

I get slightly different variants between using the rq funciton with formula
= mydata ~ 1 as I would if I ran the same data using the quantile function.

Example:

mydata <- (1:10)^2/2
pctile <- seq(.59, .99, .1)

quantile(mydata, pctile)
59%    69%    79%    89%    99%
20.015 26.075 32.935 40.595 49.145

rq(mydata~1, tau=pctile)
Call:
rq(formula = mydata ~ 1, tau = pctile)
Coefficients:
            tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
(Intercept)        18      24.5        32      40.5        50
Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted methods of
calculating quantiles?  If so, do you know where I would be able to see the
algorithms used in these functions?  I'm not finding it in the documentation
for function rq, and am new enough to R that I don't know where those
references would generally be.




On Tue, Feb 17, 2009 at 12:29 PM, roger koenker <rkoenker@uiuc.edu> wrote:
> http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869
>
> gives one possibility...
>
> url:    www.econ.uiuc.edu/~roger            Roger Koenker
> email    rkoenker@uiuc.edu            Department of Economics
> vox:     217-333-4558                University of Illinois
> fax:       217-244-6678                Champaign, IL 61820
>
>
>
>
> On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:
>
>   Hi All,
>>
>> I am looking at applications of percentiles to time sequenced data.  I
had
>> just been using the quantile function to get percentiles over various
>> periods, but am more interested in if there is an accepted (and/or
>> R-implemented) method to apply weighting to the data so as to weigh
recent
>> data more heavily.
>>
>> I wrote the following function, but it seems quite inefficient, and not
>> really very flexible in its applications - so if anyone has any
>> suggestions
>> on how to look at quantiles/percentiles within R while also using a
>> weighting schema, I would be very interested.
>>
>> Note - this function supposes the data in X is time-sequenced, with the
>> most
>> recent (and thus heaviest weighted) data at the end of the vector
>>
>> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
>> {
>>  Xprime <- NA
>>
>>  for(i in 1:length(X))
>>  {
>>   Xprime <- c(Xprime, rep(X[i], times=i))
>>  }
>>
>>  print("Percentiles:")
>>  print(quantile(X, pctile))
>>  print("Weighted:")
>>  print(Xprime)
>>  print("Weighted Percentiles:")
>>  print(quantile(Xprime, pctile, na.rm=TRUE))
>> }
>>
>> WtPercentile(1:10)
>> WtPercentile(rnorm(10))
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>>
http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
	[[alternative HTML version deleted]]

Stavros Macrakis

2009-Feb-17 22:48 UTC

head link

[R] Percentiles/Quantiles with Weighting

Here is one kind of weighted quantile function.

The basic idea is very simple:

wquantile <- function( v, w, p )
  {
    v <- v[order(v)]
    w <- w[order(v)]
    v [ which.max( cumsum(w) / sum(w) >= p ) ]
  }

With some more error-checking and general clean-up, it looks like this:

# Simple weighted quantile
#
# v  A numeric vector of observations
# w  A numeric vector of positive weights
# p  The probability 0<=p<=1
#
# Nothing fancy: no interpolation etc.

# Basic idea

wquantile <- function( v, w, p )
  {
    v <- v[order(v)]
    w <- w[order(v)]
    v [ which.max( cumsum(w) / sum(w) >= p ) ]
  }

# Simple weighted quantile
#
# v  A numeric vector of observations
# w  A numeric vector of positive weights
# p  The probability 0<=p<=1
#
# Nothing fancy: no interpolation etc.

wquantile <- function(v,w=rep(1,length(v)),p=.5)
   {
     if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w))
       stop("Values and weights must be equal-length numeric vectors")
     if ( !is.numeric(p) || any( p<0 | p>1 ) )
       stop("Quantiles must be 0<=p<=1")
     ranking <- order(v)
     sumw <- cumsum(w[ranking])
     if ( is.na(w[1]) || w[1]<0 ) stop("Weights must be non-negative
numbers")
     plist <- sumw/sumw[length(sumw)]
     sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ])
   }

I would appreciate any comments people have on this -- whether
correctness, efficiency, style, ....

              -s


On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney <bkmooney at gmail.com>
wrote:> Hi All,
>
> I am looking at applications of percentiles to time sequenced data.  I had
> just been using the quantile function to get percentiles over various
> periods, but am more interested in if there is an accepted (and/or
> R-implemented) method to apply weighting to the data so as to weigh recent
> data more heavily.
>
> I wrote the following function, but it seems quite inefficient, and not
> really very flexible in its applications - so if anyone has any suggestions
> on how to look at quantiles/percentiles within R while also using a
> weighting schema, I would be very interested.
>
> Note - this function supposes the data in X is time-sequenced, with the
most
> recent (and thus heaviest weighted) data at the end of the vector
>
> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
> {
>  Xprime <- NA
>
>  for(i in 1:length(X))
>  {
>    Xprime <- c(Xprime, rep(X[i], times=i))
>  }
>
>  print("Percentiles:")
>  print(quantile(X, pctile))
>  print("Weighted:")
>  print(Xprime)
>  print("Weighted Percentiles:")
>  print(quantile(Xprime, pctile, na.rm=TRUE))
> }
>
> WtPercentile(1:10)
> WtPercentile(rnorm(10))
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Stavros Macrakis

2009-Feb-18 04:25 UTC

head link

[R] Fwd: Percentiles/Quantiles with Weighting

Some minor improvements and corrections below

# Simple weighted quantile
#
# v  A vector of sortable observations
# w A numeric vector of positive weights
# p  The quantile 0<=p<=1
#
# Nothing fancy: no interpolation etc.; NA cases not thought through

 wquantile <- function(v,w=rep(1,length(v)),p=.5)
  {
    if ( !is.numeric(w) || length(v) != length(w) )
      stop("Values and weights must be equal-length vectors")
    if ( !is.numeric(p) || any( p<0 | p>1) )
      stop("Quantiles must be 0<=p<=1")
    if ( min(w) < 0 ) stop("Weights must be non-negative numbers")
    ranking <- order(v)
    sumw <- cumsum(w[ranking])
    plist <- sumw / sumw[ length(sumw) ]
    sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ] )
  }

I would appreciate any comments people have on this -- whether
correctness, efficiency, style, ....

             -s

Dieter Menne

2009-Feb-18 06:59 UTC

head link

[R] Fwd: Percentiles/Quantiles with Weighting

Stavros Macrakis <macrakis <at> alum.mit.edu> writes:
> 
> Some minor improvements and corrections below
> 
> # Simple weighted quantile
> #
> # v  A vector of sortable observations
> # w A numeric vector of positive weights
> # p  The quantile 0<=p<=1
> #
> # Nothing fancy: no interpolation etc.; NA cases not thought through
> 
>  wquantile <- function(v,w=rep(1,length(v)),p=.5)
You could compare it with wtd.quantile in Hmisc.

Dieter

Thomas Lumley

2009-Feb-18 09:50 UTC

head link

[R] Percentiles/Quantiles with Weighting

On Tue, 17 Feb 2009, Brigid Mooney wrote:
> Thanks for pointing me to the quantreg package as a resource.  I was hoping
> to ask be able to address one quick follow-up question...
>
> I get slightly different variants between using the rq funciton with
formula
> = mydata ~ 1 as I would if I ran the same data using the quantile function.
>
> Example:
>
> mydata <- (1:10)^2/2
> pctile <- seq(.59, .99, .1)
>
> quantile(mydata, pctile)
> 59%    69%    79%    89%    99%
> 20.015 26.075 32.935 40.595 49.145
>
> rq(mydata~1, tau=pctile)
> Call:
> rq(formula = mydata ~ 1, tau = pctile)
> Coefficients:
>            tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
> (Intercept)        18      24.5        32      40.5        50
> Degrees of freedom: 10 total; 9 residual
>
> Is it correct to assume this is due to the different accepted methods of
> calculating quantiles?
If you try
  lapply(1:9, function(i)quantile(mydata, pctile,type=i))
the answers from type=1 or 2 agree with rq().

       -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Apparently Analagous Threads

Search for more possibly parallel threads

R help - Feb 2009 - Percentiles/Quantiles with Weighting

[R] Percentiles/Quantiles with Weighting

[R] Percentiles/Quantiles with Weighting

[R] Percentiles/Quantiles with Weighting

[R] Percentiles/Quantiles with Weighting

[R] Fwd: Percentiles/Quantiles with Weighting

[R] Fwd: Percentiles/Quantiles with Weighting

[R] Percentiles/Quantiles with Weighting

Apparently Analagous Threads