thr3ads.net - R help - [R] Problem with Weighted Variance in Hmisc [May 2007]

Tom La Bone

2007-May-31 23:03 UTC

[R] Problem with Weighted Variance in Hmisc

The function wtd.var(x,w) in Hmisc calculates the weighted variance of x,
where w are the weights.  It appears to me that wtd.var(x,w) should equal
var(x) if all of the weights are equal, but this does not appear to be the
case. Can someone point out to me where I am going wrong here?  Thanks.

 

Tom La Bone



jiho

2007-Jun-01 06:16 UTC

On 2007-June-01, at 01:03, Tom La Bone wrote:
> The function wtd.var(x,w) in Hmisc calculates the weighted variance of x,
> where w are the weights.  It appears to me that wtd.var(x,w) should equal
> var(x) if all of the weights are equal, but this does not appear to be the
> case. Can someone point out to me where I am going wrong here?  Thanks.

The usual formula for the weighted variance is this one:
	http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weighvar.pdf
By default, however, wtd.var uses another definition, which treats the
weights as repeat counts (frequencies) rather than as true weights, so the
denominator is sum(w) - 1. If the weights are normalized to sum to the number
of observations, which is what normwt=TRUE does, the two formulas are equal.
If you consider your weights to be real weights instead of repeats, I would
recommend using this option.
With normwt=TRUE, your issue is solved:

 > a=1:10
 > b=a
 > b[]=2
 > b
[1] 2 2 2 2 2 2 2 2 2 2
 > wtd.var(a,b)
[1] 8.68421
# all weights equal 2 <=> there are two repeats of each element of a
 > var(c(a,a))
[1] 8.68421
 > wtd.var(a,b,normwt=T)
[1] 9.166667
 > var(a)
[1] 9.166667
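
The two results above can be reproduced in base R without Hmisc. This is only a sketch of the formula described in this message, not Hmisc's actual source; the helper name wvar_sketch and its normalize argument are made up for illustration.

```r
# Hypothetical helper, not Hmisc's source: weighted variance with an
# optional rescaling of the weights so that they sum to length(x).
wvar_sketch <- function(x, w, normalize = FALSE) {
  if (normalize) w <- w * length(x) / sum(w)  # weights now sum to n
  xbar <- sum(w * x) / sum(w)                 # weighted mean
  sum(w * (x - xbar)^2) / (sum(w) - 1)        # "N - 1" with N = sum(w)
}

a <- 1:10
b <- rep(2, 10)

v_repeats    <- wvar_sketch(a, b)                    # N taken as sum(b) = 20
v_normalized <- wvar_sketch(a, b, normalize = TRUE)  # N taken as length(a) = 10

all.equal(v_repeats, var(c(a, a)))  # TRUE
all.equal(v_normalized, var(a))     # TRUE
```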

Cheers,

JiHO
---
http://jo.irisson.free.fr/

Tom La Bone

2007-Jun-01 11:00 UTC

Thanks.  I have another related question:  

The equation for weighted variance given in the NIST DataPlot documentation
is the usual variance equation with the weights inserted.  The weighted
variance of the weighted mean is this weighted variance divided by N.

There is another approach to calculating the weighted variance of the
weighted mean that propagates the uncertainty of each term in the weighted
mean (see Data Reduction and Error Analysis for the Physical Sciences by
Bevington & Robinson).  The two approaches do not give the same answer. Can
anyone suggest a reference that discusses the merits of the DataPlot
approach versus the Bevington approach?
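
To make the difference concrete, here is a rough base R sketch of the two calculations. The data values are made up, the weights are assumed to be inverse variances (w = 1/sigma^2), and the first formula is my reading of the DataPlot definition (weights rescaled to sum to N, then N - 1 in the denominator), so treat it as an illustration rather than a reference implementation.

```r
# Hypothetical measurements with stated standard deviations (made-up values).
x     <- c(10.1, 9.8, 10.4, 10.0)
sigma <- c(0.1, 0.2, 0.4, 0.2)
w     <- 1 / sigma^2           # inverse-variance weights (an assumption)
n     <- length(x)

xbar_w <- sum(w * x) / sum(w)  # weighted mean

# Approach 1: weighted sample variance (weights rescaled to sum to n),
# divided by n to give the variance of the weighted mean.
s2_w      <- sum(w * (x - xbar_w)^2) / ((n - 1) * sum(w) / n)
var_mean1 <- s2_w / n

# Approach 2: propagate the stated sigmas through the weighted mean.
# This uses only the sigmas, not the observed scatter of x.
var_mean2 <- 1 / sum(w)

c(scatter_based = var_mean1, propagated = var_mean2)
```

Algebraically, the ratio var_mean1 / var_mean2 equals sum(w * (x - xbar_w)^2) / (n - 1), the reduced chi-square, so the two answers agree only when the observed scatter matches the stated uncertainties.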

Tom La Bone


Frank E Harrell Jr

2007-Jun-01 12:11 UTC

The issue is what is being assumed for N in the denominator of the 
variance formula, since the unbiased estimator subtracts one.  Using 
normwt=TRUE means you are in effect assuming N is the number of elements 
in the data vector, ignoring the weights.
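
A small base R sketch of this point, assuming the unnormalized estimator divides by sum(w) - 1 (the helper wvar_sumw is a made-up name, not an Hmisc function): giving every observation the same weight k changes the answer unless the weights are first rescaled to sum to the number of elements.

```r
# Hypothetical helper (not Hmisc's source): weighted variance that takes
# N in the "N - 1" denominator to be the sum of the weights.
wvar_sumw <- function(x, w) {
  xbar <- sum(w * x) / sum(w)
  sum(w * (x - xbar)^2) / (sum(w) - 1)
}

x <- 1:10
scales <- c(1, 2, 10, 1000)   # every observation gets the same weight k

# Unnormalized: the answer depends on k because N = 10 * k.
raw <- sapply(scales, function(k) wvar_sumw(x, rep(k, length(x))))

# Normalized: rescale the weights to sum to length(x), so N = 10 always.
normalized <- sapply(scales, function(k) {
  w <- rep(k, length(x))
  wvar_sumw(x, w * length(x) / sum(w))
})

data.frame(k = scales, raw = raw, normalized = normalized)
```

As k grows, the unnormalized value moves from var(x) toward the divide-by-n variance, sum((x - mean(x))^2) / length(x), while the normalized value stays at var(x).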

Frank Harrell

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
