thr3ads.net - R help - [R] OT: a weighted rank-based, non-paired test statistic ? [Jun 2009]

Dylan Beaudette

2009-Jun-05 22:24 UTC

[R] OT: a weighted rank-based, non-paired test statistic ?

Hi,

Is anyone aware of a rank-based, non-paired test such as the Kruskal-Wallis, 
that can accommodate weights?

Alternatively, would it make sense to simulate a dataset by duplicating 
observations in proportion to their weight, and then using the Kruskal-Wallis 
test?

thanks!
Dylan

Thomas Lumley

2009-Jun-05 23:09 UTC

On Fri, 5 Jun 2009, Dylan Beaudette wrote:
> Is anyone aware of a rank-based, non-paired test such as the Kruskal-Wallis,
> that can accommodate weights?
You don't say what sort of weights, but basically, no.

Whether you have precision weights or sampling weights, the test will no 
longer be distribution-free.
> Alternatively, would it make sense to simulate a dataset by duplicating
> observations in proportion to their weight, and then using the Kruskal-Wallis
> test?
No.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Torsten Hothorn

2009-Jun-09 07:27 UTC

> Date: Fri, 5 Jun 2009 16:09:42 -0700 (PDT)
> From: Thomas Lumley <tlumley at u.washington.edu>
> To: dylan.beaudette at gmail.com
> Cc: "'r-help at stat.math.ethz.ch'" <r-help at stat.math.ethz.ch>
> Subject: Re: [R] OT: a weighted rank-based, non-paired test statistic ?
>
> On Fri, 5 Jun 2009, Dylan Beaudette wrote:
>> Is anyone aware of a rank-based, non-paired test such as the 
>> Kruskal-Wallis,
>> that can accommodate weights?
>
> You don't say what sort of weights, but basically, no.
>
> Whether you have precision weights or sampling weights, the test will no 
> longer be distribution-free.
>
>> Alternatively, would it make sense to simulate a dataset by duplicating
>> observations in proportion to their weight, and then using the 
>> Kruskal-Wallis
>> test?
>
> No.
>
well, if you have case weights, i.e., w[i] == 5 means: there are five 
observations which look exactly like observation i, then there are several 
ways to do it:
> library("coin")
>
> set.seed(29)
> x <- gl(3, 10)
> y <- rnorm(length(x), mean = c(0, 0, 1)[x])
> d <- data.frame(y = y, x = x)
> w <- rep(2, nrow(d)) ### double each obs
>
> ### all the same
> kruskal_test(y ~ x, data = rbind(d, d))
 	Asymptotic Kruskal-Wallis Test

data:  y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337
>
> kruskal_test(y ~ x, data = d[rep(1:nrow(d), w),])
 	Asymptotic Kruskal-Wallis Test

data:  y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337
>
> kruskal_test(y ~ x, data = d, weights = ~ w)
 	Asymptotic Kruskal-Wallis Test

data:  y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337

The first two work by duplicating data; the last one is more memory 
efficient, since it computes weighted statistics (and their distribution) 
instead of copying rows.
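
To illustrate what "computing weighted statistics" can mean for case weights, here is a minimal base R sketch. It is not coin's implementation: weighted_kw is a hypothetical helper written for this example, it assumes positive integer case weights, and it uses the classical tie-corrected chi-squared approximation that stats::kruskal.test uses. The idea is to derive the midranks the expanded sample would have from cumulative weights, so no rows are duplicated.

```r
# Hypothetical helper (not part of coin or base R): Kruskal-Wallis statistic
# for case weights, computed without expanding the data.
# Assumption: w[i] is a positive integer meaning "w[i] copies of observation i".
weighted_kw <- function(y, g, w) {
  g <- factor(g)
  w <- as.numeric(w)
  N <- sum(w)                               # size of the implied expanded sample

  # Map each observation to its distinct value, in increasing order.
  idx <- match(y, sort(unique(y)))
  # Total weight at each distinct value = tie-group size after expansion.
  tie_w <- unname(vapply(split(w, idx), sum, numeric(1)))

  # A tie group occupying ranks (below + 1) ... (below + tie_w) has this midrank.
  below   <- cumsum(tie_w) - tie_w
  midrank <- below + (tie_w + 1) / 2
  r <- midrank[idx]                         # midrank of each original observation

  R_g <- tapply(w * r, g, sum)              # weighted rank sum per group
  n_g <- tapply(w, g, sum)                  # expanded group sizes

  H <- 12 / (N * (N + 1)) * sum(R_g^2 / n_g) - 3 * (N + 1)
  H <- H / (1 - sum(tie_w^3 - tie_w) / (N^3 - N))   # tie correction
  df <- nlevels(g) - 1
  c(statistic = H, df = df, p.value = pchisq(H, df, lower.tail = FALSE))
}

# Same simulated data as in the example above, with every observation doubled.
set.seed(29)
x <- gl(3, 10)
y <- rnorm(length(x), mean = c(0, 0, 1)[x])
w <- rep(2, length(y))

res_weighted <- weighted_kw(y, x, w)

# Reference: expand the data explicitly and call the base R test.
rows <- rep(seq_along(y), w)
res_expanded <- kruskal.test(y[rows], x[rows])

# TRUE if the weighted computation matches the expanded-data test.
all.equal(unname(res_weighted["statistic"]), unname(res_expanded$statistic))
```

The memory use of the weighted version grows with the number of distinct observations rather than with sum(w), which is the practical difference when weights are large. As in the thread, this equivalence only holds for case weights; it says nothing about precision or sampling weights.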

However, as Thomas pointed out, other forms of weights are more difficult 
to deal with.

Best wishes,

Torsten

> 	-thomas
>
> Thomas Lumley			Assoc. Professor, Biostatistics
> tlumley at u.washington.edu	University of Washington, Seattle
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
