thr3ads.net - R help - [R] non-parametric permutation and signed paired-difference distributions [Oct 2011]

Michael Friendly

2011-Oct-14 15:38 UTC

[R] non-parametric permutation and signed paired-difference distributions

Hi all,
Consider the classic data below from Darwin on the heights of 15 pairs of
Zea mays (corn) plants, either cross-fertilized or self-fertilized, where
the goal is to see if it makes a difference.

 > head(ZeaMays)
   pair pot  cross   self   diff
1    1   1 23.500 17.375  6.125
2    2   1 12.000 20.375 -8.375
3    3   1 21.000 20.000  1.000
4    4   2 22.000 20.000  2.000
5    5   2 19.125 18.375  0.750
6    6   2 21.500 18.625  2.875
...

I'd like to illustrate two types of non-parametric tests of whether
mean(diff) = 0.

(a) Permutation test, where the values of, say, self are permuted and
diff = cross - self is calculated for each permutation. There are 15!
permutations, but a reasonably large number of random permutations would
suffice.

(b) Test based on assigning each abs(diff) a + or - sign and calculating
mean(diff). There are 2^15 such possible values, but again, a reasonably
large number of random samples would do.

This is obviously a case for apply and friends, but I can't quite see 
how to set it up.

The complete data:

 > dput(ZeaMays)
structure(list(pair = 1:15, pot = structure(c(1L, 1L, 1L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("1",
"2", "3", "4"), class = "factor"), cross = c(23.5, 12, 21, 22,
19.125, 21.5, 22.125, 20.375, 18.25, 21.625, 23.25, 21, 22.125,
23, 12), self = c(17.375, 20.375, 20, 20, 18.375, 18.625, 18.625,
15.25, 16.5, 18, 16.25, 18, 12.75, 15.5, 18), diff = c(6.125,
-8.375, 1, 2, 0.75, 2.875, 3.5, 5.125, 1.75, 3.625, 7, 3, 9.375,
7.5, -6)), row.names = c(NA, -15L), .Names = c("pair", "pot",
"cross", "self", "diff"), class = "data.frame")



-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

Weidong Gu

2011-Oct-14 17:20 UTC


[R] non-parametric permutation and signed paired-difference distributions

On Fri, Oct 14, 2011 at 11:38 AM, Michael Friendly <friendly at yorku.ca> wrote:
> Hi all
> Consider the classic data below from Darwin on the heights of 15 pairs of
> zea mays (corn) plants
> either cross-fertilized or self-fertilized, where the goal is to see if it
> makes a difference.
>
>> head(ZeaMays)
>    pair pot  cross   self   diff
> 1    1   1 23.500 17.375  6.125
> 2    2   1 12.000 20.375 -8.375
> 3    3   1 21.000 20.000  1.000
> 4    4   2 22.000 20.000  2.000
> 5    5   2 19.125 18.375  0.750
> 6    6   2 21.500 18.625  2.875
> ...
>
> I'd like to illustrate two types of non-parametric tests of whether the
> mean(diff) = 0.
>
> (a) Permutation test, where the values of, say self are permuted and
> diff=cross - self
> is calculated for each permutation.  There are 15! permutations, but a
> reasonably
> large number of random permutations would suffice.
>
> (b) Test based on assigning each abs(diff) a + or - sign, and calculating
> the mean(diff).
> There are 2^15 such possible values, but again, a reasonably large number
> of random
> samples would do.
>
What do you mean by 'assigning each abs(diff) a + or - sign, and
calculating the mean(diff)'?
abs(diff) should be all positive, right?
> This is obviously a case for apply and friends, but I can't quite see
> how to set it up.
>
> The complete data:
>
>> dput(ZeaMays)
> structure(list(pair = 1:15, pot = structure(c(1L, 1L, 1L, 2L,
> 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("1",
> "2", "3", "4"), class = "factor"), cross = c(23.5, 12, 21, 22,
> 19.125, 21.5, 22.125, 20.375, 18.25, 21.625, 23.25, 21, 22.125,
> 23, 12), self = c(17.375, 20.375, 20, 20, 18.375, 18.625, 18.625,
> 15.25, 16.5, 18, 16.25, 18, 12.75, 15.5, 18), diff = c(6.125,
> -8.375, 1, 2, 0.75, 2.875, 3.5, 5.125, 1.75, 3.625, 7, 3, 9.375,
> 7.5, -6)), row.names = c(NA, -15L), .Names = c("pair", "pot",
> "cross", "self", "diff"), class = "data.frame")
>
>
>
> -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology
> Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700
> Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Jean V Adams

2011-Oct-14 18:45 UTC


[R] non-parametric permutation and signed paired-difference distributions

Michael Friendly wrote on 10/14/2011 10:38:44 AM:
>
> Hi all
> Consider the classic data below from Darwin on the heights of 15 pairs 
> of zea mays (corn) plants
> either cross-fertilized or self-fertilized, where the goal is to see if 
> it makes a difference.
> 
>  > head(ZeaMays)
>    pair pot  cross   self   diff
> 1    1   1 23.500 17.375  6.125
> 2    2   1 12.000 20.375 -8.375
> 3    3   1 21.000 20.000  1.000
> 4    4   2 22.000 20.000  2.000
> 5    5   2 19.125 18.375  0.750
> 6    6   2 21.500 18.625  2.875
> ...
> 
> I'd like to illustrate two types of non-parametric tests of whether the
> mean(diff) = 0.
> 
> (a) Permutation test, where the values of, say self are permuted and 
> diff=cross - self
> is calculated for each permutation.  There are 15! permutations, but a 
> reasonably
> large number of random permutations would suffice.

You have paired data.  To conduct a permutation test you would randomly 
assign one member of each pair to being either cross or self-fertilized. 
The other member of the pair would be assigned to the opposite 
"treatment".  That would lead to 2^15 = 32,768 permutations.  See the 
perm.test() function in the R package exactRankTests.
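
The paired relabelling described above can also be sketched in base R, without
any package. This is only an illustration, not the perm.test() implementation:
swapping the two treatments within a pair flips the sign of that pair's
difference, so the permutation distribution is the set of means obtained from
all +/- sign patterns. The names d, B, perm.means and all.means, the seed, and
the choice of B = 9999 are assumptions made for this example.

```r
# Paired differences (cross - self), copied from the ZeaMays data in the post
d <- c(6.125, -8.375, 1, 2, 0.75, 2.875, 3.5, 5.125, 1.75, 3.625,
       7, 3, 9.375, 7.5, -6)
n <- length(d)

# Use sum()/n throughout: the data are multiples of 1/8, so the sums are
# computed exactly and tied values compare equal.
obs <- sum(d) / n                    # observed mean difference

# Randomization version: B random sign patterns (B is an arbitrary choice)
set.seed(1)                          # arbitrary seed, for reproducibility only
B <- 9999
perm.means <- replicate(B, sum(d * sample(c(-1, 1), n, replace = TRUE)) / n)
# Two-sided p-value; the +1 counts the observed labelling itself
p.random <- (sum(abs(perm.means) >= abs(obs)) + 1) / (B + 1)

# Exact version: enumerate all 2^15 = 32768 sign patterns
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
all.means <- drop(signs %*% d) / n
p.exact <- mean(abs(all.means) >= abs(obs))

c(observed = obs, p.random = p.random, p.exact = p.exact)
hist(all.means, breaks = 50, main = "Sign-flip distribution of mean(diff)")
abline(v = c(-obs, obs), lty = 2)    # observed mean and its mirror image
```

With only 15 pairs the full enumeration is cheap, so the random version is
mainly useful for showing that the two approaches agree.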

> (b) Test based on assigning each abs(diff) a + or - sign, and 
> calculating the mean(diff).
> There are 2^15 such possible values, but again, a reasonably large 
> number of random
> samples would do.

Sounds like you are attempting to combine a sign test and a permutation 
test.  You could do this by using the logical cross>self rather than the 
difference cross-self.
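
One possible reading of this suggestion, sketched in base R: keep only the
logical cross > self for each pair and count how often the crossed plant is
taller. Under random relabelling within pairs that count is Binomial(15, 0.5),
so binom.test() gives the exact reference distribution. The names taller, n.pos
and sign.test are assumptions made for this example, and it relies on there
being no tied pairs (cross == self), which holds for these data.

```r
# Heights copied from the ZeaMays data in the original post
cross <- c(23.5, 12, 21, 22, 19.125, 21.5, 22.125, 20.375, 18.25, 21.625,
           23.25, 21, 22.125, 23, 12)
self  <- c(17.375, 20.375, 20, 20, 18.375, 18.625, 18.625, 15.25, 16.5, 18,
           16.25, 18, 12.75, 15.5, 18)

taller <- cross > self               # TRUE where the crossed plant is taller
n.pos  <- sum(taller)                # number of pairs favouring cross

# Exact two-sided sign test: is the proportion of TRUE different from 1/2?
sign.test <- binom.test(n.pos, length(taller), p = 0.5)
sign.test$p.value
```

This uses only the direction of each difference and ignores its size, so it
generally has less power than a test based on mean(diff).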

Jean

> This is obviously a case for apply and friends, but I can't quite see 
> how to set it up.
> 
> The complete data:
> 
>  > dput(ZeaMays)
> structure(list(pair = 1:15, pot = structure(c(1L, 1L, 1L, 2L,
> 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("1",
> "2", "3", "4"), class = "factor"), cross = c(23.5, 12, 21, 22,
> 19.125, 21.5, 22.125, 20.375, 18.25, 21.625, 23.25, 21, 22.125,
> 23, 12), self = c(17.375, 20.375, 20, 20, 18.375, 18.625, 18.625,
> 15.25, 16.5, 18, 16.25, 18, 12.75, 15.5, 18), diff = c(6.125,
> -8.375, 1, 2, 0.75, 2.875, 3.5, 5.125, 1.75, 3.625, 7, 3, 9.375,
> 7.5, -6)), row.names = c(NA, -15L), .Names = c("pair", "pot",
> "cross", "self", "diff"), class = "data.frame")
