thr3ads.net - R help - [R] nonparametric multiple sample comparison [Apr 2004]

If this information is useful, please help other people find it:
Share via:

aolinto_r@bignet.com.br

2004-Apr-25 15:52 UTC

[R] nonparametric multiple sample comparison

Hello all,

Here goes one of my first functions.

I want to make a nonparametric multiple sample comparison with unequal sample 
sizes (see Zar?s Biostatistical Analysis, 3rd. Ed., pg. 201 Example 10.11, pg. 
288 Example 11.10). In the real world, I want to compare samples of fish 
length captured with different fishing gears.

After using the Kruskal-Wallis test I want to check the differences between 
each two samples. 

I wrote the function multcomp (see below) and it?s working OK but I still have 
some doubts. To use it one must have a two-column dataframe with groups and 
measurements.

1) In line 20 (Results <- data.frame....) I create a dataframe to store the
final
results. It has a fixed row number but it would be better to have a variable 
number, i.e., the number of combination of the groups taken 2 by 2 (4 groups = 
5 combinations, 6 groups = 15 combinations). Which function in R returns the 
number of combinations? I couldn?t find it!

2) In lines 35 to 39 I print the result tables. How can I make a
"clean"
print, without row numbers and quotation marks?

3) Also I?d like to calculate the critical values "q" and p-values but
I
couldn?t find the distribution indicated in Zar?s Table B.15 
(App107) "Critical Values of Q for Nonparametric Multiple Comparison
Testing".
Is it possible to calculate these numbers in R?

Thanks in advance for any help or comments to improve this function.

Best regards,

Antonio Olinto
Sao Paulo Fisheries Institute
Brazil

multcomp <- function(DataSet) {
dat.multcomp <- DataSet
names(dat.multcomp) <- c("VarCat","VarNum")
attach(dat.multcomp)
dat.multcomp$Rank <- rank(VarNum)
attach(dat.multcomp)
RankList <- aggregate(Rank,list(Rank=Rank),FUN=length)
t <- length(RankList$Rank)
st <- 0
for (i in 1:t) if (RankList[i,2]>1) st<-st+(RankList[i,2]^3-RankList[i,2])
LevCat <- levels(dat.multcomp$VarCat)
NLevCat <- aggregate(VarCat,list(LevCat=VarCat),FUN=length)
RLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=sum)
MLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=mean)
SampleSummary <- data.frame(LevCat,RLevCat[,2],NLevCat[,2],MLevCat[,2])
names(SampleSummary)<-c("Samples","RSum","N","RMean")
SampleSummary <- SampleSummary[order(SampleSummary$RMean,decreasing=T),]
NCat <- length(LevCat)
N <- length(dat.multcomp$VarCat)
Results <- data.frame(rep(NA,6),rep(NA,6),rep(NA,6),rep(NA,6))
names(Results) <-
c("Comparison","Difference","SE","Q")
l <- 1
for (i in 1:(NCat-1)) {
   for (j in NCat:(i+1)) {
   SE <- sqrt(((N*(N+1)/12)-(st/(12*(N-1))))*((1/SampleSummary[i,3])+
(1/SampleSummary[j,3])))
   Dif <- SampleSummary[i,4]-SampleSummary[j,4]
   Q=Dif/SE
Results[l,1] <- paste(SampleSummary[i,1],"vs",SampleSummary[j,1])
Results[l,2] <- round(Dif,4)
Results[l,3] <- round(SE,4)
Results[l,4] <- round(Q,4)
l <-l+1
   }
}
print("Sample summary ranked by mean ranks")
print(SampleSummary)
print("")
print("Table of multiple comparisons")
print(Results)
}

==========
Data example taken from Zar (Biostatiscical Analysis):

dat.limno <- data.frame(scan(what = list (Pound=" ", pH=0),
sep=" "))
1 7.68
1 7.69
1 7.7
1 7.7
1 7.72
1 7.73
1 7.73
1 7.76
2 7.71
2 7.73
2 7.74
2 7.74
2 7.78
2 7.78
2 7.8
2 7.81
3 7.74
3 7.75
3 7.77
3 7.78
3 7.8
3 7.81
3 7.84
4 7.71
4 7.71
4 7.74
4 7.79
4 7.81
4 7.85
4 7.87
4 7.91

kruskal.test(pH~Pound)

        Kruskal-Wallis rank sum test

data
Kruskal-Wallis chi-squared = 11.9435, df = 3, p-value = 0.007579

multcomp(dat.limno)

[1] "Sample summary ranked by mean ranks"
  Samples  RSum N    RMean
3       3 145.0 7 20.71429
4       4 163.5 8 20.43750
2       2 132.5 8 16.56250
1       1  55.0 8  6.87500
[1] ""
[1] "Table of multiple comparisons"
  Comparison Difference     SE      Q
1     3 vs 1    13.8393 4.6923 2.9493
2     3 vs 2     4.1518 4.6923 0.8848
3     3 vs 4     0.2768 4.6923 0.0590
4     4 vs 1    13.5625 4.5332 2.9918
5     4 vs 2     3.8750 4.5332 0.8548
6     2 vs 1     9.6875 4.5332 2.1370


-------------------------------------------------
WebMail Bignet - O seu provedor do litoral
www.bignet.com.br

Gabor Grothendieck

2004-Apr-25 17:36 UTC

head link

[R] nonparametric multiple sample comparison

I have not attempted to follow this in detail but a few comments 
are interspersed within your discussion and the code:

 <aolinto_r <at> bignet.com.br> writes:

: 
: Hello all,
: 
: Here goes one of my first functions.
: 
: I want to make a nonparametric multiple sample comparison with unequal 
sample 
: sizes (see Zar伮抯 Biostatistical Analysis, 3rd. Ed., pg. 201 Example 10.11, 
pg. 
: 288 Example 11.10). In the real world, I want to compare samples of fish 
: length captured with different fishing gears.
: 
: After using the Kruskal-Wallis test I want to check the differences between 
: each two samples. 
: 
: I wrote the function multcomp (see below) and it伮抯 working OK but I still 
have 
: some doubts. To use it one must have a two-column dataframe with groups and 
: measurements.
: 
: 1) In line 20 (Results <- data.frame伮....) I create a dataframe to store
the
final 
: results. It has a fixed row number but it would be better to have a variable 
: number, i.e., the number of combination of the groups taken 2 by 2 (4 groups 
= 
: 5 combinations, 6 groups = 15 combinations). Which function in R returns the 
: number of combinations? I couldn伮抰 find it!


?choose


: 
: 2) In lines 35 to 39 I print the result tables. How can I make a
"clean"
: print, without row numbers and quotation marks?


?cat

: 
: 3) Also I伮抎 like to calculate the critical values "q" and p-values
but I
: couldn伮抰 find the distribution indicated in Zar伮抯 Table B.15 
: (App107) "Critical Values of Q for Nonparametric Multiple Comparison 
Testing". 
: Is it possible to calculate these numbers in R?
: 
: Thanks in advance for any help or comments to improve this function.
: 
: Best regards,
: 
: Antonio Olinto
: Sao Paulo Fisheries Institute
: Brazil
: 
: multcomp <- function(DataSet) {
: dat.multcomp <- DataSet

You don't need to copy the DataSet.  If you try to modify any object
within a function then R makes a copy of it automatically anyways.

: names(dat.multcomp) <- c("VarCat","VarNum")
: attach(dat.multcomp)

attach is generally frowned upon.  Just call it a short name such as d so that
its easy to refer to.  (Alternately wrap references to data frame in a with.)
I have assumed the data frame is called d in what follows.

: dat.multcomp$Rank <- rank(VarNum)
: attach(dat.multcomp)
: RankList <- aggregate(Rank,list(Rank=Rank),FUN=length)
: t <- length(RankList$Rank)
: st <- 0
: for (i in 1:t) if (RankList[i,2]>1)
st<-st+(RankList[i,2]^3-RankList[i,2])

The last two statements could be reduced to:

st <- sum(ifelse( RankList[,2]>1, RankList[,2]^3-RankList[,2], 0 )

: LevCat <- levels(dat.multcomp$VarCat)
: NLevCat <- aggregate(VarCat,list(LevCat=VarCat),FUN=length)
: RLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=sum)
: MLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=mean)
: SampleSummary <- data.frame(LevCat,RLevCat[,2],NLevCat[,2],MLevCat[,2])
:
names(SampleSummary)<-c("Samples","RSum","N","RMean")

If I have followed this correctly then the above 6 lines can be reduced to:

SampleSummary <- do.call( "rbind", by(d$Rank, d$VarCat, 
	function(x) c(Rsum=sum(x),N=length(x),RMean=mean(x))))

although that produces a matrix rather than a data frame and puts the 
levels as the row rather than a column so you will have to adjust 
what follows.

: SampleSummary <- SampleSummary[order(SampleSummary$RMean,decreasing=T),]
: NCat <- length(LevCat)
: N <- length(dat.multcomp$VarCat)
: Results <- data.frame(rep(NA,6),rep(NA,6),rep(NA,6),rep(NA,6))
: names(Results) <-
c("Comparison","Difference","SE","Q")
: l <- 1
: for (i in 1:(NCat-1)) {
:    for (j in NCat:(i+1)) {
:    SE <- sqrt(((N*(N+1)/12)-(st/(12*(N-1))))*((1/SampleSummary[i,3])+
: (1/SampleSummary[j,3])))
:    Dif <- SampleSummary[i,4]-SampleSummary[j,4]
:    Q=Dif/SE
: Results[l,1] <- paste(SampleSummary[i,1],"vs",SampleSummary[j,1])
: Results[l,2] <- round(Dif,4)
: Results[l,3] <- round(SE,4)
: Results[l,4] <- round(Q,4)
: l <-l+1
:    }
: }
: print("Sample summary ranked by mean ranks")
: print(SampleSummary)
: print("")
: print("Table of multiple comparisons")
: print(Results)
: }
: 
: ==========: 
: Data example taken from Zar (Biostatiscical Analysis):
: 
: dat.limno <- data.frame(scan(what = list (Pound=" ", pH=0),
sep=" "))
: 1 7.68
: 1 7.69
: 1 7.7
: 1 7.7
: 1 7.72
: 1 7.73
: 1 7.73
: 1 7.76
: 2 7.71
: 2 7.73
: 2 7.74
: 2 7.74
: 2 7.78
: 2 7.78
: 2 7.8
: 2 7.81
: 3 7.74
: 3 7.75
: 3 7.77
: 3 7.78
: 3 7.8
: 3 7.81
: 3 7.84
: 4 7.71
: 4 7.71
: 4 7.74
: 4 7.79
: 4 7.81
: 4 7.85
: 4 7.87
: 4 7.91
: 
: kruskal.test(pH~Pound)
: 
:         Kruskal-Wallis rank sum test
: 
: data
: Kruskal-Wallis chi-squared = 11.9435, df = 3, p-value = 0.007579
: 
: multcomp(dat.limno)
: 
: [1] "Sample summary ranked by mean ranks"
:   Samples  RSum N    RMean
: 3       3 145.0 7 20.71429
: 4       4 163.5 8 20.43750
: 2       2 132.5 8 16.56250
: 1       1  55.0 8  6.87500
: [1] ""
: [1] "Table of multiple comparisons"
:   Comparison Difference     SE      Q
: 1     3 vs 1    13.8393 4.6923 2.9493
: 2     3 vs 2     4.1518 4.6923 0.8848
: 3     3 vs 4     0.2768 4.6923 0.0590
: 4     4 vs 1    13.5625 4.5332 2.9918
: 5     4 vs 2     3.8750 4.5332 0.8548
: 6     2 vs 1     9.6875 4.5332 2.1370
: 
: 
: -------------------------------------------------
: WebMail Bignet - O seu provedor do litoral
: www.bignet.com.br
: 
: ______________________________________________
: R-help <at> stat.math.ethz.ch mailing list
: https://www.stat.math.ethz.ch/mailman/listinfo/r-help
: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
: 
:

Liaw, Andy

2004-Apr-26 00:06 UTC

head link

[R] nonparametric multiple sample comparison

Have you looked at the npmc package on CRAN?

HTH,
Andy
> From: aolinto_r at bignet.com.br
> 
> Hello all,
> 
> Here goes one of my first functions.
> 
> I want to make a nonparametric multiple sample comparison 
> with unequal sample 
> sizes (see Zar's Biostatistical Analysis, 3rd. Ed., pg. 201 
> Example 10.11, pg. 
> 288 Example 11.10). In the real world, I want to compare 
> samples of fish 
> length captured with different fishing gears.
> 
> After using the Kruskal-Wallis test I want to check the 
> differences between 
> each two samples. 
> 
> I wrote the function multcomp (see below) and it's working OK 
> but I still have 
> some doubts. To use it one must have a two-column dataframe 
> with groups and 
> measurements.
> 
> 1) In line 20 (Results <- data.frame...) I create a dataframe 
> to store the final 
> results. It has a fixed row number but it would be better to 
> have a variable 
> number, i.e., the number of combination of the groups taken 2 
> by 2 (4 groups = 
> 5 combinations, 6 groups = 15 combinations). Which function 
> in R returns the 
> number of combinations? I couldn't find it!
> 
> 2) In lines 35 to 39 I print the result tables. How can I 
> make a "clean" 
> print, without row numbers and quotation marks?
> 
> 3) Also I'd like to calculate the critical values "q" and 
> p-values but I 
> couldn't find the distribution indicated in Zar's Table B.15 
> (App107) "Critical Values of Q for Nonparametric Multiple 
> Comparison Testing". 
> Is it possible to calculate these numbers in R?
> 
> Thanks in advance for any help or comments to improve this function.
> 
> Best regards,
> 
> Antonio Olinto
> Sao Paulo Fisheries Institute
> Brazil
> 
> multcomp <- function(DataSet) {
> dat.multcomp <- DataSet
> names(dat.multcomp) <- c("VarCat","VarNum")
> attach(dat.multcomp)
> dat.multcomp$Rank <- rank(VarNum)
> attach(dat.multcomp)
> RankList <- aggregate(Rank,list(Rank=Rank),FUN=length)
> t <- length(RankList$Rank)
> st <- 0
> for (i in 1:t) if (RankList[i,2]>1) 
> st<-st+(RankList[i,2]^3-RankList[i,2])
> LevCat <- levels(dat.multcomp$VarCat)
> NLevCat <- aggregate(VarCat,list(LevCat=VarCat),FUN=length)
> RLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=sum)
> MLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=mean)
> SampleSummary <- 
> data.frame(LevCat,RLevCat[,2],NLevCat[,2],MLevCat[,2])
>
names(SampleSummary)<-c("Samples","RSum","N","RMean")
> SampleSummary <- 
> SampleSummary[order(SampleSummary$RMean,decreasing=T),]
> NCat <- length(LevCat)
> N <- length(dat.multcomp$VarCat)
> Results <- data.frame(rep(NA,6),rep(NA,6),rep(NA,6),rep(NA,6))
> names(Results) <-
c("Comparison","Difference","SE","Q")
> l <- 1
> for (i in 1:(NCat-1)) {
>    for (j in NCat:(i+1)) {
>    SE <- sqrt(((N*(N+1)/12)-(st/(12*(N-1))))*((1/SampleSummary[i,3])+
> (1/SampleSummary[j,3])))
>    Dif <- SampleSummary[i,4]-SampleSummary[j,4]
>    Q=Dif/SE
> Results[l,1] <-
paste(SampleSummary[i,1],"vs",SampleSummary[j,1])
> Results[l,2] <- round(Dif,4)
> Results[l,3] <- round(SE,4)
> Results[l,4] <- round(Q,4)
> l <-l+1
>    }
> }
> print("Sample summary ranked by mean ranks")
> print(SampleSummary)
> print("")
> print("Table of multiple comparisons")
> print(Results)
> }
> 
> ==========> 
> Data example taken from Zar (Biostatiscical Analysis):
> 
> dat.limno <- data.frame(scan(what = list (Pound=" ", pH=0),
sep=" "))
> 1 7.68
> 1 7.69
> 1 7.7
> 1 7.7
> 1 7.72
> 1 7.73
> 1 7.73
> 1 7.76
> 2 7.71
> 2 7.73
> 2 7.74
> 2 7.74
> 2 7.78
> 2 7.78
> 2 7.8
> 2 7.81
> 3 7.74
> 3 7.75
> 3 7.77
> 3 7.78
> 3 7.8
> 3 7.81
> 3 7.84
> 4 7.71
> 4 7.71
> 4 7.74
> 4 7.79
> 4 7.81
> 4 7.85
> 4 7.87
> 4 7.91
> 
> kruskal.test(pH~Pound)
> 
>         Kruskal-Wallis rank sum test
> 
> data
> Kruskal-Wallis chi-squared = 11.9435, df = 3, p-value = 0.007579
> 
> multcomp(dat.limno)
> 
> [1] "Sample summary ranked by mean ranks"
>   Samples  RSum N    RMean
> 3       3 145.0 7 20.71429
> 4       4 163.5 8 20.43750
> 2       2 132.5 8 16.56250
> 1       1  55.0 8  6.87500
> [1] ""
> [1] "Table of multiple comparisons"
>   Comparison Difference     SE      Q
> 1     3 vs 1    13.8393 4.6923 2.9493
> 2     3 vs 2     4.1518 4.6923 0.8848
> 3     3 vs 4     0.2768 4.6923 0.0590
> 4     4 vs 1    13.5625 4.5332 2.9918
> 5     4 vs 2     3.8750 4.5332 0.8548
> 6     2 vs 1     9.6875 4.5332 2.1370
> 
> 
> -------------------------------------------------
> WebMail Bignet - O seu provedor do litoral
> www.bignet.com.br
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 

------------------------------------------------------------------------------
Notice:  This e-mail message, together with any attachments,...{{dropped}}

Reasonably Related Threads

Search for more reasonably related threads

R help - Apr 2004 - nonparametric multiple sample comparison

[R] nonparametric multiple sample comparison

[R] nonparametric multiple sample comparison

[R] nonparametric multiple sample comparison

Reasonably Related Threads