thr3ads.net - R help - [R] significance in difference of proportions [Nov 2003]

Arne.Muller@aventis.com

2003-Nov-27 16:04 UTC

[R] significance in difference of proportions

Hello,

I'm looking for some guidance with the following problem:

I've 2 samples A (111 items) and B (10 items) drawn from the same unknown
population. Within A I find 9 "positives" and in B 0 positives. I'd like to
know if the 2 samples A and B are different, i.e. is there a way to find out
whether the number of "positives" is significantly different in A and B?

I'm currently using prop.test, but unfortunately some of my data contains
less than 5 items in a group (like in the example above), and the test
statistics may not hold:
> prop.test(c(9,0), c(111,10))
        2-sample test for equality of proportions with continuity correction

data:  c(9, 0) out of c(111, 10) 
X-squared = 0.0941, df = 1, p-value = 0.759
alternative hypothesis: two.sided 
95 percent confidence interval:
 -0.02420252  0.18636468 
sample estimates:
    prop 1     prop 2 
0.08108108 0.00000000 

Warning message: 
Chi-squared approximation may be incorrect in: prop.test(c(9, 0), c(111, 10))


Do you have suggestions for an alternative test?
	
	many thanks for your help,
	kind regards,

	Arne

Jonathan Baron

2003-Nov-27 17:09 UTC

On 11/27/03 17:04, Arne.Muller at aventis.com wrote:
>Hello,
>
>I'm looking for some guidance with the following problem:
>
>I've 2 samples A (111 items) and B (10 items) drawn from the same unknown
>population. Within A I find 9 "positives" and in B 0 positives. I'd like to
>know if the 2 samples A and B are different, i.e. is there a way to find out
>whether the number of "positives" is significantly different in A and B?
>
>I'm currently using prop.test, but unfortunately some of my data contains
>less than 5 items in a group (like in the example above), and the test
>statistics may not hold:

fisher.test in the ctest package, which loads automatically.

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:            http://www.sas.upenn.edu/~baron

Torsten Hothorn

2003-Nov-27 17:18 UTC

> Hello,
>
> I'm looking for some guidance with the following problem:
>
> I've 2 samples A (111 items) and B (10 items) drawn from the same unknown
> population. Within A I find 9 "positives" and in B 0 positives. I'd like to
> know if the 2 samples A and B are different, i.e. is there a way to find out
> whether the number of "positives" is significantly different in A and B?
>
> I'm currently using prop.test, but unfortunately some of my data contains
> less than 5 items in a group (like in the example above), and the test
> statistics may not hold:
The statistic is fine, the approximation to its null distribution may be
questionable :-)
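
To make that distinction concrete, here is a minimal sketch (not part of the original reply) that inspects the expected cell counts behind the warning and, as one possible workaround, swaps the chi-squared approximation for a simulated reference distribution via chisq.test(simulate.p.value = TRUE). The table layout, the seed and B = 10000 are illustrative assumptions.

```r
# 2x2 table of the counts from the question: rows = outcome, columns = sample.
tab <- matrix(c(9, 102, 0, 10), nrow = 2,
              dimnames = list(outcome = c("positive", "negative"),
                              sample  = c("A", "B")))

# Expected counts under the null of equal proportions. The chi-squared
# approximation is usually doubted when any of these is below about 5.
expected <- suppressWarnings(chisq.test(tab))$expected
print(expected)

# Same Pearson statistic, but its null distribution is simulated
# (conditional on the margins) instead of approximated by chi-squared.
set.seed(1)  # arbitrary seed, only for reproducibility
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
print(sim$p.value)
```

The expected number of positives in B is well below 1 here, which is what triggers the warning; the statistic itself is computed the same way in both calls.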


>
> > prop.test(c(9,0), c(111,10))
>
>         2-sample test for equality of proportions with continuity correction
>
> data:  c(9, 0) out of c(111, 10)
> X-squared = 0.0941, df = 1, p-value = 0.759
> alternative hypothesis: two.sided
> 95 percent confidence interval:
>  -0.02420252  0.18636468
> sample estimates:
>     prop 1     prop 2
> 0.08108108 0.00000000
>
> Warning message:
> Chi-squared approximation may be incorrect in: prop.test(c(9, 0), c(111, 10))
>
>
> Do you have suggestions for an alternative test?
>
you may consider a permutation test for two independent samples:

R> library(exactRankTests)
R> x = c(rep(1, 9), rep(0, 102))
R> y = rep(0, 10)
R> mean(x)
[1] 0.08108108
R> mean(y)
[1] 0
R> perm.test(y, x, exact = TRUE)

        2-sample Permutation Test

data:  y and x
T = 0, p-value = 0.6092
alternative hypothesis: true mu is not equal to 0
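
For readers without exactRankTests, the same exact permutation distribution can be sketched in base R: with binary data, the number of positives that land in the smaller sample under random relabelling is hypergeometric. The two-sided rule used below (count every outcome at least as far from the expected count as the observed one) is an assumption about how the p-value is formed; under it the sketch gives about 0.609, in line with the output above. Variable names are illustrative.

```r
n_pos  <- 9          # positives in the pooled data (all of them from A)
n_neg  <- 102 + 10   # negatives in the pooled data
size_b <- 10         # size of sample B
obs    <- 0          # positives observed in B

# Exact permutation distribution of the number of positives in B.
support <- 0:min(n_pos, size_b)
prob    <- dhyper(support, m = n_pos, n = n_neg, k = size_b)

# Two-sided p-value: outcomes at least as far from the mean as the observed one.
expected    <- size_b * n_pos / (n_pos + n_neg)
far         <- abs(support - expected) >= abs(obs - expected) - 1e-12
p_two_sided <- sum(prob[far])
print(p_two_sided)
```

Fisher's exact test works from the same hypergeometric distribution but sums the probabilities of all tables no more likely than the observed one. Zero positives in B is the single most likely outcome here, so that rule yields a p-value of 1 rather than 0.61.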

Best,

Torsten

> 	many thanks for your help,
> 	+kind regards,
>
> 	Arne

(Ted Harding)

2003-Nov-27 17:43 UTC

On 27-Nov-03 Arne.Muller at aventis.com wrote:
> I've 2 samples A (111 items) and B (10 items) drawn from the same
> unknown population. Within A I find 9 "positives" and in B 0
> positives. I'd like to know if the 2 samples A and B are different,
> i.e. is there a way to find out whether the number of "positives" is
> significantly different in A and B?

Pretty obviously not, just from looking at the numbers:

9 out of 111 -> p = P(positive) approx = 1/10

P(0 out of 10 when p = 1/10) is not unlikely (in fact = 0.35).
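
This back-of-the-envelope figure is easy to check. A minimal sketch using dbinom, first with the rounded p = 1/10 and then with the unrounded estimate 9/111 (the variable names are illustrative):

```r
# Probability of seeing 0 positives in 10 draws if P(positive) = 1/10.
p0_rounded <- dbinom(0, size = 10, prob = 1/10)
print(round(p0_rounded, 2))   # 0.35

# Same calculation with the unrounded estimate from sample A (9 of 111).
p0_observed <- dbinom(0, size = 10, prob = 9/111)
print(round(p0_observed, 2))  # 0.43
```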

However, a Fisher exact test will give you a respectable P-value:

> library(ctest)
> ?fisher.test
> fisher.test(matrix(c(102,9,10,0),nrow=2))
  [...]
  p-value = 1
  alternative hypothesis: true odds ratio is not equal to 1 
  95 percent confidence interval:
   0.000000 6.088391 
> fisher.test(matrix(c(102,9,9,1),nrow=2))
  p-value = 0.5926
> fisher.test(matrix(c(102,9,8,2),nrow=2))
  p-value = 0.2257
> fisher.test(matrix(c(102,9,7,3),nrow=2))
  p-value = 0.0605
> fisher.test(matrix(c(102,9,6,4),nrow=2))
  p-value = 0.01202

So there's a 95% CI (0,6.1) for the odds ratio which, for
identical probabilities of "+", is 1.0 hence well within the CI.
And, keeping the numbers for the larger sample fixed for
simplicity, you have to go quite a way with the smaller one to get
a result significant at 5%:

(102,9):(7,3) -> P = 0.06
(102,9):(6,4) -> P = 0.01

and, to have 80% power (0.8 probability of this event), the
probability of "+" in the second sample would have to be as
high as 0.41.

Conclusion: your second sample size is quite inadequate except
to detect rather large differences between the true proportions
in the two cases!
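
One way to explore this power argument is sketched below. It follows the simplification used above (the larger sample is held fixed at 9 positives out of 111), finds which counts in the sample of 10 give a Fisher p-value below 0.05, and then computes the binomial probability of landing in that set. The helper names fisher_p and power_at are hypothetical, and because the post does not say how the 0.41 figure was obtained, this sketch is not guaranteed to reproduce it.

```r
# Fisher p-value for k positives out of 10 in the small sample,
# with the large sample fixed at 102 negatives and 9 positives.
fisher_p <- function(k) {
  fisher.test(matrix(c(102, 9, 10 - k, k), nrow = 2))$p.value
}

counts <- 0:10
pvals  <- sapply(counts, fisher_p)
reject <- counts[pvals < 0.05]   # counts that are significant at 5%
print(reject)

# Probability of a significant result when the true P(positive) in the
# small sample is p (the large sample is still treated as fixed).
power_at <- function(p) sum(dbinom(reject, size = 10, prob = p))

p_grid <- seq(0.1, 0.9, by = 0.1)
print(round(sapply(p_grid, power_at), 3))
```

Treating both samples as random, or using a different significance rule, would change the numbers, so the grid is best read as a rough guide to how large the difference has to be.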

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 27-Nov-03                                       Time: 17:43:00
------------------------------ XFMail ------------------------------

Arne.Muller@aventis.com

2003-Dec-01 17:41 UTC

Hello,

thanks for the replies to this subject. I'm using a fisher.test to test if
the proportions of my 2 samples are different (see Ted's example below).

The assumption was that the two samples are from the same population and that
they may contain a different number of "positives" (due to different
treatment). 

I may be able to figure out the true probability of getting a "positive",
since for some of my experiments I know the entire population. E.g. the samples
(111 items and 10 items) come from a population of 10,000 items, and I know
that there are 200 positives in the population.

Is it possible to use the Fisher test for testing equality of proportions
and to include the known probability of finding a positive - would that make
sense at all? If the two samples come from the same population, the
probability of finding a positive shouldn't influence the test for difference
of proportions, should it?
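
One possible reading of this question, sketched below, is to compare each sample with the known population proportion rather than with the other sample, for example with an exact binomial test. This is an illustration of the idea and not an answer given in the thread. It treats the draws as independent, which is only an approximation when sampling without replacement from 10,000 items, and the variable names are illustrative.

```r
p_known <- 200 / 10000   # known proportion of positives in the population

# Does each sample's count of positives look consistent with p_known?
test_a <- binom.test(9, 111, p = p_known)
test_b <- binom.test(0, 10,  p = p_known)

print(test_a$p.value)
print(test_b$p.value)
```

This addresses a different question from fisher.test, which compares the two samples with each other and conditions on the total number of positives observed, so the population proportion does not enter it.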

At some point I'd like to extend the statistics so that the two samples can
come from 2 different populations (with known probability for the positives).

I'm happy to receive suggestions and comments on this.

	thanks a lot again for your help,

	Arne 
> 
> On 27-Nov-03 Arne.Muller at aventis.com wrote:
> > I've 2 samples A (111 items) and B (10 items) drawn from the same
> > unknown population. Within A I find 9 "positives" and in B 0
> > positives. I'd like to know if the 2 samples A and B are different,
> > i.e. is there a way to find out whether the number of "positives" is
> > significantly different in A and B?
> 
> Pretty obviously not, just from looking at the numbers:
> 
> 9 out of 111 -> p = P(positive) approx = 1/10
> 
> P(0 out of 10 when p = 1/10) is not unlikely (in fact = 0.35).
> 
> However, a Fisher exact test will give you a respectable P-value:
> 
> > library(ctest)
> > ?fisher.test
> > fisher.test(matrix(c(102,9,10,0),nrow=2))
>   [...]
>   p-value = 1
>   alternative hypothesis: true odds ratio is not equal to 1 
>   95 percent confidence interval:
>    0.000000 6.088391 
> > fisher.test(matrix(c(102,9,9,1),nrow=2))
>   p-value = 0.5926
> > fisher.test(matrix(c(102,9,8,2),nrow=2))
>   p-value = 0.2257
> > fisher.test(matrix(c(102,9,7,3),nrow=2))
>   p-value = 0.0605
> > fisher.test(matrix(c(102,9,6,4),nrow=2))
>   p-value = 0.01202
> 
> So there's a 95% CI (0,6.1) for the odds ratio which, for
> identical probabilities of "+", is 1.0 hence well within the CI.
> And, keeping the numbers for the larger sample fixed for
> simplicity, you have to go quite a way with the smaller one to get
> a result significant at 5%:
> 
> (102,9):(7,3) -> P = 0.06
> (102,9):(6,4) -> P = 0.01
> 
> and, to have 80% power (0.8 probability of this event), the
> probability of "+" in the second sample would have to be as
> high as 0.41.
> 
> Conclusion: your second sample size is quite inadequate except
> to detect rather large differences between the true proportions
> in the two cases!
> 
> Best wishes,
> Ted.
