thr3ads.net - R help - [R] chisq.test, basic question [Jul 2002]


Huntsinger, Reid

2002-Jul-30 16:07 UTC

[R] chisq.test, basic question

The cells are interpreted as counts, so by scaling you're analyzing a
different experiment (one with fewer observations). So the chi-squared value
will change: the terms (O-E)^2/E in the statistic scale linearly with the
total count, ignoring rounding and "Yates' continuity correction".
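The linear scaling can be checked directly. This is a minimal sketch that reuses the counts from the original post and turns off the continuity correction (correct = FALSE) so that the proportionality is exact; the helper name x2 is only illustrative.

```r
# Counts from the original post (2 x 2 table).
m <- matrix(c(15, 28, 32, 135), 2, 2)

# Uncorrected Pearson statistic for the table multiplied by k.
x2 <- function(k) unname(chisq.test(k * m, correct = FALSE)$statistic)

base <- x2(1)
ratios <- sapply(c(2, 3, 10), function(k) x2(k) / base)
print(ratios)  # each ratio should equal its multiplier k
```

With the default correct = TRUE the ratios are only approximately equal to k, because the continuity correction does not scale with the counts.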

The chisq.test on the original data is a test of association. Conventionally
you decide ahead of time on a threshold for "false positives", say 5%, then
use the reported p-value to determine whether to accept or reject the null
hypothesis of no association. Had you chosen 5%, since the reported p-value
is smaller than 5%, you would reject, i.e., decide that association is
present.
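As a small illustration of that decision rule, the sketch below fixes the threshold first and then compares it with the p-value that chisq.test reports for the posted counts. The variable names alpha, res and reject are illustrative only.

```r
m <- matrix(c(15, 28, 32, 135), 2, 2)

alpha <- 0.05          # threshold chosen before looking at the data
res <- chisq.test(m)   # Yates' correction is applied by default for 2 x 2 tables

# Reject the null hypothesis of no association when p falls below alpha.
reject <- res$p.value < alpha
reject
```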

The chisq.test statistic is not really a measure of association, because it
depends on the number of observations as well as on the pattern in the table.
Your observation is a nice illustration of why. There are many measures of
association (e.g., the odds ratio); see for example Alan Agresti's
"Categorical Data Analysis" for some discussion.
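To make the contrast concrete, here is a sketch of the sample odds ratio (cross-product ratio) for the posted table. The function odds_ratio is a hypothetical helper written for this illustration, not something from the thread or from base R. Unlike the chi-squared statistic, it gives the same value for counts and for unrounded percentages.

```r
m <- matrix(c(15, 28, 32, 135), 2, 2,
            dimnames = list(c("P-", "P+"), c("R-", "R+")))

# Sample odds ratio (cross-product ratio) of a 2 x 2 table.
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

or_counts  <- odds_ratio(m)                  # from the raw counts
or_percent <- odds_ratio(100 * m / sum(m))   # from unrounded percentages
c(counts = or_counts, percent = or_percent)
```

Rounding the percentages first would change the value slightly, since the rounded cells are no longer exact multiples of the counts.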

Reid Huntsinger

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


------------------------------------------------------------------------------
Notice:  This e-mail message, together with any attachments, contains
information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that
may be confidential, proprietary copyrighted and/or legally privileged, and is
intended solely for the use of the individual or entity named in this message. 
If you are not the intended recipient, and have received this message in error,
please immediately return this by e-mail and then delete it.

=============================================================================

juli g. pausas

2002-Jul-30 16:11 UTC

[R] chisq.test, basic question

Dear R-users,
I have a question, which I'm not sure if it is related to my
misunderstanding of basic statistics, or my misunderstanding of R, or
both.
I've got the counts of a 2 x 2 contingency table, and I'd like to test
the association:

m <-  matrix(c(15,28,32,135), 2, 2)
colnames(m) <- c("R-", "R+"); rownames(m) <- c("P-", "P+")
m
#    R-  R+
# P- 15  32
# P+ 28 135

chisq.test(m)  # X-squared = 4.0027, df = 1, p-value = 0.04543

Is this the correct way to test association between P and R? (I haven't
got the original data).
My problem is that if I use percentage, then I get different results:

m2 <- 100*m/sum(m) #
chisq.test(round(m2)) # X-squared = 1.5318, df = 1, p-value = 0.2158

Should this give about the same (apart from the rounding)? Should the
degree of association between P and R be the same?  Or, am I using
chisq.test() wrongly?

Thanks in advance,

Juli



Brett Magill

2002-Jul-30 19:55 UTC

[R] chisq.test, basic question

From help(chisq.test):
If `x' is a matrix with at least two rows and columns, it is taken as a
two-dimensional contingency table, and hence its entries should be nonnegative
"INTEGERS".
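One way to see what this requirement means for the percentage table is a quick check of whether each cell is a whole number. The helper is_count_table below is hypothetical (it is not part of R or of the thread) and is only a sketch of such a check.

```r
m  <- matrix(c(15, 28, 32, 135), 2, 2)
m2 <- 100 * m / sum(m)

# Hypothetical helper: TRUE when every cell is a nonnegative whole number.
is_count_table <- function(tab) all(tab >= 0) && all(tab == round(tab))

is_count_table(m)          # raw counts
is_count_table(m2)         # percentages
is_count_table(round(m2))  # whole numbers again, but a different total
sum(round(m2))             # total of the rounded table, to compare with sum(m)
```

Passing the check does not make the rounded percentages valid data: they describe a hypothetical experiment with a different number of observations.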


Huntsinger, Reid

2002-Jul-30 21:15 UTC

[R] chisq.test, basic question

My previous reply uses "false positive" in a particularly misleading
way. I intended this to mean "incorrect rejection of the null hypothesis of
no association". I succumbed to the temptation to call a "rejection of the
null hypothesis of no association" a "positive" (cancelling a double
negative?), but as it is a rejection (of no matter what) I should have
called it a "negative".

Reid Huntsinger


Jan_Svatos@eurotel.cz

2002-Jul-31 06:52 UTC

[R] chisq.test, basic question

Hi,

your first use of chisq.test is correct.
But by multiplying by 100 and dividing by sum(m) (210), you analyze a
different experiment (with fewer "observations") and, in general, this is a
_gross_ mistake.
In general, your example is a (very basic, though) well-known problem of
statistical vs. practical "significance".
Just try chisq.test(2*m), chisq.test(3*m), etc.
With a sufficiently large sample it is almost sure (in the practical, not
mathematical, meaning) that you get a statistically significant difference
even when the practical, "real-life" difference is negligible.

A trivial example:
m<-matrix(c(100,101,110,115),2,2) #rows and cols are "practically" independent
chisq.test(m)  #X-squared = 0.0065, df = 1, p-value = 0.9357
chisq.test(10*m)  #X-squared = 0.2823, df = 1, p-value = 0.5952
chisq.test(100*m)  #X-squared = 3.1241, df = 1, p-value = 0.07714
chisq.test(1000*m)  #X-squared = 31.551, df = 1, p-value = 1.943e-08

Therefore, your question about m2 is due to a misunderstanding of the
math-statistical principles behind chisq.test.
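The gap between statistical and practical significance can be made visible by reporting an effect size next to the p-value. The sketch below uses the phi coefficient, sqrt(X^2 / n) with the uncorrected statistic, which is one common choice for 2 x 2 tables and is an addition here rather than something proposed in the thread. The helper name phi is illustrative.

```r
m <- matrix(c(100, 101, 110, 115), 2, 2)

# Phi coefficient for a 2 x 2 table: sqrt(X^2 / n), uncorrected statistic.
phi <- function(tab) {
  x2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  sqrt(x2 / sum(tab))
}

scales <- c(1, 10, 100, 1000)
data.frame(
  scale   = scales,
  p_value = sapply(scales, function(k) chisq.test(k * m)$p.value),
  phi     = sapply(scales, function(k) phi(k * m))
)
```

The p-value shrinks as the table is scaled up, while phi stays the same because it depends only on the cell proportions.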

HTH,
Jan

-------------------------------------------------
designed for _monospaced_ font
-------------------------------------------------
/- Jan Svatos,  PhD         Sokolovska 855/225 -/
/- Data Analyst             Prague 9           -/
/- Eurotel Praha            190 00             -/
/- jan_svatos at eurotel.cz    Czechia            -/
-------------------------------------------------


