mik07
2009-Aug-18 14:29 UTC
[R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think.)
Hi,

I am working on a system that automatically answers user questions (such systems are commonly called "Question Answering systems"). I evaluated different versions of the same system on publicly available test sets. Naturally, there is a fixed number of questions in each test set, and the system answers some right and some wrong. I want to compare each version of the system against a baseline and see whether the increase is statistically significant. I used one-tailed chi-square tests for this. Here's the data I got:

Test set 1:

            total   incorrect   correct   p
baseline    1908    1718        190
version_1   1908    1698        210       0,145
version_2   1908    1690        218       0,071
version_3   1908    1677        231       0,017

I compared every version with the baseline, so that I get a 2x2 contingency table, as here:

            incorrect   correct
baseline    1718        190
version_1   1698        210

p: 0,145

This works fine, and the results seem to make sense intuitively. First question: do you think this is a legitimate way to compute significance?

But then I also have figures on *partial* test sets, because there are some questions for which we just cannot expect the system to return correct answers. (The reason for this is beyond the scope of this post.) So different versions of the system work on test sets of different sizes. Then we get:

Test set 2:

            total   incorrect   correct   p
baseline    898     708         190
version_1   898     688         210       0,128
version_2   898     680         218       0,057
version_3   1021    790         231       0,219

Here, the p value for version_3 (when compared with the baseline) seems to make no sense whatsoever. It shouldn't be larger than the other two p values; the increase in correct answers (that is what counts!) is bigger, after all. Any idea what's going on here? I thought the sample size should have no impact on the results?
Thanks a lot,
Mika

--
View this message in context: http://www.nabble.com/Odd-results-with-Chi-square-test.-%28Not-an-R-problem%2C-but-general-statistics%2C-I-think.%29-tp25026167p25026167.html
Sent from the R help mailing list archive at Nabble.com.
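
The p values quoted in the post above can be reproduced with a short sketch, which may help show where the version_3 figure for test set 2 comes from. The Python below is only an illustration under two assumptions that are not stated in the post: the chi-square test was run without continuity correction, and the one-tailed p value is half the two-sided one. The function name `one_tailed_chi_square_p` is hypothetical. The sketch also prints each system's proportion of correct answers, since the chi-square test on a 2x2 table compares proportions rather than raw counts of correct answers.

```python
import math


def one_tailed_chi_square_p(incorrect_a, correct_a, incorrect_b, correct_b):
    """Uncorrected Pearson chi-square test on a 2x2 table (assumed setup).

    Row a is the baseline, row b is the version under test. Returns the
    one-tailed p value for "b has a higher correct rate than a", plus
    both correct rates.
    """
    table = [[incorrect_a, correct_a], [incorrect_b, correct_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Sum of (observed - expected)^2 / expected over the four cells.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    two_sided = math.erfc(math.sqrt(chi2 / 2))

    rate_a = correct_a / row_totals[0]
    rate_b = correct_b / row_totals[1]
    # Halving is only valid when the difference points in the tested direction.
    p = two_sided / 2 if rate_b > rate_a else 1 - two_sided / 2
    return p, rate_a, rate_b


# Counts copied from the post: (incorrect, correct).
test_set_2 = {
    "baseline": (708, 190),
    "version_1": (688, 210),
    "version_2": (680, 218),
    "version_3": (790, 231),
}

base = test_set_2["baseline"]
for name, counts in test_set_2.items():
    if name == "baseline":
        continue
    p, base_rate, rate = one_tailed_chi_square_p(*base, *counts)
    print(f"{name}: correct rate {rate:.3f} vs baseline {base_rate:.3f}, p = {p:.3f}")
```

Under these assumptions the sketch agrees with the p values in both tables to three decimals. For test set 2 it also shows that version_3 has the most correct answers (231) but, because its total is 1021 rather than 898, a lower proportion correct than version_2, which is consistent with its larger p value.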
mik07
2009-Aug-19 10:55 UTC
[R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think.)
Anybody any ideas? Any help would be appreciated!

Cheers,
Mika
David Winsemius
2009-Aug-19 13:16 UTC
[R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think.)
On Aug 19, 2009, at 6:55 AM, mik07 wrote:

> Anybody any ideas?

Post in an appropriate forum? sci.stat.consult?

> Any help would be appreciated!
>
> Cheers,
> Mika

David Winsemius, MD
Heritage Laboratories
West Hartford, CT