Polwart Calum (County Durham and Darlington NHS Foundation Trust)
2009-Aug-18 17:17 UTC
[R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think)
I'm far from an expert on stats, but what I think you are saying is that if you compare baseline with version 3, your p-value is not as good as for versions 1 and 2. I'm not 100% sure you are meant to compare p-values like that, but I'll let someone else comment on it.

            total  incorrect  correct  % correct
baseline      898        708      190      21.2%
version_1     898        688      210      23.4%
version_2     898        680      218      24.3%
version_3    1021        790      231      22.6%

> Here, the p value for version_3 (when compared with the baseline) seems
> to make no sense whatsoever. It shouldn't be larger than the other two p
> values, the increase in correct answers (that is what counts!) is bigger
> after all.

No, it's not the raw numbers, it's the proportion of correct answers that counts. I've added a % correct column to your data - does that make it clearer? Only 22.6% of version 3's answers were correct, so the improvement over baseline is smaller in percentage terms than the one version 1 and version 2 produced.

From my naive perspective I'd want to test for a difference between each version and baseline, and then v1 & v2, v1 & v3, v2 & v3 (you may tell me those are unsound things to test - in which case don't test them); there's a rough R sketch of this at the end of the message. You'd then need to decide a threshold for calling a difference significant (say p < 0.05). I'd contend that the test should be two-tailed - results could be better or worse.

You should also develop a hypothesis. Let me create some for you:

A. H1: version 1 of the software is better than baseline (H0: version 1 is no better than baseline)
B. H1: version 2 of the software is better than version 1 (H0: version 2 is no better than version 1)
C. H1: version 3 of the software is better than version 2 (H0: version 3 is no better than version 2)

Now look at your results and p-values and work out whether H1 or H0 applies in each case. You could develop further variants (D: version 3 is better than baseline).

Finally, remember to consider the 'clinical significance' as well as the statistical significance. I'd have hoped a software change might increase correct answers to, say, 40%. And remember also that a p-value threshold of 0.05 carries a false positive rate of about 1 in 20.

> Any idea what's going on here? I thought the sample size should have no
> impact on the results?

Erm.. sample size always has an influence on results. If you show a given difference in 100 samples, you would expect a larger p-value from virtually any statistical test you chose than if you showed the same difference in 1000 samples. You have a bigger sample but a smaller overall difference, so in effect you can be less sure that the change is not down to chance. (Purist statisticians will likely challenge that description.) The second snippet at the end of the message illustrates this.
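For what it's worth, here is a minimal R sketch of the baseline-versus-version comparisons I described above. The object names are my own invention, and I'm assuming prop.test's default two-sided test of two proportions (with continuity correction) is an acceptable stand-in for whatever chi-square call you actually ran:

correct <- c(baseline = 190, version_1 = 210, version_2 = 218, version_3 = 231)
total   <- c(baseline = 898, version_1 = 898, version_2 = 898, version_3 = 1021)

round(100 * correct / total, 1)    # reproduces the "% correct" column above

## each version against baseline, two-sided test of two proportions
for (v in c("version_1", "version_2", "version_3")) {
    p <- prop.test(correct[c("baseline", v)], total[c("baseline", v)])$p.value
    cat(v, "vs baseline: p =", round(p, 4), "\n")
}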
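And a quick illustration of the sample-size point: the proportions are identical in the two calls below, but with ten times the data (the second line's counts are made up purely for illustration) the same difference produces a much smaller p-value:

prop.test(c(190, 231),   c(898, 1021))      # your actual baseline vs version_3 counts
prop.test(c(1900, 2310), c(8980, 10210))    # same proportions, ten times the data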