Panos Hadjinicolaou
2010-Aug-10 11:30 UTC
[R] one (small) sample wilcox.test confidence intervals
Dear R people,

I notice that the confidence intervals of a very small sample (e.g. n=6) derived from the one-sample wilcox.test are just the maximum and minimum values of the sample. This only occurs when the required confidence level is higher than 0.93. Example:

> sample <- c(1.22, 0.89, 1.14, 0.98, 1.37, 1.06)
> summary(sample)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.89    1.00    1.10    1.11    1.20    1.37
> wilcox.test(sample,conf.int=TRUE,conf.lev=0.95)

        Wilcoxon signed rank test

data:  sample
V = 21, p-value = 0.03125
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
 0.89 1.37
sample estimates:
(pseudo)median
           1.1

According to "help", since my sample contains less than 50 values, an exact p-value is calculated that should enable the confidence interval to be obtained from Bauer (1972) (I have not read it):

<<
By default (if ‘exact’ is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used.

...........

If exact p-values are available, an exact confidence interval is obtained by the algorithm described in Bauer (1972), and the Hodges-Lehmann estimator is employed. Otherwise, the returned confidence interval and point estimate are based on normal approximations. These are continuity-corrected for the interval but _not_ the estimate (as the correction depends on the ‘alternative’).

With small samples it may not be possible to achieve very high confidence interval coverages. If this happens a warning will be given and an interval with lower coverage will be substituted.
>>

The latter indeed happens if I ask for confidence level of 0.99:

> wilcox.test(sample,mu=0,conf.int=TRUE,conf.lev=0.99)

        Wilcoxon signed rank test

data:  sample
V = 21, p-value = 0.03125
alternative hypothesis: true location is not equal to 0
96.9 percent confidence interval:
 0.89 1.37
sample estimates:
(pseudo)median
           1.1

Warning message:
In wilcox.test.default(sample, mu = 0, conf.int = TRUE, conf.lev = 0.99) :
  Requested conf.level not achievable

My questions (finally!) are:

1. Why the above warning for conf.lev = 0.99 does not appear for 0.93 < conf.lev < 0.98 although it produces the same summary?

2. For conf.lev = 0.95, is there anything else I can do in order to obtain confidence intervals other than the max. and min. values of my sample or I am limited from my sample's size?

Thanks for your patience in reading this,

Panos

-------------------------------------------------------
Dr Panos Hadjinicolaou
Energy Environment & Water Research Center (EEWRC)
The Cyprus Institute
-------------------------------------------------------
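The help-page remark about small samples can be checked by listing which exact two-sided confidence levels exist at all for n = 6. The following is a minimal sketch, not the wilcox.test source: it assumes the exact interval runs between the k-th smallest and k-th largest Walsh averages (pairwise means), with coverage 1 - 2 * P(V <= k - 1) under the signed-rank null. The names x, w, m and attainable are illustrative only.

```r
# Data from the post (named x here to avoid masking base::sample)
x <- c(1.22, 0.89, 1.14, 0.98, 1.37, 1.06)
n <- length(x)

# Walsh averages: (x[i] + x[j]) / 2 for all i <= j, sorted
w <- outer(x, x, "+") / 2
w <- sort(w[!lower.tri(w)])
m <- length(w)                      # n * (n + 1) / 2 = 21

# Assumption: the interval [w[k], w[m + 1 - k]] has exact two-sided
# coverage 1 - 2 * P(V <= k - 1), with V the signed-rank statistic
k <- 1:3
attainable <- data.frame(
  k     = k,
  lower = w[k],
  upper = w[m + 1 - k],
  level = 1 - 2 * psignrank(k - 1, n)
)
print(attainable)
```

Under these assumptions the widest interval (k = 1) is the sample minimum to the sample maximum with coverage 0.96875, and the next one down (k = 2) has coverage 0.9375. No exact interval sits between those two levels, so a request for 0.95 can only be met by the min-to-max interval; with six observations the discreteness comes from the sample size itself.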
peter dalgaard
2010-Aug-10 12:41 UTC
[R] one (small) sample wilcox.test confidence intervals
On Aug 10, 2010, at 1:30 PM, Panos Hadjinicolaou wrote:

> My questions (finally!) are:
>
> 1. Why the above warning for conf.lev = 0.99 does not appear for 0.93 < conf.lev < 0.98 although it produces the same summary?
>
> 2. For conf.lev = 0.95, is there anything else I can do in order to obtain confidence intervals other than the max. and min. values of my sample or I am limited from my sample's size?
>
> Thanks for your patience in reading this,
>
> Panos

It is as it should be. I think it is instructive to look at explicitly shifted samples, e.g.

> wilcox.test(sample-1.369)

        Wilcoxon signed rank test

data:  sample - 1.369
V = 1, p-value = 0.0625
alternative hypothesis: true location is not equal to 0

> wilcox.test(sample-1.371)

        Wilcoxon signed rank test

data:  sample - 1.371
V = 0, p-value = 0.03125
alternative hypothesis: true location is not equal to 0

Notice how the p-value jumps as the shift crosses 1.37. You can shift the distribution by 1.369999 to the left and have a nonsignificant test that the center is at zero. However, if you shift by more than 1.37, then you do get significance. This is true for all significance levels between 0.03125 and 0.0625 (and 0.03125 == 1/32, the probability that all ranks have the same sign).

The above explains almost everything if you think a little about the definitions. The only slightly puzzling thing is why confidence levels larger than 1-0.03125 are considered achievable. The actual code has

    if (achieved.alpha - alpha > alpha/2) {
        warning("Requested conf.level not achievable")
        conf.level <- 1 - signif(achieved.alpha, 2)
    }

so I have to assume that the author has considered this with some care.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
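The quoted check can be replayed for n = 6 to see where the warning starts. This is a rough sketch under stated assumptions rather than the real wilcox.test code: warns() is a hypothetical helper, and the way it picks qu (the alpha/2 quantile of the signed-rank distribution, raised to 1 when it is 0) and computes achieved.alpha is assumed for the two-sided one-sample exact case; only the final comparison is taken from the snippet quoted above.

```r
n <- 6

# Hypothetical helper mirroring the quoted check; the choice of qu
# and achieved.alpha is an assumption, not copied from the R source
warns <- function(conf.level, n) {
  alpha <- 1 - conf.level
  qu <- max(qsignrank(alpha / 2, n), 1)
  achieved.alpha <- 2 * psignrank(qu - 1, n)
  achieved.alpha - alpha > alpha / 2
}

levels <- c(0.93, 0.95, 0.97, 0.975, 0.98, 0.99)
result <- data.frame(conf.level = levels,
                     warning    = sapply(levels, warns, n = n))
print(result)

# With achieved.alpha fixed at 2 * P(V <= 0) = 1/32, the check fires
# only when alpha < achieved.alpha / 1.5, i.e. conf.level above:
threshold <- 1 - 2 * psignrank(0, n) / 1.5
print(threshold)
```

If these assumptions hold, the rule tolerates a shortfall of up to half of alpha, so requests between 1 - 1/32 = 0.96875 and roughly 0.979 return the min-to-max interval silently, and the warning only appears above that point. That would account for the range in question 1 where the same interval comes back without a warning.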