automorphism at gmail.com
2007-Jun-29 20:55 UTC
[Rd] Shapiro Test P Value Incorrect? (PR#9768)
Full_Name: Jason Polak
Version: R version 2.5.0 (2007-04-23)
OS: Xubuntu 7.04
Submission from: (NULL) (137.122.144.35)

Dear R group,

I have noticed a strange anomaly with the shapiro.test() function. Unfortunately
I do not know how to calculate the shapiro test P values manually so I don't
know if this is an actual bug.

So, to produce the results, run the following code:

    pvalues = 0;
    for (i in 1:17)
    {
        j = 1:(i+3);
        pvalues[i]=shapiro.test(j)$p;
    }

    plot(pvalues);
    print(pvalues);

Now I just made the graph to illustrate that the p-values are strictly
decreasing. To me this makes intuitive sense: we are using the Shapiro test to
test normality of (1,2,3,4),(1,2,3,4,5), and so on. So the p-value should
decrease.

These are the p-values:

     [1] 0.9718776 0.9671740 0.9605557 0.9492892 0.9331653 0.9135602 0.8923668
     [8] 0.8698419 0.8757315 0.8371814 0.7964400 0.7545289 0.7123167 0.6704457
    [15] 0.6294307 0.5896464 0.5513749

However, there is an increase in p-value when you go from (1,..,11) to
(1,..,12). Is this just a quirk of the Shapiro test, or is there an error in the
calculation algorithm?
This is not a bug. The algorithm uses a different approximation of the p-value
for n=3 (exact value), 4<=n<=11 and n>=12, as seen in
src/library/stats/src/swilk.c below line 202:

    /* Calculate significance level for W */

The increase in p-value from n=11 to n=12 falls exactly where the approximation
changes. The W statistic itself decreases monotonically in the presented
example.

Petr.
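
One way to check this is to tabulate the W statistic next to the p-value for the same samples. The following is a minimal sketch that assumes only base R's shapiro.test(); the names n, tests and res are illustrative and do not come from the original report.

```r
# Sample sizes n = 4, ..., 20, matching 1:(i + 3) for i in 1:17 in the report.
n <- 4:20

# Run the test once per sample 1, 2, ..., k and keep the full result objects.
tests <- lapply(n, function(k) shapiro.test(seq_len(k)))

# Collect the W statistic and the p-value side by side.
res <- data.frame(
  n       = n,
  W       = sapply(tests, function(st) unname(st$statistic)),
  p.value = sapply(tests, function(st) st$p.value)
)
print(res)

# Sample sizes at which each quantity is larger than it was at n - 1.
w_increases <- res$n[-1][diff(res$W) > 0]
p_increases <- res$n[-1][diff(res$p.value) > 0]
print(w_increases)
print(p_increases)
```

If W keeps falling while the p-value rises only at n=12, the jump comes from the change of approximation and not from the statistic.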