I have just joined this list (and just started using R), so please excuse any etiquette breaches, as I do not yet have a feel for how the list operates.

I am teaching myself statistics using R as my tool, since my ultimate goals cannot be met by Excel or any of the plug-ins I could afford.

I am currently looking at chapter 12, page 552 of Weiss's Introductory Statistics, 9th edition. Example 12.5 demonstrates using "Technology" to obtain a One-Proportion z-Interval.

x = 202
n = 1010
confidence level = .95

Answer given by Minitab
    0.175331, 0.224669
Answer given by TI-83/84
    .17533, .22467
Answer given by Weiss's Excel Plug-in
    0.175 < p < 0.225

Here is what I got with R:

    prop.test(202, 1010, correct=FALSE)

            1-sample proportions test without continuity correction

    data:  202 out of 1010, null probability 0.5
    X-squared = 363.6, df = 1, p-value < 2.2e-16
    alternative hypothesis: true p is not equal to 0.5
    95 percent confidence interval:
     0.1764885 0.2257849
    sample estimates:
      p
    0.2

I'm also getting slight differences in the answers for exercises and find this disconcerting.

Why are these differences present (or am I doing something wrong)?

Jack

On 17-Jul-11 16:27:25, Jack Sofsky wrote:
> I'm also getting slight differences in the answers for exercises
> and find this disconcerting.
>
> Why are these differences present (or am I doing something wrong)?

You are not doing anything wrong (at any rate where prop.test is concerned).

The point is that none of these procedures uses an exact method; all are based on some form of approximation. That is certainly true of R's prop.test, and undoubtedly also of the others (to which, however, I do not have access).

In the case of R's prop.test (see the help in '?prop.test'):

  "The confidence interval is computed by inverting the score test."

That is to say that (possibly after applying Yates's correction) a Normal-distribution approximation is used for the distribution of the Z score, i.e. deviation/(SD of deviation). I do not know what methods the others use.
No doubt the different answers are the result of using different approximations.

If you want a really exact method, find

a) The highest value of p such that the probability of a result
   greater than or equal to x=202 when n=1010 is at most 0.025 (2.5%)

b) The lowest value of p such that the probability of a result
   less than or equal to x=202 when n=1010 is at most 0.025 (2.5%)

These are then, respectively, the lower and upper 95% confidence limits for p, with equal "non-coverage" probability (2.5%) on either side.

You can search for these "by hand", with results:

a): pbinom(201,1010,0.175739300)
    # [1] 0.97500000   ## = 1 - Prob(x >= 202)

b): pbinom(202,1010,0.226022815)
    # [1] 0.02500000   ## = Prob(x <= 202)

so (to 7 decimal places) an exact 95% CI for p is

  (0.1757393,0.2260228)

This agrees with none of the methods you tried (though all are fairly close together):

  (0.1757393,0.2260228)  ## Above exact
  (0.1753310,0.2246690)  ## Minitab
  (0.1753300,0.2246700)  ## TI-83/84
  (0.1750000,0.2250000)  ## Weiss's Excel Plug-in
  (0.1764885,0.2257849)  ## R's prop.test

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 17-Jul-11                                       Time: 23:24:47
------------------------------ XFMail ------------------------------
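
The "by hand" search for the exact limits can also be left to a root finder. The following is only a minimal sketch and not part of the original reply: it assumes base R, uses uniroot() to solve the two tail-probability conditions a) and b), and cross-checks the result against binom.test(), whose help page describes its interval as the Clopper-Pearson one. The variable names (x, n, alpha, lower, upper, exact) are chosen here purely for illustration.

```r
x <- 202; n <- 1010; alpha <- 0.05

# a) Lower limit: p with Prob(X >= x) = alpha/2,
#    i.e. pbinom(x - 1, n, p) = 1 - alpha/2
lower <- uniroot(function(p) pbinom(x - 1, n, p) - (1 - alpha / 2),
                 interval = c(1e-6, x / n), tol = 1e-12)$root

# b) Upper limit: p with Prob(X <= x) = alpha/2
upper <- uniroot(function(p) pbinom(x, n, p) - alpha / 2,
                 interval = c(x / n, 1 - 1e-6), tol = 1e-12)$root

exact <- c(lower, upper)
print(exact, digits = 7)

# Cross-check against the interval reported by binom.test()
print(binom.test(x, n, conf.level = 1 - alpha)$conf.int)
```

Both printed intervals should agree with the hand-searched limits quoted above to the precision shown there.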

The Minitab and TI results are (modulo different levels of rounding) just what you'd get from doing the problem "by hand" in the good old-fashioned way. :-) The Excel result appears to be the same with an excessive level of rounding.

The "by-hand" procedure uses the plug-in method to get an approximation to the standard error of p.hat: the unknown p in sqrt(p*(1-p)/n) is simply replaced by p.hat.

The prop.test() function uses a more sophisticated approach which involves solving a quadratic equation to determine the endpoints of the confidence interval.

This more sophisticated solution is a pain in the pohutukawa ( :-) ) to calculate by hand, but if you've got a computer to do the nasty arithmetic for you, then why not?

The formula for the confidence interval endpoints using the more sophisticated method can be found e.g. in Jay L. Devore, Probability and Statistics for Engineering and the Sciences, Thomson Brooks/Cole, 6th ed., 2004, page 295, equation (7.10). The much simpler old-fashioned formula is given on the same page as equation (7.11).

Presumably these formulae can be found in the Newcombe reference cited in the help for prop.test(); I haven't checked.

HTH

    cheers,

        Rolf Turner

On 18/07/11 04:27, Jack Sofsky wrote:
> I'm also getting slight differences in the answers for exercises and
> find this disconcerting.
>
> Why are these differences present (or am I doing something wrong)?
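
To make the "quadratic equation" concrete: inverting the score test means collecting every p for which (p.hat - p)^2 <= z^2 * p*(1-p)/n, and the interval endpoints are the two roots of the corresponding quadratic in p. What follows is a sketch under that assumption. The coefficients are derived directly from the inequality rather than copied from Devore, and the names (a, b, cc, score, from.R) are illustrative only. The result is compared with prop.test() called without continuity correction.

```r
x <- 202; n <- 1010; conf <- 0.95
p.hat <- x / n
z <- qnorm((1 + conf) / 2)

# (p.hat - p)^2 = z^2 * p * (1 - p) / n  rearranges to
# (1 + z^2/n) p^2 - (2 p.hat + z^2/n) p + p.hat^2 = 0
a  <- 1 + z^2 / n
b  <- -(2 * p.hat + z^2 / n)
cc <- p.hat^2

# The two roots of the quadratic are the interval endpoints
disc  <- sqrt(b^2 - 4 * a * cc)
score <- c((-b - disc) / (2 * a), (-b + disc) / (2 * a))

# Interval reported by prop.test() without Yates's correction
from.R <- as.numeric(prop.test(x, n, correct = FALSE,
                               conf.level = conf)$conf.int)

print(rbind(quadratic = score, prop.test = from.R), digits = 7)
```

The two rows should coincide, which is a quick way to confirm that the difference from the textbook answer comes from the formula and not from any numerical inaccuracy.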

Others have explained why R gives a different answer based on a different approximation, but if you want to get the same answer as the book/Minitab/... for your own understanding (or so the grader doesn't get confused by superior answers, or for other reasons), here is one way to do it:

    # 202 successes and 808 failures coded as 1s and 0s
    x <- c( rep(1,202), rep(0, 1010-202) )
    p <- 202/1010
    # plug-in standard deviation of a single 0/1 observation
    sd <- sqrt( p*(1-p) )

    library(TeachingDemos)
    z.test( x, 0.5, sd )

            One Sample z-test

    data:  x
    z = -23.8354, n = 1010.000, Std. Dev. = 0.400, Std. Dev. of the sample
    mean = 0.013, p-value < 2.2e-16
    alternative hypothesis: true mean is not equal to 0.5
    95 percent confidence interval:
     0.1753312 0.2246688
    sample estimates:
    mean of x
          0.2

This matches the Minitab and TI-83/84 results you reported.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jack Sofsky
> Sent: Sunday, July 17, 2011 10:27 AM
> To: r-help at r-project.org
> Subject: [R] ?Accuracy of prop.test
>
> I'm also getting slight differences in the answers for exercises and
> find this disconcerting.
>
> Why are these differences present (or am I doing something wrong)?
>
> Jack
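
The textbook interval can also be reproduced without any add-on package, since it is just p.hat +/- z * sqrt(p.hat*(1 - p.hat)/n). A minimal base-R sketch follows; note that wald_ci is a hypothetical helper name introduced here for illustration, not a function from R or from TeachingDemos.

```r
# Hypothetical helper: plug-in ("Wald") interval for one proportion
wald_ci <- function(x, n, conf.level = 0.95) {
  p.hat <- x / n
  z  <- qnorm((1 + conf.level) / 2)      # two-sided normal quantile
  se <- sqrt(p.hat * (1 - p.hat) / n)    # plug-in standard error
  c(lower = p.hat - z * se, upper = p.hat + z * se)
}

ci <- wald_ci(202, 1010)
print(round(ci, 7))
```

For x = 202 and n = 1010 this should give the same limits as the z.test() call, and changing conf.level gives intervals at other confidence levels.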