Sometimes the p.value returned by t.test() is the same as the one I calculate using pt(), and sometimes it's not. I don't understand the difference. I'm sure there is a simple explanation, but I haven't been able to find it, even after looking at the code for t.test.default. I apologize if this is a basic and obvious question. For example:

> data(sleep)
> t.test(extra~group,data=sleep,var.equal=T)

# the p.value returned is 0.07939

> 2*pt(-1.8608,18) # using the t.statistic and the df returned above
[1] 0.0791887

These p.values are the same. However, they are different when I use a different dataset:

> data(beavers)
> b1 <- beaver1$temp
> b2 <- beaver2$temp
> t.test(b1,b2,var.equal=T)

# the p.value returned is 2.2e-16

> 2*pt(-15.9366,212) # using the t.statistic and the df returned above
[1] 4.10686e-38

Jonathan B. Thayn, Ph.D.
Illinois State University
Department of Geography and Geology
200A Felmley Hall
Normal, Illinois 61790

(309) 438-8112
jthayn@ilstu.edu
my.ilstu.edu/~jthayn
On Tue, Nov 1, 2011 at 12:40 PM, Jonathan Thayn <jthayn at ilstu.edu> wrote:
> These p.values are the same. However, they are different when I use a different dataset:
>
> > data(beavers)
> > b1 <- beaver1$temp
> > b2 <- beaver2$temp
> > t.test(b1,b2,var.equal=T)
>
> # the p.value returned is 2.2e-16
>
> > 2*pt(-15.9366,212) # using the t.statistic and the df returned above
> [1] 4.10686e-38

If you read the output of t.test carefully, you will find something like

p-value < 2.2e-16

not

p-value = 2.2e-16

so the results are not inconsistent. I'm not sure why t.test is coded that way; perhaps the p-value calculation is not very reliable below roughly 2e-16.

This issue could also come up if a function doesn't use the lower or upper tail of the distribution function as needed and must then subtract the calculated result from 1 to obtain the returned value. Here's an example:

> x = rnorm(100)
> y = x<0
> t.test(x~y)

        Welch Two Sample t-test

data:  x by y
t = 12.9463, df = 97.424, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.397253 1.903200
sample estimates:
mean in group FALSE  mean in group TRUE
          0.7596083          -0.8906181

Now do a naive pt:

> pt(12.9463, df = 97.424)
[1] 1

My desired p-value is 1-pt(12.9463, df = 97.424), but that's zero. Of course, I can get the p-value in a more intelligent way:

> pt(12.9463, df = 97.424, lower.tail = FALSE)
[1] 3.394337e-23

Peter
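
For anyone who wants to try the tail point on the beaver numbers from the original question, here is a minimal sketch. The variable names (t_stat, df, p_lower, p_upper, p_naive) are made up for illustration, and the statistic and degrees of freedom are simply copied from the post rather than recomputed.

```r
# t statistic and degrees of freedom as quoted in the original question
t_stat <- -15.9366
df <- 212

# Two-sided p-value taken straight from the lower tail: no subtraction from 1
p_lower <- 2 * pt(-abs(t_stat), df)

# The same quantity from the upper tail, requested explicitly
p_upper <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# Naive version: pt() is so close to 1 here that it is stored as exactly 1,
# so the subtraction cancels to 0
p_naive <- 2 * (1 - pt(abs(t_stat), df))

print(c(lower = p_lower, upper = p_upper, naive = p_naive))
```

The two tail-based values should agree with each other, while the naive one loses the answer entirely to cancellation.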
The print method is the issue:

> t.out <- t.test(b1,b2,var.equal=T)
> t.out$p.value
[1] 4.108001e-38
> t.out$statistic
        t
-15.93656

albyn

On Tue, Nov 01, 2011 at 02:40:15PM -0500, Jonathan Thayn wrote:
> Sometimes the p.value returned by t.test() is the same that I calculate using pt() and sometimes it's not.

--
Albyn Jones
Reed College
jones at reed.edu
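
A short sketch of where the "< 2.2e-16" most likely comes from, assuming the printed p-value is passed through format.pval() (whose default eps is .Machine$double.eps, about 2.2e-16). The names shown_default, shown_small_eps and p_manual are only illustrative.

```r
# Beaver body temperatures from the built-in 'datasets' package
b1 <- beaver1$temp
b2 <- beaver2$temp

t.out <- t.test(b1, b2, var.equal = TRUE)

# The stored p-value is a plain double and is not truncated
print(t.out$p.value)

# format.pval() reports anything below 'eps' as "< eps"
shown_default <- format.pval(t.out$p.value)
# With a much smaller eps the actual value is formatted instead
shown_small_eps <- format.pval(t.out$p.value, eps = 1e-50)
print(c(shown_default, shown_small_eps))

# Recompute the two-sided p-value from the stored statistic and df
p_manual <- 2 * pt(-abs(unname(t.out$statistic)), unname(t.out$parameter))
print(all.equal(p_manual, t.out$p.value))
```

So the number to compare against pt() is t.out$p.value, not the figure in the printed summary.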