Dear Thomas,

it is, unfortunately, not that simple. t.test() returns an object of class "htest" and not all such objects have standard errors. I'm not entirely sure what the point is since it's easy to compute the standard error of the difference from the information in the object (adapting an example from ?t.test):

> (res <- t.test(1:10, y = c(7:20)))

	Welch Two Sample t-test

data:  1:10 and c(7:20)
t = -5.4349, df = 21.982, p-value = 1.855e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.052802  -4.947198
sample estimates:
mean of x mean of y
      5.5      13.5

> as.vector(abs(diff(res$estimate)/res$statistic)) # SE
[1] 1.47196
> class(res)
[1] "htest"

and if you really want to print the SE as a matter of course, you could always write your own wrapper for t.test() that returns an object of class, say, "t.test" for which you can provide a print() method. Much of the advantage of working in a statistical computing environment like R (or Stata, for that matter) is that you can make things work the way you like.

Best,
 John

-------------------------------------------------
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox

> On Feb 21, 2019, at 3:57 PM, Thomas J. Leeper <thosjleeper at gmail.com> wrote:
>
> A recent thread on Twitter [1] by a Stata user highlighted that t.test()
> does not return or print the standard error of the mean difference, despite
> it being calculated by the function.
>
> I know this isn't the kind of change that's likely to be made but could we
> at least return the SE even if the print() method isn't updated? Or,
> better, update the print() method to display this as well?
>
> Best,
> Thomas
>
> [1]
> https://twitter.com/amandayagan/status/1098314654470819840?s=21
> --
>
> Thomas J. Leeper
> http://www.thomasleeper.com
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
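
The ratio used above assumes a two-sample test with a null difference of 0. As a rough sketch of a more general route, the SE can also be backed out of the confidence interval of a two-sided t.test() result, because the interval's half-width is the t quantile times the SE. The helper name se_from_ci is hypothetical (it is not part of R), and the sketch assumes alternative = "two.sided":

```r
# Hypothetical helper (not part of R): recover the standard error from a
# two-sided t.test() result via the half-width of its confidence interval.
se_from_ci <- function(ht) {
  stopifnot(inherits(ht, "htest"), ht$alternative == "two.sided")
  level <- attr(ht$conf.int, "conf.level")            # e.g. 0.95
  crit  <- qt(1 - (1 - level) / 2, df = ht$parameter) # t quantile used for the CI
  as.vector(diff(ht$conf.int) / (2 * crit))           # half-width / quantile
}

res2 <- t.test(1:10, y = c(7:20))    # two-sample (Welch) test
res1 <- t.test(sleep$extra, mu = 1)  # one-sample test with a non-zero null

se_from_ci(res2)  # SE of the difference in means
se_from_ci(res1)  # SE of the single mean
```

Unlike the diff(estimate)/statistic ratio, this also covers one-sample and paired tests and a non-zero mu, at the cost of not working for one-sided alternatives (where one CI limit is infinite).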
Hello,

Something like this?

t.test2 <- function(...) {
  ht <- t.test(...)
  class(ht) <- c("htest_tjl", class(ht))
  ht
}

print.htest_tjl <- function(x, ...) {
  NextMethod()  # dispatch to print.htest() for the standard output
  se <- as.vector(abs(diff(x$estimate)/x$statistic))
  cat("Standard error of the difference:", se, "\n\n")
  invisible(x)
}

t.test2(1:10, y = c(7:20))
t.test2(extra ~ group, data = sleep) # last example from ?t.test

(The suffix tjl comes from the OP's initials.)

Hope this helps,

Rui Barradas

At 21:51 on 21/02/2019, Fox, John wrote:
> [...]
Hi John,

Thanks for your reply. Of course I could write a package and of course I would find that trivial to do. The point is this is a main entry point to R for probably (at this point) hundreds of thousands of students. I'd like them to be able to get a basic quantity of interest from a t-test without four subsequent function calls.

I also don't really see the point about the object class, given we're talking S3. print() doesn't have to print everything in the object (see e.g., print.lm() ), so there should be little harm in returning additional information when relevant. Leaving the print() method unchanged and simply returning the SE as an additional element should affect almost nothing.

I'm all for continuity and conservative development, but we also should aim to make R as useful and usable as possible. This seems like a nice simple way to do that.

Best,
Thomas

On Thu, 21 Feb 2019 at 21:51 Fox, John <jfox at mcmaster.ca> wrote:
> [...]

--
Thomas J. Leeper
http://www.thomasleeper.com
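
To illustrate the point about print() with a minimal sketch: an extra list element attached to an "htest" object leaves the printed output unchanged, on the assumption that print.htest() only prints the components it knows about. The component name se_diff is made up for this example and is not something t.test() returns.

```r
res <- t.test(1:10, y = c(7:20))

# Attach an extra component under a made-up name (illustration only).
res$se_diff <- as.vector(abs(diff(res$estimate) / res$statistic))

# Capture what print() shows and check whether the new component is mentioned.
printed <- capture.output(print(res))
mentions_extra <- any(grepl("se_diff", printed, fixed = TRUE))

mentions_extra  # whether the printed output refers to the extra component
names(res)      # the component is nevertheless stored in the object
```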
>>>>> Thomas J Leeper
>>>>>     on Thu, 21 Feb 2019 22:21:21 +0000 writes:

    > [...]

I agree with both John Fox and Thomas Leeper (well, with a subset of their union ;-)

John made the point [and Rui nicely showed how to implement it] that *printing* such a standard error in addition to the other stats needs potentially more changes via a specialized print() method.

Also, R in general has refused for good reasons to behave like old batch stats software and print all possibly interesting output. .. and I'd really like us to stay in that tradition and hence *not* print such extra numbers.

Also, if your students learn slightly more about stats, they are hopefully taught that the t.test is a simple special case of linear (gaussian) regression, and you can teach them the corresponding summary(lm( .. )) which gives identical t-stat and p-value and does print SEs.

OTOH, I agree with Thomas (and IIRC earlier correspondents on this issue) that it does seem natural to return the SE here, as it is crucially used in the formula for the t-stat anyway, and people can use it to easily compute confidence intervals (or p-values if really desired) for other levels than 95% / 5% ..

So adding another component to the list returned by t.test() seems fine to me, and hopefully saves us future e-mails on the topic [... well almost surely there will be those asking us to change the print() method too, but we'll survive that.]

Martin
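
A small sketch of the two points in Martin's message, reusing the data from John's example. One assumption to flag: the match between t.test() and lm() holds for the pooled-variance test (var.equal = TRUE), not for the default Welch test. The data frame d and its column names are invented for the illustration.

```r
x <- 1:10
y <- 7:20
d <- data.frame(
  value = c(x, y),
  group = factor(rep(c("x", "y"), c(length(x), length(y))))
)

# Pooled-variance t-test and the equivalent linear model
tt  <- t.test(value ~ group, data = d, var.equal = TRUE)
fit <- summary(lm(value ~ group, data = d))
coef_row <- fit$coefficients["groupy", ]  # Estimate, Std. Error, t value, Pr(>|t|)

# |t| agrees between the two; lm() additionally reports the SE
c(t.test = abs(unname(tt$statistic)), lm = abs(unname(coef_row["t value"])))

# With the SE in hand, a confidence interval at any level is one line.
# Note that lm() estimates mean(y) - mean(x), the negative of t.test()'s difference.
se         <- unname(coef_row["Std. Error"])
diff_means <- unname(coef_row["Estimate"])
level      <- 0.90
ci90 <- diff_means + c(-1, 1) * qt(1 - (1 - level) / 2, df = fit$df[2]) * se
ci90
```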
It's not a problem per se to put additional information into class htest objects (hey, it's S3 after all...) and there is a precedent in chisq.test which returns $observed and $expected.

Getting such information printed by print.htest is more tricky, although it might be possible to (ab)use the $estimate slot.

The further question is whether one would really want to do that (change the output and/or modify the current return values), at the risk of affecting a rather large bundle of existing scripts, books, lecture notes, etc. I don't think that I would want to do that for the case of the s.e.d., although I'll admit that there is another thing that has always been a bit of an eyesore to me: We give a confidence interval but not the corresponding point estimate (i.e. the _difference_ of the means).

It might be better to simply start over and write a new function. In the process one might address other things that people have been asking for, like calculations based on the sample means and SDs (which would be useful for dealing with published summaries and textbook examples). Oh, and a formula interface for the one-sample test.

-pd

> On 21 Feb 2019, at 22:51 , Fox, John <jfox at mcmaster.ca> wrote:
>
> [...]

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
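
As a rough sketch of what a calculation "based on the sample means and SDs" might look like, the function below runs a Welch two-sample t-test from summary statistics alone and also returns the difference of the means and its standard error. The function t_test_summary and its argument names are hypothetical; nothing like it is claimed to exist in base R, and it is not the new function Peter has in mind.

```r
# Hypothetical function (not in base R): Welch two-sample t-test computed
# from summary statistics (means, standard deviations, sample sizes).
t_test_summary <- function(m1, s1, n1, m2, s2, n2, conf.level = 0.95) {
  v1 <- s1^2 / n1                       # squared SE of the first mean
  v2 <- s2^2 / n2                       # squared SE of the second mean
  se <- sqrt(v1 + v2)                   # SE of the difference
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))  # Welch-Satterthwaite
  est   <- m1 - m2                      # point estimate: difference of the means
  tstat <- est / se
  crit  <- qt(1 - (1 - conf.level) / 2, df)
  list(
    estimate  = est,
    stderr    = se,
    statistic = tstat,
    parameter = df,
    p.value   = 2 * pt(-abs(tstat), df),
    conf.int  = est + c(-1, 1) * crit * se
  )
}

# Usage: feed it the summaries of the data from the ?t.test example
x <- 1:10
y <- 7:20
out <- t_test_summary(mean(x), sd(x), length(x), mean(y), sd(y), length(y))
ref <- t.test(x, y)  # same test computed from the raw data, for comparison
out$estimate
out$stderr
```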
>>>>> peter dalgaard
>>>>>     on Fri, 22 Feb 2019 12:38:14 +0100 writes:

    > It's not a problem per se to put additional information
    > into class htest objects (hey, it's S3 after all...) and
    > there is a precedent in chisq.test which returns $observed
    > and $expected.

It seems the consensus is to simply return the SE but *not* change the print() method, and also be careful not to mess with existing parts of the result.

So, a minimal patch is to add the short line

    stderr = stderr,

inside the list(..) constructing the return value... and that's what I'm planning to commit (to the sources).

With thanks for the suggestion and considerations to Thomas, John and Peter!

Martin

    > [...]
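
For readers following up on this, a minimal sketch of how the new component might be used. It assumes an R version in which the patch described above has landed (to my knowledge, R 3.6.0 and later) and falls back to the ratio from John's message otherwise:

```r
res <- t.test(1:10, y = c(7:20))

# Use the returned component when present; otherwise derive the SE from
# the estimates and the t statistic as earlier in the thread.
se <- if (!is.null(res$stderr)) {
  res$stderr
} else {
  as.vector(abs(diff(res$estimate) / res$statistic))
}
se
```

The printed output of t.test() is the same either way; only the returned list gains an element.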