I have two pairs of related vectors, x1, y1 and x2, y2.

I wish to do a test for differences in means of x1 and y1, ditto x2 and y2.

I am getting odd results. I am not sure I am using 'pt' properly...

I have not included the raw vectors as they are long. I am interested if I am using R properly...

> c(length(x1), length(y1), length(x2), length(y2))
[1] 3436 1619 2677 2378

First, where the T-stat and the DF do not give the same result as 't.test' when passed into 'pt':

> t.1 <- t.test(x1, y1)
> 2 * pt(t.1$statistic, t.1$parameter)
       t
1.353946
> t.1$p.value
[1] 0.646054

I would have thought these would have been the same. Like below...

> t.2 <- t.test(x2, y2)
> 2 * pt(t.2$statistic, t.2$parameter)
        t
0.8679732
> t.2$p.value
[1] 0.8679732

This is what I expect.

Clearly I misunderstand something. What is it?

cheers
Worik
If it were not for the fact that I get inconsistent results, I would be sure that I need...

2*pt(stat, df)

Section 8.1 of R-intro.pdf is explicit. Problem is, it gives inconsistent results.

Worik

On Thu, Jun 17, 2010 at 10:30 AM, Worik R <worikr@gmail.com> wrote:
> [...]
More: when the t-stat is > 0, should I use 'pt' differently?

I have been checking my results and (except for the example I posted) all the inconsistencies occur when t > 0.

Worik

On Thu, Jun 17, 2010 at 10:30 AM, Worik R <worikr@gmail.com> wrote:
> [...]
On Wed, Jun 16, 2010 at 3:30 PM, Worik R <worikr at gmail.com> wrote:
> [...]
> First where the T-stat and the DF do not give the same result as 't.test'
> when passed into 'pt'
>
>> t.1 <- t.test(x1, y1)
>> 2 * pt(t.1$statistic, t.1$parameter)
>        t
> 1.353946

Sorry, I realized that it is fairly easy to test that it is an issue with which tail of the distribution you use. This should show what is going on better than my prior message.

  1.353946 / 2 = 0.676973
  1 - 0.676973 = 0.323027
  0.323027 * 2 = 0.646054

In pt(), the default is lower.tail=TRUE. Switching it to FALSE just looks at the other tail of the distribution. It was difficult to see because of the multiplication by 2.

Josh

>> t.1$p.value
> [1] 0.646054
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student
Health Psychology
University of California, Los Angeles
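The arithmetic above can be replayed in R. This is only a sketch: the first number is the value printed in the original post, while t_demo and df_demo are made-up values used to show that the two tails of pt() always sum to 1.

```r
# Value printed by 2 * pt(t.1$statistic, t.1$parameter) in the original post
doubled_lower <- 1.353946

lower <- doubled_lower / 2   # pt(t, df): area to the left of t (lower.tail = TRUE)
upper <- 1 - lower           # area to the right of t (lower.tail = FALSE)
p_two_sided <- 2 * upper     # matches the reported t.1$p.value, 0.646054

# The complement relation holds for any statistic.
# t_demo and df_demo are hypothetical values, not taken from the thread.
t_demo <- 0.5
df_demo <- 30
tail_sum <- pt(t_demo, df_demo) + pt(t_demo, df_demo, lower.tail = FALSE)
stopifnot(isTRUE(all.equal(tail_sum, 1)))
```

Because the data behind t.1 were not posted, the sketch works from the printed numbers rather than from a fitted test object.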
Hi Worik,

You can try

2*pt(abs(t.1$statistic), t.1$parameter, lower.tail=FALSE)

if the test is two-sided.

Cheers,
Oscar

Oscar M. Rueda, PhD
Postdoc, Breast Cancer Functional Genomics
Cancer Research UK Cambridge Research Institute
Li Ka Shing Centre
Robinson Way
Cambridge CB2 0RE
England

On 16/6/10 23:30, "Worik R" <worikr at gmail.com> wrote:
> [...]
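Oscar's suggestion can be checked against t.test() on simulated data, since the original vectors were not posted. In this sketch the helper name p_from_htest and the vectors x and y are assumptions made up for illustration; only pt(), abs(), and the statistic/parameter/p.value components of the t.test() result come from the thread.

```r
# Hypothetical helper: two-sided p-value recomputed from a t.test() result.
# unname() drops the "t"/"df" names so the value compares cleanly with p.value.
p_from_htest <- function(tt) {
  unname(2 * pt(abs(tt$statistic), tt$parameter, lower.tail = FALSE))
}

set.seed(1)                    # simulated data, for illustration only
x <- rnorm(200, mean = 0.1)
y <- rnorm(150)

t_xy <- t.test(x, y)           # statistic has one sign ...
t_yx <- t.test(y, x)           # ... swapping the samples flips it

# The abs() form agrees with t.test() whichever sign the statistic has.
stopifnot(isTRUE(all.equal(p_from_htest(t_xy), t_xy$p.value)))
stopifnot(isTRUE(all.equal(p_from_htest(t_yx), t_yx$p.value)))
```

Taking abs() first means the same expression works for both of Worik's cases, so no sign check is needed.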
On 16-Jun-10 22:30:39, Worik R wrote:
> [...]

The P-value is the tail-area (or the sum of the two tail-areas for a two-sided test). The value of pt() is the total probability to the left of its first argument, that is, everything below the upper tail.

Taking your results above:

[1]:
  t.1 <- t.test(x1, y1)
  2 * pt(t.1$statistic, t.1$parameter)
  #        t
  # 1.353946
  t.1$p.value
  # [1] 0.646054

The "t.1$p.value" result will (by default) be the two-tailed test, so one tail will have probability equal to half the P-value, while the value of pt() will be Prob(T <= t.1$statistic). Hence the former will be 2*(1 - the latter) **provided the t-statistic is positive**; otherwise, if the t-statistic is negative, the former is twice the latter.

Check:
  2*(1 - 1.353946/2)
  # [1] 0.646054
  2*(1 - 0.646054/2)
  # [1] 1.353946

So this indicates that the t-value (which you did not quote) was positive.

[2]:
  t.2 <- t.test(x2, y2)
  2 * pt(t.2$statistic, t.2$parameter)
  #         t
  # 0.8679732
  t.2$p.value
  # [1] 0.8679732

  2*(1 - 0.8679732/2)
  # [1] 1.132027
(so no agreement), but:
  2*(0.8679732/2)
  # [1] 0.8679732
so here the t-value was negative.
And that is the difference between the two cases.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 17-Jun-10                                       Time: 13:43:37
------------------------------ XFMail ------------------------------
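The sign rule Ted describes can be written out as a small check. This is a sketch under assumptions: naive_p and two_sided_p are hypothetical helper names, and t_neg, t_pos, and df_demo are made-up values, not the statistics from the original data.

```r
# Hypothetical helpers, named here only for illustration.
naive_p <- function(t, df) 2 * pt(t, df)            # what the original post computed
two_sided_p <- function(t, df) 2 * pt(-abs(t), df)  # area in both tails

df_demo <- 100   # made-up degrees of freedom
t_neg <- -0.8    # made-up negative statistic
t_pos <- 0.8     # made-up positive statistic

# Negative statistic: pt() already returns the relevant tail, so the two agree.
stopifnot(isTRUE(all.equal(naive_p(t_neg, df_demo), two_sided_p(t_neg, df_demo))))

# Positive statistic: the naive value is 2 minus the two-sided p-value.
stopifnot(isTRUE(all.equal(naive_p(t_pos, df_demo), 2 - two_sided_p(t_pos, df_demo))))

# The same relation holds for the numbers reported for t.1 in the thread.
stopifnot(isTRUE(all.equal(2 - 0.646054, 1.353946)))
```

A doubled pt() value above 1 is therefore a sign that the statistic was positive and the lower tail was the wrong one to double.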