I need a two-sample t-test between M and F. The data are arranged in one column, x, and I can't figure out how to run a two-sample t.test on them. I'm not really sure what the output of the call below is giving me: there should be no difference between M and F in this example, yet the p-value in the summary indicates one.

How can I run a two-sample t.test with the data in one column?

x=rep(c(1,2,3,4),2)
y=rep(c("M","M","M","M","F","F","F","F"))
data<-cbind(x,y)
t.test(x,by=list(y))

Thank you ahead of time.
keith

--
M. Keith Cox, Ph.D.
Alaska NOAA Fisheries, National Marine Fisheries Service
Auke Bay Laboratories
17109 Pt. Lena Loop Rd.
Juneau, AK 99801
Keith.Cox@noaa.gov
marlinkcox@gmail.com
U.S. (907) 789-6603
Hello,

Marlin Keith Cox wrote:
> I need a two sample t.test between M and F. The data are arranged in one
> column, x. Cant seem to figure how to run a two sample t.test. Not really
> sure what this output is giving me, but there should be no difference
> between M and F in the example, but summary p-value indicates this.
>
> How can I run a two sample t.test with data in one column.
>
> x=rep(c(1,2,3,4),2)
> y=rep(c("M","M","M","M","F","F","F","F"))
> data<-cbind(x,y)
> t.test(x,by=list(y))

Several issues:

First, your use of cbind makes 'data' a matrix of type character, so the x stored in it is no longer numeric. You most likely want a data.frame (which can contain multiple types) instead of a matrix (which holds one type of data). "data" is also the name of a function and of an argument, so let's call the object something else. Replace line 3 with

  df <- data.frame(x, y)

Second, I don't see a "by" argument documented anywhere in ?t.test. I do see the "formula" argument, documented as:

  formula: a formula of the form 'lhs ~ rhs' where 'lhs' is a numeric
           variable giving the data values and 'rhs' a factor with two
           levels giving the corresponding groups.

So try

  t.test(x ~ y, data = df)
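The two points in this reply can be checked directly. The following is a minimal sketch using the poster's vectors; the object names m, dat, one and two are chosen here for illustration and do not appear in the thread. It also shows what most likely happened with the original call: since "by" is not a formal argument of t.test, it appears to be absorbed by "..." and ignored, so a one-sample test of x against mu = 0 is run instead of a comparison between groups.

```r
x <- rep(c(1, 2, 3, 4), 2)
y <- c("M", "M", "M", "M", "F", "F", "F", "F")

# cbind() on a numeric and a character vector coerces everything to character
m <- cbind(x, y)
is.character(m)

# data.frame() keeps each column's own type; the grouping column is made a
# factor explicitly here so the sketch behaves the same across R versions
dat <- data.frame(x = x, y = factor(y))
is.numeric(dat$x)

# The original call: 'by' is assumed to fall into '...' and be ignored,
# which leaves a one-sample test on x
one <- t.test(x, by = list(y))
one$method

# The formula interface compares the two groups defined by y
two <- t.test(x ~ y, data = dat)
two$method
two$p.value
```

With this grouping both groups contain the values 1 to 4, so the two-sample test reports no difference, which is what the poster expected.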
Marlin -
   Consider the following:

> df = data.frame(x=rep(1:4,2),y=rep(c('M','F'),c(2,2)))
> t.test(x~y,data=df)

	Welch Two Sample t-test

data:  x by y
t = 4.899, df = 6, p-value = 0.002714
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.001052 2.998948
sample estimates:
mean in group F mean in group M
            3.5             1.5

If you're uncomfortable with the formula notation, try

with(df,t.test(x[y=='M'],x[y=='F']))

					- Phil Spector
					 Statistical Computing Facility
					 Department of Statistics
					 UC Berkeley
					 spector at stat.berkeley.edu

On Thu, 1 Apr 2010, Marlin Keith Cox wrote:

> I need a two sample t.test between M and F. The data are arranged in one
> column, x. Cant seem to figure how to run a two sample t.test. Not really
> sure what this output is giving me, but there should be no difference
> between M and F in the example, but summary p-value indicates this.
>
> How can I run a two sample t.test with data in one column.
>
> x=rep(c(1,2,3,4),2)
> y=rep(c("M","M","M","M","F","F","F","F"))
> data<-cbind(x,y)
> t.test(x,by=list(y))
> Thank you ahead of time.
> keith
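One detail worth noting about the example in this reply: rep(c('M','F'),c(2,2)) produces M M F F, which data.frame() recycles to M M F F M M F F, so the groups differ from the four-M-then-four-F layout in the question. That is why the output above shows a difference in means. The sketch below compares the two layouts and confirms that the formula call and the subsetting call agree; the names y_reply, y_orig, d_reply, d_orig, by_formula and by_subset are introduced here for illustration only.

```r
x <- rep(1:4, 2)

# Grouping as in the reply, with the recycling to length 8 written out
y_reply <- rep(rep(c("M", "F"), c(2, 2)), length.out = length(x))
d_reply <- data.frame(x = x, y = factor(y_reply))
means_reply <- tapply(d_reply$x, d_reply$y, mean)   # F = 3.5, M = 1.5

# Grouping as in the original question: four M followed by four F
y_orig <- rep(c("M", "F"), c(4, 4))
d_orig <- data.frame(x = x, y = factor(y_orig))
means_orig <- tapply(d_orig$x, d_orig$y, mean)      # both groups 2.5

# Formula interface and explicit subsetting on the question's data
by_formula <- t.test(x ~ y, data = d_orig)
by_subset  <- with(d_orig, t.test(x[y == "M"], x[y == "F"]))

# The two-sided p-values agree; the estimated difference has opposite sign,
# because the formula version orders groups by factor level (F before M)
c(formula = by_formula$p.value, subset = by_subset$p.value)
```

For the question's layout the group means are identical, so both calls report no difference between M and F.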