Marino Taussig De Bodonia, Agnese
2010-Aug-24 14:12 UTC
[R] chisq.test on samples of different lengths
Hello,

I am trying to see whether there has been a significant difference in whether people experienced damages from wildlife in two different years. I therefore have two columns:

year 1:
yes
no
no
no
yes
yes
no

year 2:
no
yes
no
yes

I wanted to do a chisq.test, but if I enter it this way:

  chisq.test(year1, year2)

I get the error saying the columns are two different lengths. So then I tried doing:

  damages<-matrix(c(3,4, 2,2), ncol=2, dimnames=list(answer=c("yes", "no"), year=c("year1", year2)))
  chisq.test(damages)

Does that make sense? Should I maybe be doing a different test instead?

Any help would be appreciated, thank you.

Agnese
On Aug 24, 2010, at 10:12 AM, Marino Taussig De Bodonia, Agnese wrote:

> Hello,
>
> I am trying to see whether there has been a significant difference
> in whether people experienced damages from wildlife in two different
> years. I therefore have two columns:
>
> year 1:
> yes
> no
> no
> no
> yes
> yes
> no
>
> year 2:
> no
> yes
> no
> yes
>
> I wanted to do a chisq.test, but if I enter it this way:
>
> chisq.test(year1, year2)
>
> I get the error saying the columns are two different lengths. So
> then I tried doing:
>
> damages<-matrix(c(3,4, 2,2), ncol=2, dimnames=list(answer=c("yes",
> "no"), year=c("year1", year2)))
> chisq.test(damages)

Which should throw an error because year2 is not quoted. Consider using prop.test:

?prop.test

So your matrix is the transpose of what is needed for prop.test, at least as I read the docs:

> damages<-matrix(c(3,4, 2,2), ncol=2, byrow=TRUE,
    dimnames=list(year=c("year1", "year2"), success=c("yes", "no")))
> damages
       success
year    yes no
  year1   3  4
  year2   2  2
> prop.test(damages)

        2-sample test for equality of proportions with continuity correction

data:  damages
X-squared = 0, df = 1, p-value = 1
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.7548099  0.6119528
sample estimates:
   prop 1    prop 2
0.4285714 0.5000000

Warning message:
In prop.test(damages) : Chi-squared approximation may be incorrect

> Does that make sense? Should I maybe be doing a different test
> instead?
>
> Any help would be appreciated, thank you.
>
> Agnese
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
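For reference, prop.test also accepts the counts directly, as a vector of successes and a vector of trials, which sidesteps the question of matrix orientation. A minimal sketch, assuming the counts from the original question (3 "yes" out of 7 in year 1, 2 "yes" out of 4 in year 2); the names yes, n and res are made up for illustration:

```r
# Counts taken from the original question: "yes" answers and sample sizes per year.
yes <- c(year1 = 3, year2 = 2)
n   <- c(year1 = 7, year2 = 4)

# Same two-sample test of proportions, specified as successes and trials.
# suppressWarnings() hides the small-sample warning about the chi-squared
# approximation; the warning itself is still worth taking seriously.
res <- suppressWarnings(prop.test(x = yes, n = n))

res$estimate   # estimated proportion of "yes" in each year
res$p.value    # p-value of the continuity-corrected test
```

This should agree with the prop.test(damages) output shown above, since the matrix form is just another way to supply the same successes and failures.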
On Aug 24, 2010, at 4:12 PM, Marino Taussig De Bodonia, Agnese wrote:

> Hello,
>
> I am trying to see whether there has been a significant difference in whether people experienced damages from wildlife in two different years. I therefore have two columns:
>
> year 1:
> yes
> no
> no
> no
> yes
> yes
> no
>
> year 2:
> no
> yes
> no
> yes
>
> I wanted to do a chisq.test, but if I enter it this way:
>
> chisq.test(year1, year2)
>
> I get the error saying the columns are two different lengths. So then I tried doing:
>
> damages<-matrix(c(3,4, 2,2), ncol=2, dimnames=list(answer=c("yes", "no"), year=c("year1", year2)))
> chisq.test(damages)
>
> Does that make sense? Should I maybe be doing a different test instead?

The procedure is fine as such. A more automated way would be to

  mat <- cbind(table(year1),table(year2))
  chisq.test(mat)

(some may prefer rbind(...), but the chi-square won't care)

The issue with the two-variable format is that it expects cross-classifying factors of the same individuals, not two independent groups. So you might do

  answer <- c(year1,year2)
  year <- rep(1:2, c(length(year1),length(year2)))
  table(answer, year) # just for enlightenment
  chisq.test(answer, year)

Another matter is that you are below the usual rule of thumb for chi-square: expected >5 obs in all 4 cells, which is obviously not going to happen with 10 observations in total. fisher.test is an option, but you need pretty extreme configurations to obtain significance.

(BTW, all of the above assumes that there are no empty cells. Caveat emptor.)

>
> Any help would be appreciated, thank you.
>
> Agnese

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
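The long-format approach and the fisher.test suggestion in the reply above can be sketched as follows. The two vectors are typed in from the original question; the factor labels and the names tab, expected and ft are illustrative choices, not anything from the thread:

```r
# Raw answers as listed in the original question.
year1 <- c("yes", "no", "no", "no", "yes", "yes", "no")
year2 <- c("no", "yes", "no", "yes")

# Stack into long format: one entry per respondent plus a matching year label.
answer <- factor(c(year1, year2), levels = c("yes", "no"))
year   <- factor(rep(c("year1", "year2"),
                     times = c(length(year1), length(year2))))

# 2 x 2 table of year by answer.
tab <- table(year, answer)
tab

# Expected counts under independence; the usual rule of thumb wants
# every one of them to exceed 5.
expected <- suppressWarnings(chisq.test(tab))$expected
expected

# Fisher's exact test does not depend on the large-sample approximation.
ft <- fisher.test(tab)
ft$p.value
```

With these data every expected count is below 5, which is exactly the situation the rule of thumb warns about, so the exact test is the safer thing to report.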