Hi,

Is there a way of measuring dependence between two categorical variables? I tried using the chi-square test to test for independence, but I got an error saying that the lengths of the two vectors don't match. Suppose X and Y are two factors; X has 5 levels and Y has 7 levels. This is what I tried:

> temp <- chisq.test(x, y)

but I got the error "the lengths of the two vectors don't match". Any help will be appreciated.

-- Regards, Rana Shoaaib Mehmood
Hi,

When testing whether random variables X and Y are independent, the usual assumption is that you have n pairs of outcomes, (X1,Y1), (X2,Y2), ..., (Xn,Yn), and you are basically checking whether the value of X is associated with the value of Y. If you have 7 observations of X and 5 separate observations of Y (which have nothing to do with the observations of X), you cannot test for independence.

Regards, Moshe.
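To make the pairing requirement concrete, here is a minimal sketch in R. The data are simulated and the names x, y, tab, res and err are made up for illustration; the point is only that the two factors can have different numbers of levels (5 and 7) as long as they have the same number of observations.

```r
# Hypothetical paired data: element i of x and element i of y
# belong to the same observation.
set.seed(1)
n <- 500
x <- factor(sample(paste0("a", 1:5), n, replace = TRUE))  # 5 levels
y <- factor(sample(paste0("b", 1:7), n, replace = TRUE))  # 7 levels

# Same length, different numbers of levels: this is fine.
tab <- table(x, y)   # 5 x 7 contingency table of the n pairs
res <- chisq.test(tab)
res$parameter        # degrees of freedom: (5 - 1) * (7 - 1)

# Unpaired vectors of different lengths cannot be cross-tabulated,
# so chisq.test() stops with an error; capture it instead of failing.
err <- tryCatch(chisq.test(x[1:5], y[1:7]), error = function(e) e)
conditionMessage(err)
```

If the original x and y really were separate samples of different sizes, no cross-tabulation exists and a different question would have to be asked of the data.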
"Shoaaib Mehmood" <shoaaib at gmail.com> wrote in news:ab02bb240711220316q25e0bbd6rd2de31610c245422 at mail.gmail.com:> hi, > > is there a way of calculating of measuring dependence between two > categorical variables. i tried using the chi square test to test for > independence but i got error saying that the lengths of the two > vectors don't match. Suppose X and Y are two factors. X has 5 levels > and Y has 7 levels. This is what i tried doing > >>temp<-chisq.test(x,y) > > but got error "the lengths of the two vectors don't match". any help > will be appreciatedIf you posted the table, it might be more clear why the error was being thrown. In the example shown you have mixed "x" and "X". They would be different in R. chisq.test should not be having a problem with unequal row and column lengths. #simulate a 5 x 7 table> TT<-r2dtable(1,5*c(1,8,5,8,4),5*c(3,3,3,3,4,4,6)) > TT[[1]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,] 0 1 1 0 2 1 0 [2,] 3 3 6 6 2 8 12 [3,] 1 2 3 3 9 2 5 [4,] 8 3 3 3 6 7 10 [5,] 3 6 2 3 1 2 3 #general test for association> chisq.test(TT[[1]],TT[[2]])Pearson's Chi-squared test data: TT[[1]] X-squared = 33.5942, df = 24, p-value = 0.09214 Warning message: In chisq.test(TT[[1]], TT[[2]]) : Chi-squared approximation may be incorrect -- David Winsemius
I can't find help for xtab. Which package contains this function?

On Nov 24, 2007 12:16 PM, G Ilhamto <gilhamto at gmail.com> wrote:
> hi shohaib,
> have you tried xtab instead of chisq.test?
>
> Ilham

-- Regards, Rana Shoaaib Mehmood
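There is no function called xtab in base R. The suggestion presumably meant xtabs(), which is in the stats package that loads with R by default, so help("xtabs") should find it. A minimal sketch, assuming a data frame with one row per observation (the data frame d and its contents are invented for illustration):

```r
# Hypothetical data frame: one row per observation, two factors.
d <- data.frame(
  x = factor(c("a", "a", "b", "b", "c", "c", "a", "b")),
  y = factor(c("u", "v", "u", "v", "u", "v", "u", "v"))
)

# Cross-tabulate with a formula interface; the result is a contingency table.
tab <- xtabs(~ x + y, data = d)
tab

# summary() of such a table reports a chi-squared test for independence
# of the factors (with so few cases the approximation is poor here).
summary(tab)
```

xtabs() is an alternative to table() for building the table, not a replacement for the test itself; the resulting table can also be passed to chisq.test().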
Bernardo Rangel Tura
2007-Nov-26 08:28 UTC
[R] testing independence of categorical variables
Hi Shoaaib,

Try chisq.test(table(x, y)), which tests independence on the cross-tabulation of the two factors. If you call chisq.test(x, y) directly, x and y must be factors (or vectors) of the same length, one pair per observation, and R builds the table for you. R performs a goodness-of-fit test instead only when it is given a single vector of counts.

-- Bernardo Rangel Tura, M.D., Ph.D.
National Institute of Cardiology, Brazil
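For what it is worth, the behaviour of the different call forms can be checked directly. The sketch below uses simulated factors (the names x, y, a, b and g are assumptions made for illustration) and compares the two-argument call, the table call, and the single-argument call.

```r
# Simulated paired factors of equal length.
set.seed(7)
x <- factor(sample(c("low", "mid", "high"), 300, replace = TRUE))
y <- factor(sample(c("yes", "no"), 300, replace = TRUE))

a <- chisq.test(x, y)         # two factors: cross-tabulated internally
b <- chisq.test(table(x, y))  # explicit contingency table

a$method                      # "Pearson's Chi-squared test"
all.equal(unname(a$statistic), unname(b$statistic))  # same test either way

# A single vector of counts gives a goodness-of-fit test against
# equal probabilities instead.
g <- chisq.test(table(x))
g$method                      # "Chi-squared test for given probabilities"
g$parameter                   # df = 3 - 1
```

So the two forms agree whenever x and y are paired; the table form mainly makes it explicit what is being tested.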
The chi-square test does not need your two categorical variables to have the same number of levels, and there is no limit on the number of levels. The chi-square procedure is as follows:

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

$$E_{ij} = n \left(\frac{i\text{th row total}}{n}\right)\left(\frac{j\text{th column total}}{n}\right)$$

$$df = (\text{rows} - 1)(\text{columns} - 1)$$

Computed this way, the test should not give you any errors, whatever the dimensions of the table. I usually use SAS for calculations like this. Below is sample code I wrote to test whether US state and blood type are independent. You can modify it for your data.

data bloodtype;
  input bloodtype $ state $ count @@;
  datalines;
A FL 122 B FL 117 AB FL 19 O FL 244
A IA 1781 B IA 351 AB IA 289 O IA 3301
A MO 353 B MO 269 AB MO 60 O MO 713
;
proc freq data=bloodtype;
  tables bloodtype*state / cellchi2 chisq expected norow nocol nopercent;
  weight count;
run;

Best,
Ramin
Gainesville
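Since the thread is about R, here is a rough R counterpart of the SAS example: a sketch that computes the expected counts, the statistic and the degrees of freedom by hand from the formulas above and compares them with chisq.test(). The counts are the ones in the SAS datalines; the object names (blood, tab, expected, x2, df, p, res) are assumptions made for illustration.

```r
# Blood type by state counts from the SAS datalines above.
blood <- data.frame(
  bloodtype = rep(c("A", "B", "AB", "O"), times = 3),
  state     = rep(c("FL", "IA", "MO"), each = 4),
  count     = c(122, 117, 19, 244,
                1781, 351, 289, 3301,
                353, 269, 60, 713)
)

# Weighted cross-tabulation (the analogue of WEIGHT count in PROC FREQ).
tab <- xtabs(count ~ bloodtype + state, data = blood)

n        <- sum(tab)
# E_ij = n * (row total / n) * (column total / n)
expected <- outer(rowSums(tab), colSums(tab)) / n
x2       <- sum((tab - expected)^2 / expected)
df       <- (nrow(tab) - 1) * (ncol(tab) - 1)
p        <- pchisq(x2, df, lower.tail = FALSE)

# The built-in test should give the same statistic.
res <- chisq.test(tab)
all.equal(x2, unname(res$statistic))
```

The table here is 4 x 3, so the row and column counts differ, which is exactly the situation the original question worried about.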