James Bull
2011-Jun-14 02:20 UTC
[R] Analyzing three-way contingency tables with many zero cells
Hi all,

I am trying to analyze the following data. The first three columns are categorical variables (colors of three traits for a peripatus species) and the last column is the count of individuals in each three-way classification. I wish to test whether the three traits vary independently or are correlated across individuals, i.e. this is a basic three-way contingency table question.

Segment,Body,Pattern,Count
1,1,1,91
1,1,2,139
1,1,3,2
1,1,4,195
1,2,1,0
1,2,2,0
1,2,3,0
1,2,4,0
1,3,1,1
1,3,2,1
1,3,3,0
1,3,4,0
2,1,1,5
2,1,2,34
2,1,3,6
2,1,4,80
2,2,1,2
2,2,2,0
2,2,3,0
2,2,4,14
2,3,1,2
2,3,2,3
2,3,3,6
2,3,4,376
3,1,1,1
3,1,2,0
3,1,3,0
3,1,4,0
3,2,1,0
3,2,2,0
3,2,3,0
3,2,4,0
3,3,1,0
3,3,2,1
3,3,3,0
3,3,4,71

I can run the following code, but am unsure whether the log-linear model is appropriate given the large number of zero cells in my table. I have looked for permutation-based equivalents to avoid this issue, but can only find mh_test in the coin package, which I don't think is appropriate, as none of my factors can be considered a blocking factor.

# Import data
COR <- read.table('Independence of traits.csv', header=TRUE, sep=',', strip.white=TRUE)

# Treat the integer codes as categorical; otherwise glm() fits them as numeric covariates
COR[c("Segment", "Body", "Pattern")] <- lapply(COR[c("Segment", "Body", "Pattern")], factor)

# Transform data into an appropriate table form
COR.tab <- xtabs(Count ~ Segment + Body + Pattern, COR)

# Run the G2 log-linear model test; not sure this is appropriate with so many empty cells
COR.glmF <- glm(Count ~ Segment * Body * Pattern, family=poisson, COR)
anova(COR.glmF, test="Chisq")

Any advice would be greatly appreciated.

Many thanks in advance,
James Bull (Monash University, Australia)
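
For reference, one permutation-based route that needs no blocking factor is a Monte Carlo test of mutual independence. The sketch below is only an illustration under stated assumptions, not an established recipe from the coin package. It rebuilds the counts listed above, expands them to one row per individual, and compares the observed Pearson chi-square statistic with its distribution when Body and Pattern are shuffled independently of Segment. The names mutual_stat, ind and n_perm are hypothetical, and the choice of the Pearson statistic and of 999 permutations is an assumption.

```r
# Counts from the post, with Pattern varying fastest, then Body, then Segment.
counts <- c(91, 139, 2, 195,   0, 0, 0, 0,    1, 1, 0, 0,
            5, 34, 6, 80,      2, 0, 0, 14,   2, 3, 6, 376,
            1, 0, 0, 0,        0, 0, 0, 0,    0, 1, 0, 71)

# expand.grid() varies its first argument fastest, matching the order above.
COR <- expand.grid(Pattern = factor(1:4), Body = factor(1:3), Segment = factor(1:3))
COR$Count <- counts

# One row per individual, so that trait labels can be shuffled.
ind <- COR[rep(seq_len(nrow(COR)), COR$Count), c("Segment", "Body", "Pattern")]

# Pearson chi-square statistic against the mutual-independence model,
# whose expected counts are products of the three one-way margins.
mutual_stat <- function(seg, body, pat) {
  obs <- unclass(table(seg, body, pat))
  n <- sum(obs)
  expected <- outer(outer(as.vector(table(seg)), as.vector(table(body))),
                    as.vector(table(pat))) / n^2
  sum((obs - expected)^2 / expected)
}

set.seed(1)      # for reproducibility of the random permutations
n_perm <- 999    # number of random permutations (an arbitrary choice)

obs_stat <- mutual_stat(ind$Segment, ind$Body, ind$Pattern)

# Shuffling Body and Pattern independently keeps all one-way margins fixed
# while breaking any association among the three traits.
perm_stat <- replicate(n_perm,
  mutual_stat(ind$Segment, sample(ind$Body), sample(ind$Pattern)))

# Monte Carlo p-value, counting the observed table as one of the permutations.
p_value <- (1 + sum(perm_stat >= obs_stat)) / (n_perm + 1)
p_value
```

This tests only the global hypothesis that all three traits are mutually independent. It does not say which pairs of traits are associated, so a rejection would still need follow-up, for example by permuting one trait at a time.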