christiaan pauw
2011-Oct-04 09:28 UTC
[R] matrix of chi-square results for all combinations of data frame
Hi everybody,

I have a questionnaire with many questions that allow more than one option to be chosen (like tick boxes in an HTML form). The data are captured on a mobile device and supplied in a format where every option is a separate logical variable. I want to develop a generic function to process these questions. As part of the analysis, I want to make a matrix of the p-values from the chi-square test for all combinations of options for each question.

I tried to make a data frame with all possible combinations and then use it in a loop to get the p-values with chisq.test(). It works if I specify the combination by hand, but not in the loop. What am I doing wrong?

I would appreciate any advice. Sample code below.

Thanks in advance
Christiaan

# Sample Code
# create test data
df=data.frame(x=sample(0:1,100,replace=TRUE),
              x.1=sample(0:1,100,replace=TRUE),
              x.2=sample(0:1,100,replace=TRUE),
              x.3=sample(0:1,100,replace=TRUE))

# make a data frame of all possible combinations
grd=expand.grid(colnames(df),colnames(df))

# make vector of p values
pval <- for (i in 1:length(grd[,1])){
  chisq.test(df[,paste(grd$Var1[[i]])], df[,paste(grd$Var2[[i]])], correct = TRUE)$p.value
}

# It works if I do
i=3
# and then
chisq.test(df[,paste(grd$Var1[[i]])], df[,paste(grd$Var2[[i]])], correct = TRUE)$p.value

Why does this not work in the loop?

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] Hmisc_3.8-0 survival_2.35-8 prettyR_1.8-1

loaded via a namespace (and not attached):
[1] cluster_1.12.3 grid_2.11.1 lattice_0.18-8 tools_2.11.1
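
A possible explanation for the loop behaviour, offered as a sketch rather than a definitive answer: in R a for loop always returns NULL (invisibly), so assigning the loop itself to pval leaves pval as NULL, even though each chisq.test() call inside it succeeds. One way around this is to collect the values with mapply() and then reshape them into a matrix. In the code below, set.seed(), the helper name pair_pvalue, and the result name pmat are illustrative assumptions and not part of the original post.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

# Test data in the same shape as the post: one 0/1 column per option
df <- data.frame(
  x   = sample(0:1, 100, replace = TRUE),
  x.1 = sample(0:1, 100, replace = TRUE),
  x.2 = sample(0:1, 100, replace = TRUE),
  x.3 = sample(0:1, 100, replace = TRUE)
)

# All ordered pairs of column names, kept as character instead of factor
grd <- expand.grid(Var1 = colnames(df), Var2 = colnames(df),
                   stringsAsFactors = FALSE)

# Hypothetical helper: p-value of the chi-square test for two columns of df
pair_pvalue <- function(a, b) {
  chisq.test(df[[a]], df[[b]], correct = TRUE)$p.value
}

# mapply() returns the values, unlike a for loop, which returns NULL
pval <- mapply(pair_pvalue, grd$Var1, grd$Var2)

# expand.grid() varies Var1 fastest, so a column-wise fill puts Var1 on the
# rows and Var2 on the columns
pmat <- matrix(pval, nrow = ncol(df),
               dimnames = list(colnames(df), colnames(df)))
```

The diagonal entries test each column against itself and are not informative. Note also that chisq.test() stops with an error if a column contains only one distinct value, so a generic function would probably need to check for that case first.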