Hi,

I have a large binary data set with no covariates: positions along a genomic sequence, where a match to the reference sequence is coded 0 and a change is coded 1. I have 99 positions and 2000 sequences to analyze.

I want to run a univariate analysis to isolate the positions where most of the changes occur across all sequences, then run a multivariate analysis to assess the significance of these positions as a whole. I would like to do this in R but don't know how. Please help.

e.g.

1 2 3 4 5 6 7 8   (positions)
0 0 0 0 1 0 0 0   (sequence1)
1 0 0 0 0 1 0 0   (sequence2)
1 0 0 0 1 0 0 0

Thanks in advance...
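
For what it's worth, one possible way to approach the two steps described above in base R is sketched below. Everything in it is an assumption rather than an established method for this data: the matrix `x` is simulated stand-in data, the background rate `p0` is a guess taken from the median position, and reading "multivariate" as pairwise association between the flagged positions is only one interpretation.

```r
# Simulated stand-in data (made up for illustration): 2000 sequences x 99 positions
set.seed(1)
n_seq <- 2000
n_pos <- 99
rate <- rep(0.02, n_pos)        # hypothetical background change rate
rate[c(5, 40)] <- 0.30          # two hypothetical positions with many changes
x <- sapply(rate, function(p) rbinom(n_seq, 1, p))
colnames(x) <- paste0("pos", seq_len(n_pos))

# Univariate step: proportion of sequences changed at each position
freq <- colMeans(x)

# One-sided exact binomial test per position against an assumed background
# rate p0 (assumption: the median position reflects the background)
p0 <- median(freq)
pval <- sapply(colSums(x), function(k)
  binom.test(k, n_seq, p = p0, alternative = "greater")$p.value)
padj <- p.adjust(pval, method = "holm")   # correct for testing 99 positions
hot <- names(padj)[padj < 0.05]

# Joint step (one reading of "multivariate"): Fisher's exact test for
# association between every pair of flagged positions
if (length(hot) >= 2) {
  pairs <- combn(hot, 2)
  assoc_p <- apply(pairs, 2, function(pr)
    fisher.test(table(factor(x[, pr[1]], levels = 0:1),
                      factor(x[, pr[2]], levels = 0:1)))$p.value)
  names(assoc_p) <- apply(pairs, 2, paste, collapse = "-")
}
```

With real data, `x` would be replaced by the actual 2000 x 99 matrix of 0/1 values (for example read in with `as.matrix(read.table(...))`), and the choice of `p0` would need a biological justification.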