Hi,
I have a large binary (0/1) data set with no covariates: positions along a
genomic sequence, where a match to the reference sequence is coded as 0 and a
change is coded as 1. I have 99 positions and 2000 sequences to analyze. I want
to run a univariate analysis to isolate the positions where most of the changes
occur across all sequences, then run a multivariate analysis to assess the
significance of these positions as a whole. I would like to do this in R but
don't know how. Please help.
e.g.
1 2 3 4 5 6 7 8 (positions)
0 0 0 0 1 0 0 0 (sequence1)
1 0 0 0 0 1 0 0 (sequence2)
1 0 0 0 1 0 0 0
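
In R terms, I imagine the data as a 0/1 matrix with one row per sequence and one column per position. Below is a minimal sketch of the toy example only, not the analysis I am asking about. The object names (x, change_freq), the column names and the "sequence3" row label are placeholders I made up for illustration.

```r
# Toy version of the example above: rows are sequences, columns are positions.
# 0 = same as reference, 1 = change.
# "sequence3" is an assumed label; the third row is unlabelled in the example.
x <- rbind(
  sequence1 = c(0, 0, 0, 0, 1, 0, 0, 0),
  sequence2 = c(1, 0, 0, 0, 0, 1, 0, 0),
  sequence3 = c(1, 0, 0, 0, 1, 0, 0, 0)
)
colnames(x) <- paste0("pos", 1:8)

# Proportion of sequences that carry a change at each position
change_freq <- colMeans(x)

# Positions ordered from most to least frequently changed
sort(change_freq, decreasing = TRUE)
```

The real data would be a 2000 x 99 matrix of the same form.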
Thanks in advance...