Buergmann, Helmut
2010-Mar-23 10:40 UTC
[R] Extracting all members with a specific similarity value from a large similarity matrix
I have a large data frame (1400 x 1400) containing a symmetric similarity matrix. I would like to extract subsets of elements in which every element has a specific similarity to every other element of the subset. For example, if the data look like this:

      Spl1  Spl2  Spl3  Spl4  Spl5  [...]
Spl1  1     0.125 0.000 0.000 0.125
Spl2  0.125 1     0.000 0.000 0.125
Spl3  0.000 0.000 1     0.000 0.500
Spl4  0.000 0.000 0.000 1     0.750
Spl5  0.125 0.125 0.500 0.750 1
[...]

I am looking for a way either to extract all elements that mutually have similarity 0, e.g.:

      Spl1  Spl3  Spl4  [...]
Spl1  1     0.000 0.000
Spl3  0.000 1     0.000
Spl4  0.000 0.000 1
[...]

or all elements that mutually have similarity 0.125:

      Spl1  Spl2  Spl5  [...]
Spl1  1     0.125 0.125
Spl2  0.125 1     0.125
Spl5  0.125 0.125 1
[...]

Alternatively, I would like to sort the table so that this information can easily be read off as blocks around the diagonal, like this:

      Spl3  Spl4  Spl1  Spl2  Spl5  [...]
Spl3  1     0.000 0.000 0.000 0.500
Spl4  0.000 1     0.000 0.000 0.750
Spl1  0.000 0.000 1     0.125 0.125
Spl2  0.000 0.000 0.125 1     0.125
Spl5  0.500 0.750 0.125 0.125 1
[...]

Any help is much appreciated!

Helmut Bürgmann, Switzerland
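
One possible way to approach the extraction described above, sketched in base R: treat every pair with the wanted similarity as an edge in a graph, and list the maximal groups in which all pairs are connected (maximal cliques). The helper name `mutual_groups`, its `tol` and `min_size` arguments, and the toy matrix `sim` are made up for illustration and are not part of the original post; the search is a plain Bron-Kerbosch recursion, which may be slow on a 1400 x 1400 matrix with many matching pairs.

```r
# Toy data: the 5 x 5 example from the post, stored as a numeric matrix.
sim <- matrix(c(1,     0.125, 0,   0,    0.125,
                0.125, 1,     0,   0,    0.125,
                0,     0,     1,   0,    0.5,
                0,     0,     0,   1,    0.75,
                0.125, 0.125, 0.5, 0.75, 1),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("Spl", 1:5), paste0("Spl", 1:5)))

# Hypothetical helper: all maximal groups (of at least `min_size` members)
# whose members pairwise have similarity `value`, within tolerance `tol`.
mutual_groups <- function(sim, value, tol = 1e-8, min_size = 2) {
  sim <- as.matrix(sim)            # also accepts a data frame
  adj <- abs(sim - value) <= tol   # TRUE where a pair has the wanted value
  diag(adj) <- FALSE               # ignore self-similarity
  found <- list()

  # Bron-Kerbosch without pivoting:
  # R = current group, P = candidates to add, X = already handled.
  expand <- function(R, P, X) {
    if (length(P) == 0 && length(X) == 0) {
      found[[length(found) + 1]] <<- rownames(sim)[sort(R)]
      return(invisible(NULL))
    }
    for (v in P) {
      nb <- which(adj[v, ])        # neighbours of v
      expand(c(R, v), intersect(P, nb), intersect(X, nb))
      P <- setdiff(P, v)
      X <- c(X, v)
    }
    invisible(NULL)
  }

  expand(integer(0), seq_len(nrow(sim)), integer(0))
  Filter(function(g) length(g) >= min_size, found)
}

groups <- mutual_groups(sim, 0.125)
print(groups)
print(sim[groups[[1]], groups[[1]]])   # sub-matrix for the first group

# For the "sorted table" variant, one assumption-laden option is to reorder
# by hierarchical clustering on 1 - similarity; this tends to place similar
# samples next to each other but does not guarantee exact-value blocks.
ord <- hclust(as.dist(1 - sim), method = "average")$order
print(sim[ord, ord])
```

On the toy data, `mutual_groups(sim, 0.125)` should return the single group Spl1, Spl2, Spl5, while `mutual_groups(sim, 0)` returns two overlapping groups (Spl1, Spl3, Spl4 and Spl2, Spl3, Spl4), since a sample can belong to more than one mutually consistent subset. If the igraph package is available, building a graph with `graph_from_adjacency_matrix()` and calling `max_cliques()` is likely a faster route for the full matrix.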