Anders Malmendal
2007-May-29 09:15 UTC
[R] hierarchical cluster analysis of groups of vectors
I want to do hierarchical cluster analysis to compare 10 groups of vectors with five vectors in each group (i.e. I want to make a dendrogram showing the clustering of the different groups). I've looked into using dist and hclust, but cannot see how to compare the different groups instead of the individual vectors. I am thankful for any help. Anders
It seems that you already have groups defined. Discriminant analysis would probably be more appropriate for what you want.

Best regards,
Rafael Duarte

--
Rafael Duarte
Marine Resources Department - DRM
IPIMAR - National Research Institute for Agriculture and Fisheries
Av. Brasília, 1449-006 Lisbon - Portugal
Tel: +351 21 302 7000 Fax: +351 21 301 5948
e-mail: rduarte at ipimar.pt

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
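A minimal R sketch of the discriminant analysis suggestion, assuming the 50 vectors are rows of a data frame with a grouping factor. The data are simulated stand-ins, and the names dat, group and n_vars, as well as the choice of 8 variables, are illustrative assumptions rather than anything given in the thread. It uses lda() from the MASS package.

```r
library(MASS)  # recommended package shipped with standard R installations

# Simulated stand-in data (assumption): 10 groups of 5 vectors, 8 variables each.
set.seed(1)
n_groups <- 10
n_per_group <- 5
n_vars <- 8
group <- factor(rep(paste0("G", 1:n_groups), each = n_per_group),
                levels = paste0("G", 1:n_groups))
centres <- matrix(rnorm(n_groups * n_vars, sd = 3), nrow = n_groups)
noise <- matrix(rnorm(n_groups * n_per_group * n_vars), ncol = n_vars)
x <- centres[as.integer(group), ] + noise
colnames(x) <- paste0("v", 1:n_vars)
dat <- data.frame(group = group, x)

# Linear discriminant analysis with the predefined groups as the response.
fit <- lda(group ~ ., data = dat)
pred <- predict(fit, newdata = dat)

# Resubstitution confusion table (optimistic, since the same data are reused).
conf_tab <- table(observed = dat$group, predicted = pred$class)
print(conf_tab)

# Mean position of each group on the first two discriminant axes.
ld_means <- aggregate(pred$x[, 1:2], by = list(group = dat$group), FUN = mean)
print(ld_means)
```

With only five vectors per group, the resubstitution table will tend to overstate how well the groups separate; lda() also accepts CV = TRUE for leave-one-out classification.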
Anders;

If you want to _test_ for differences, ANOVA applied to the (typically) first principal component scores for each object would give a fairly quick indication of whether there was a case to answer (though scaling is an issue to be aware of; a low-variance variable might differ strongly between groups yet be masked by a larger-variance variable with no group association unless you get the scaling right for the circumstances).

If you just want to cluster the 10 groups, I suspect it might be simplest to "average" (where "average" implies some consistent summary statistic for each variable) your starting vectors _before_ playing about with your distance matrix; after all, it is the inter-"mean" distances you are after, so why not get the "means" in the first place? Of course, scaling is again an issue if the variables differ in variance...

Steve E
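A minimal R sketch of both suggestions above, assuming the 50 vectors are rows of a numeric matrix with a grouping factor. The data are simulated stand-ins; the names x, group and n_vars, the choice of 8 variables, the use of the per-variable mean as the "average", unit-variance scaling, and average linkage are all illustrative assumptions rather than anything specified in the thread.

```r
# Simulated stand-in data (assumption): 10 groups of 5 vectors, 8 variables each.
set.seed(1)
n_groups <- 10
n_per_group <- 5
n_vars <- 8
group <- factor(rep(paste0("G", 1:n_groups), each = n_per_group),
                levels = paste0("G", 1:n_groups))
centres <- matrix(rnorm(n_groups * n_vars, sd = 3), nrow = n_groups)
noise <- matrix(rnorm(n_groups * n_per_group * n_vars), ncol = n_vars)
x <- centres[as.integer(group), ] + noise
colnames(x) <- paste0("v", 1:n_vars)

# Put the variables on a common scale before computing any distances.
x_scaled <- scale(x)

# "Average" each group: one row of per-variable means per group.
means_df <- aggregate(x_scaled, by = list(group = group), FUN = mean)
group_means <- as.matrix(means_df[, -1])
rownames(group_means) <- as.character(means_df$group)

# Cluster the 10 group means and draw the dendrogram.
hc <- hclust(dist(group_means), method = "average")
plot(hc, main = "Clustering of group means")

# Quick test for group differences: one-way ANOVA on the first PC scores.
pc <- prcomp(x, scale. = TRUE)
pc1 <- pc$x[, 1]
fit <- aov(pc1 ~ group)
print(summary(fit))
```

Scaling every variable to unit variance is only one option; as noted above, the appropriate scaling depends on the circumstances, and a different summary statistic (for example the median) can be substituted for mean in the aggregate() call.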