On Thu, Mar 15, 2012 at 1:48 PM, A J <anxusgo at hotmail.com> wrote:
> Hi everybody!
> Does anybody know how I can get detailed information about clusters after
> using hclust?
> The issue is that if I have items in different clusters, I would like to
> get the cluster in which each item is placed.
> Since my data set is very large, a dendrogram or other graphic is not
> useful; what I really need is something like a simple table with item
> label and cluster name, for instance.
> Is it possible to do this in any way in R?
>
> Here is the code example I am starting from:
>
> a <- replicate(2000, rnorm(2000))
> b <- hclust(as.dist(a), method="ward", members=NULL)
>
> And this is the information that I get:
>
> structure(list(merge = structure(c(-6L, -5L, -7L, -3L, -1L, -2L, 3L, 4L,
> 5L, -10L, -9L, -8L, 1L, -4L, 2L, 6L, 7L, 8L), .Dim = c(9L, 2L)), height =
> c(-2.16431780288644, -1.77785380974643, -1.72883152083299,
> -1.02930929735342, -0.957628473035096, -0.687733358846453,
> 1.62427849392232, 2.78818645913762, 3.01723103257677), order = c(1L, 4L,
> 3L, 6L, 10L, 7L, 8L, 2L, 5L, 9L), labels = NULL, method = "ward", call =
> quote(hclust(d = as.dist(a), method = "ward", members = NULL)),
> dist.method = NULL), .Names = c("merge", "height", "order", "labels",
> "method", "call", "dist.method"), class = "hclust")
>
> I just need every item with its corresponding cluster in a more or less
> organized way. Of course, there is no problem in using different functions
> or libraries (so far I have not found anything suited to my needs). Advice
> or pointers are welcome and appreciated!

hclust by itself does not generate clusters; rather, it generates a
clustering tree. You need to identify branches (clusters) in the tree
using a "branch cutting" method. This typically entails choosing one
or more parameters that specify how sensitive the cut method should be
to branch splits.
You can do that in several ways. Simple tree cut is implemented in the
function cutree (package stats). You can specify the number of
clusters or the cut height. More advanced methods are implemented in
the function cutreeDynamic in the dynamicTreeCut package (shameless
plug alert - I'm the maintainer). Examples of use and results from the
dynamicTreeCut package can be seen at
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/
Our group has used the dynamicTreeCut methods extensively in
clustering gene expression data.
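
As a minimal sketch of the cutree route (the toy data, the item names, the
"average" linkage and the values k = 4, h = 2.5 and minClusterSize = 3 are
all assumptions chosen for illustration, not taken from the original
question), the item/cluster table could be built like this:

```r
# Toy data: 20 items described by 5 variables (made up for illustration)
set.seed(1)
x <- matrix(rnorm(100), nrow = 20,
            dimnames = list(paste0("item", 1:20), NULL))

d  <- dist(x)                        # Euclidean distances between items
hc <- hclust(d, method = "average")  # the clustering tree

# Cut the tree into a fixed number of clusters ...
cl_k <- cutree(hc, k = 4)
# ... or at a fixed height on the tree
cl_h <- cutree(hc, h = 2.5)

# Simple table: one row per item with its cluster label
result <- data.frame(item = names(cl_k), cluster = unname(cl_k))
head(result)
table(result$cluster)                # cluster sizes

# Optional: dynamic branch cutting, only if dynamicTreeCut is installed.
# Label 0 marks items left unassigned.
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  cl_dyn <- dynamicTreeCut::cutreeDynamic(dendro = hc,
                                          distM = as.matrix(d),
                                          method = "hybrid",
                                          minClusterSize = 3)
  table(cl_dyn)
}
```

If the distance object carries no labels (as in the posted output, where
labels = NULL), cutree returns an unnamed vector, so something like
seq_along(cl_k) can stand in for the item column.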
HTH,
Peter