thr3ads.net - R help - [R] figuring out the results from hclust [May 2008]

Karin Lagesen

2008-May-31 18:03 UTC

[R] figuring out the results from hclust

I have two examples that I run hclust on:

a = c(0,1,1.5,1.5)
b = c(1,0,1.5,1.5)
c = c(1.5,1.5,0,0.5)
d = c(1.5,1.5,0.5,0)
ll = as.matrix(rbind(a,b,c,d))
test = as.dist(ll)
long = hclust(test)

a = c(0,0.3,1,1)
b = c(0.3,0,1,1)
c = c(1,1,0,0.5)
d = c(1,1,0.5,0)
ll = as.matrix(rbind(a,b,c,d))
test = as.dist(ll)
short = hclust(test)

The main difference between them is whether a and b get merged
higher up (long) or lower down (short) than the c,d cluster.

I am working on partitioning this kind of data into three clusters. I
know I can do that with cutree. The result I get from that is the
following: 
> cutree(short, k=3)
a b c d 
1 1 2 3 
> cutree(long, k=3)
a b c d 
1 2 3 3 
And I can also access the height vector for both:
> short$height
[1] 0.3 0.5 1.0
> long$height
[1] 0.5 1.0 1.5
So I know at what heights they get merged.

What I seem to be unable to get at is which of the clusters reported by
cutree correspond to which split. When I examine short in a plot I can
easily see that the highest split (i.e. the one corresponding to the
last height, 1, in the height vector) is between cutree cluster 1 and
clusters 2,3. In the long example this split is between clusters 1,2
and cluster 3. I would, however, like to avoid examining all of my
data by hand :)
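
One possible way to read this off without plotting, offered only as a sketch: cutree accepts a vector of k values, so the two-cluster cut (the two sides of the top split) can be cross-tabulated against the three-cluster cut. The names m, ct and split_table below are illustrative and not from the original post; the data are the "short" example rebuilt so the snippet runs on its own.

```r
# Rebuild the "short" example (complete linkage is hclust's default).
m <- rbind(a = c(0, 0.3, 1, 1),
           b = c(0.3, 0, 1, 1),
           c = c(1, 1, 0, 0.5),
           d = c(1, 1, 0.5, 0))
short <- hclust(as.dist(m))

# Asking cutree for several k at once returns a matrix, one column per k.
ct <- cutree(short, k = 2:3)

# Rows: side of the top split (k = 2); columns: three-cluster labels (k = 3).
# A non-zero cell means that k = 3 cluster lies on that side of the split.
split_table <- table(top_split = ct[, "2"], three_clusters = ct[, "3"])
print(split_table)
```

Each column of the table should have exactly one non-zero row, which says on which side of the top split that three-cluster label falls.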

Could any of you point me to what I need to do to get at this data? I
have tried to examine the merge data in both cases, but I am coming up
short.
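
For reference, a hedged sketch of how the merge component might be walked: in the merge matrix, negative entries are single observations and positive entries refer to earlier rows, so the last row holds the two branches of the top split. The helper members() is hypothetical (it is not part of R), and the data are the "long" example rebuilt here so the snippet is self-contained.

```r
# Rebuild the "long" example (complete linkage is hclust's default).
m <- rbind(a = c(0, 1, 1.5, 1.5),
           b = c(1, 0, 1.5, 1.5),
           c = c(1.5, 1.5, 0, 0.5),
           d = c(1.5, 1.5, 0.5, 0))
long <- hclust(as.dist(m))

# Hypothetical helper: collect the labels under one entry of hc$merge.
# Negative entry = single observation; positive entry = earlier merge row.
members <- function(hc, node) {
  if (node < 0) return(hc$labels[-node])
  c(members(hc, hc$merge[node, 1]), members(hc, hc$merge[node, 2]))
}

top    <- nrow(long$merge)                # last row = highest merge
left   <- members(long, long$merge[top, 1])
right  <- members(long, long$merge[top, 2])
groups <- cutree(long, k = 3)

# Which k = 3 clusters sit on each side of the top split.
print(unique(groups[left]))
print(unique(groups[right]))
```

The same helper can be applied to any other row of merge to see which cutree clusters a lower split separates.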

Thanks!

Karin
-- 
Karin Lagesen, PhD student
karin.lagesen at medisin.uio.no
http://folk.uio.no/karinlag
