Dear R users,
I am writing about the graphical output of silhouette() (cluster
package).
Here is the example from help(silhouette):
> ar <- agnes(ruspini)
> si3 <- silhouette(cutree(ar, k = 5), # k = 4 gave the same as pam() above
+                   daisy(ruspini))
> plot(si3, nmax = 80, cex.names = 0.5)
The resulting plot suggests that group 1 consists of units 1 to 20,
group 2 of units 21 to 43, group 3 of units 44 to 57, group 4 of
units 58 to 60 and, finally, group 5 of units 61 to 75.
However, this contradicts the printed silhouette object, where the
fourth group consists of units 46 to 48 rather than units 58 to 60
(which belong to the third cluster); see
> si3
cluster neighbor sil_width
[1,] 1 5 0.679838078
[2,] 1 5 0.745615002
[3,] 1 5 0.758796123
[4,] 1 4 0.715554768
[5,] 1 5 0.664657114
[6,] 1 4 0.783993831
[7,] 1 2 0.590057470
[8,] 1 4 0.747969458
[9,] 1 5 0.792304760
[10,] 1 4 0.803547635
[11,] 1 4 0.742402051
[12,] 1 4 0.722302731
[13,] 1 4 0.665412622
[14,] 1 5 0.756910666
[15,] 1 5 0.700685403
[16,] 1 5 0.743601834
[17,] 1 5 0.614854124
[18,] 1 5 0.708007860
[19,] 1 5 0.700093839
[20,] 1 4 0.568989067
[21,] 2 4 0.751866935
[22,] 2 4 0.790783667
[23,] 2 4 0.802659788
[24,] 2 4 0.785895823
[25,] 2 4 0.822943473
[26,] 2 4 0.831313347
[27,] 2 4 0.818043337
[28,] 2 4 0.805454305
[29,] 2 4 0.770547118
[30,] 2 4 0.768289979
[31,] 2 3 0.794485567
[32,] 2 4 0.829925955
[33,] 2 4 0.807379640
[34,] 2 4 0.790626589
[35,] 2 4 0.817427927
[36,] 2 3 0.793572412
[37,] 2 4 0.760561408
[38,] 2 4 0.743170109
[39,] 2 3 0.761413953
[40,] 2 3 0.704193051
[41,] 2 4 0.297007126
[42,] 2 4 0.522049838
[43,] 2 3 0.488556828
[44,] 3 4 0.377632488
[45,] 3 4 0.007214464
[46,] 4 3 0.699407534
[47,] 4 3 0.837451212
[48,] 4 3 0.794349431
[49,] 3 4 0.632862996
[50,] 3 4 0.586149139
[51,] 3 4 0.647326133
[52,] 3 4 0.650020368
[53,] 3 4 0.629131005
[54,] 3 4 0.618843633
[55,] 3 4 0.586439350
[56,] 3 4 0.586788051
[57,] 3 4 0.668108812
[58,] 3 4 0.650074540
[59,] 3 4 0.628444500
[60,] 3 4 0.591393005
[61,] 5 1 0.770110294
[62,] 5 1 0.815309198
[63,] 5 4 0.771622667
[64,] 5 1 0.806125429
[65,] 5 1 0.850310507
[66,] 5 1 0.822984066
[67,] 5 1 0.852743923
[68,] 5 1 0.762055943
[69,] 5 1 0.839180986
[70,] 5 1 0.854894699
[71,] 5 1 0.838106473
[72,] 5 1 0.774812117
[73,] 5 1 0.795021304
[74,] 5 1 0.759681469
[75,] 5 1 0.742553847
attr(,"Ordered")
[1] FALSE
attr(,"call")
silhouette.default(x = cutree(ar, k = 5), dist = daisy(ruspini))
attr(,"class")
[1] "silhouette"
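The membership can also be read directly from the silhouette object rather than from the plot. The following is a minimal sketch, assuming the cluster package and its ruspini data are available; the names `cl` and `members` are introduced here for illustration only.

```r
library(cluster)
data(ruspini)

ar  <- agnes(ruspini)
cl  <- cutree(ar, k = 5)
si3 <- silhouette(cl, daisy(ruspini))

# Row indices (unit numbers) per cluster, taken from the "cluster" column
members <- split(seq_len(nrow(si3)), si3[, "cluster"])

# Units assigned to the fourth cluster (units 46 to 48 in the output above)
members[["4"]]
```

Comparing `members` with the labels drawn in the plot shows which units are mislabelled.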
Thanks for your attention,
Cristiano
---------------------------------
Cristiano Varin
sammy@unive.it
http://www.dst.unive.it/~sammy/
>>>>> "CV" == Cristiano Varin <cristiano.varin at mac.com>
>>>>>     on Fri, 13 Jun 2008 11:31:34 +0200 writes:

    CV> Dear R users, I am mailing you about the graphical
    CV> output of silhouette (cluster package)
    CV> From the example of silhouette in help(silhouette):
    >> ar <- agnes(ruspini)
    >> si3 <- silhouette(cutree(ar, k = 5), # k = 4 gave the same as pam() above
    CV> + daisy(ruspini))
    >> plot(si3, nmax = 80, cex.names = 0.5)
    CV> from which one may conclude that group 1 is composed by units from 1
    CV> to 20, group 2 by units from 21 to 43, group 3 by units from 44 to 57,
    CV> group 4 by units from 58 to 60 and, finally, group 5 by units from 61
    CV> to 75.
    CV> However, this seems to be in contrast with the output of silhouette
    CV> where the fourth group is composed by units from 46 to 48 instead of
    CV> units from 58 to 60 (belonging to the third cluster), see

You are right.
Indeed, I see that this has been a bug in sortSilhouette() ever
since that had been introduced in 2002.

It will be fixed in cluster_1.11.11

Martin Maechler, ETH Zurich
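As an illustration of the ordering at issue in the reply above, the rows of a silhouette object can be sorted by hand while keeping each unit's original number attached. This is only a sketch under stated assumptions, not the fix that went into the cluster package; the names `ord` and `sorted` are hypothetical.

```r
library(cluster)
data(ruspini)

ar  <- agnes(ruspini)
si3 <- silhouette(cutree(ar, k = 5), daisy(ruspini))

# Order rows by cluster, then by decreasing silhouette width
ord <- order(si3[, "cluster"], -si3[, "sil_width"])

# Keep the original unit number next to each sorted row,
# so that labels follow the rows they belong to
sorted <- data.frame(unit      = ord,
                     cluster   = si3[ord, "cluster"],
                     sil_width = si3[ord, "sil_width"])

# Units of the fourth cluster, in sorted order
sorted$unit[sorted$cluster == 4]
```

Carrying `ord` along with the sorted rows is what keeps the unit labels consistent with the printed object.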