thr3ads.net - R help - [R] Optimum # of Clusters using Kmeans [Apr 2007]

2007-Apr-07 17:44 UTC

[R] Optimum # of Clusters using Kmeans

Dear R Users,

I am doing clustering and am wondering:
(1) Is it possible to find the optimum number of clusters using kmeans,
just as can be done with PAM using the silhouette width?

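library(cluster)   # provides pam()

# A is the data set to be clustered
asw <- numeric(20)   # asw[1] stays 0, since k = 1 has no silhouette
for (k in 2:20)
  asw[k] <- pam(A, k)$silinfo$avg.width
k.best <- which.max(asw)
cat("silhouette-optimal number of clusters:", k.best, "\n")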

plot(1:20, asw, type = "h", main = "pam() clustering assessment",
     xlab = "k  (# clusters)", ylab = "average silhouette width")
axis(1, k.best, paste("best", k.best, sep = "\n"), col = "red",
     col.axis = "red")
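
For comparison, the same silhouette search can be sketched with kmeans() in place of pam(). This is only a minimal sketch under assumptions: the data below is simulated purely for illustration and stands in for A, the range 2:10 and nstart = 25 are arbitrary choices, and silhouette() from the cluster package is applied to the kmeans cluster labels together with a Euclidean distance matrix.

```r
library(cluster)   # silhouette()

set.seed(1)
# Hypothetical numeric data standing in for A: three separated groups
A <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2))

d <- dist(A)            # Euclidean distances, computed once
k.max <- 10             # assumed upper bound on the number of clusters
asw <- numeric(k.max)   # asw[1] stays 0, since k = 1 has no silhouette

for (k in 2:k.max) {
  # nstart > 1 reruns kmeans from several random starts and keeps the best
  km <- kmeans(A, centers = k, nstart = 25)
  sil <- silhouette(km$cluster, d)
  asw[k] <- mean(sil[, "sil_width"])
}

k.best <- which.max(asw)
cat("silhouette-optimal number of clusters (kmeans):", k.best, "\n")
```

Because kmeans() starts from random centres, the result can vary between runs unless a seed is set; the asw vector can be plotted exactly as in the pam() example.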

(2) Another question concerns pre-processing the data. I have mixed data
(nominal, numeric, categorical, etc.). Before clustering, I convert all the
nominal data to binary and normalise them.
Is there an elegant way of doing this?
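
One possible way to do the conversion described in (2) is sketched below, assuming the nominal variables are stored as factors. The data frame df and its column names are made up for illustration. model.matrix() expands the factor into 0/1 indicator columns and scale() then standardises every column.

```r
# Hypothetical mixed data: one nominal variable and two numeric ones
df <- data.frame(
  colour = factor(c("red", "blue", "green", "blue", "red", "green")),
  size   = c(1.2, 3.4, 2.2, 5.1, 4.8, 0.9),
  count  = c(10, 25, 14, 40, 38, 8)
)

# "- 1" drops the intercept, so the first factor in the formula gets one
# indicator column per level (any further factors would get one fewer)
X <- model.matrix(~ . - 1, data = df)

# Centre each column to mean 0 and rescale it to standard deviation 1
X.scaled <- scale(X)

print(colnames(X))
print(round(X.scaled, 2))
```

The resulting numeric matrix can be passed to kmeans(). Another option for mixed data is daisy(df, metric = "gower") from the cluster package, which computes dissimilarities directly from mixed-type columns; its output can be given to pam(), though not to kmeans().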

(3) Is there a function to normalise data in R?
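
For (3), base R's scale() standardises the columns of a numeric matrix or data frame. The sketch below shows it next to a min-max rescaling to [0, 1]. The data frame x and the helper range01() are hypothetical names introduced here for illustration, not existing R functions.

```r
# Hypothetical numeric data
x <- data.frame(a = c(1, 2, 3, 4, 5),
                b = c(10, 20, 30, 40, 100))

# z-score standardisation: each column gets mean 0 and sd 1
z <- scale(x)

# Min-max rescaling to [0, 1]; range01() is a user-defined helper and
# assumes the column is not constant (max > min)
range01 <- function(v) (v - min(v)) / (max(v) - min(v))
m <- as.data.frame(lapply(x, range01))

print(z)
print(m)
```

Which form of normalisation is appropriate depends on the data; the z-score version is the one scale() gives by default.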

Thank you

