Jovani T. de Souza
2020-Dec-10 21:20 UTC
[R] Use cluster.stats function from a hierarchical clustering in R
I would greatly appreciate your help. I used the cluster.stats function from the `fpc` package to compare two cluster solutions using a variety of validation criteria, as shown in the code below. I have two questions:

1. Is it possible to tell which solution is more suitable, 2 clusters or 5 clusters? If so, could you explain how I can determine this?
2. Does this package only compare cluster solutions two at a time, or is it possible to compare more than two cluster solutions at once?

Thank you so much!
Best regards.

library(rdist)
library(geosphere)
library(fpc)

df <- structure(list(Industries = c(1,2,3,4,5,6),
                     Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
                     Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
                     Waste = c(526, 350, 526, 469, 534, 346)),
                class = "data.frame", row.names = c(NA, -6L))
df1 <- df

# clusters
coordinates <- df[c("Latitude","Longitude")]
# distm() expects longitude first, hence the column reordering
d <- as.dist(distm(coordinates[,2:1]))
fit.average <- hclust(d, method="average")

# solution with 2 clusters
clusters <- cutree(fit.average, k=2)
df$cluster <- clusters

# solution with 5 clusters
clusters1 <- cutree(fit.average, k=5)
df1$cluster <- clusters1

# compare the two solutions
cluster.stats(d, df$cluster, df1$cluster)
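One possible way to approach both questions is sketched below; it is an illustration, not a definitive recipe. It relies on two points from the `fpc` documentation: when `alt.clustering` is supplied, cluster.stats reports agreement measures (`corrected.rand`, `vi`) for that one pair of partitions, and its internal indices (`avg.silwidth`, `dunn`, `ch`, and others) describe the first clustering only. Comparing more than two solutions therefore appears to need a loop. The range `ks <- 2:5` and the names `parts`, `internal`, `pairs` and `agreement` are assumptions introduced here for illustration; they are not part of the original code.

```r
library(geosphere)  # distm()
library(fpc)        # cluster.stats()

# Same six points as in the post
df <- data.frame(
  Industries = 1:6,
  Latitude   = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude  = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
  Waste      = c(526, 350, 526, 469, 534, 346)
)

# distm() expects longitude first, then latitude
d   <- as.dist(distm(as.matrix(df[, c("Longitude", "Latitude")])))
fit <- hclust(d, method = "average")

# Candidate numbers of clusters (illustrative choice, not from the post)
ks <- 2:5

# cutree() accepts a vector of k and returns one column per k,
# with the k values as column names
parts <- cutree(fit, k = ks)

# Internal validation indices, computed separately for each solution
internal <- t(sapply(ks, function(k) {
  s <- cluster.stats(d, unname(parts[, as.character(k)]))
  c(k = k, avg.silwidth = s$avg.silwidth, dunn = s$dunn, ch = s$ch)
}))
print(internal)

# Pairwise agreement between solutions; compareonly = TRUE restricts
# the output to the corrected Rand index and the VI
pairs <- t(combn(ks, 2))
agreement <- data.frame(
  k1 = pairs[, 1],
  k2 = pairs[, 2],
  corrected.rand = apply(pairs, 1, function(p) {
    cluster.stats(d,
                  unname(parts[, as.character(p[1])]),
                  unname(parts[, as.character(p[2])]),
                  compareonly = TRUE)$corrected.rand
  })
)
print(agreement)
```

Larger values of the average silhouette width, the Dunn index and the Calinski-Harabasz index are usually read as pointing to a better-separated solution, while the corrected Rand index only says how similar two partitions are, not which one is better. With just six observations these indices should be interpreted with caution.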