Dear all,

I need to run a cluster analysis on a large data set (two or three variables for 6,000 cases). In the end I need to assign the source data to the resulting clusters, so I need to trace the sequence in which the data are merged. With that sequence I can fill a cluster with its data at any level of linkage distance. This procedure is implemented in the Statistica package, but that package cannot handle large amounts of data.

I attach an example for a small sample. The figure is the cluster tree, and the text file and the Excel file are the "Amalgamation Schedule" (joining matrix):

http://r.789695.n4.nabble.com/file/n4319741/Tree_Diagram_for_61_Cases.jpg
http://r.789695.n4.nabble.com/file/n4319741/Amalgamation_Schedule_(test).txt
http://r.789695.n4.nabble.com/file/n4319741/Amalgamation_Schedule.xls

My code:

    x <- read.table('test.csv', sep=',', header=TRUE)
    x <- x[-1]                           # drop the first (ID) column
    # "ward" is a linkage method for hclust(), not a distance for dist()
    d <- dist(x, method = "euclidean")
    hc <- hclust(d, method = "ward")
    plot(hc)

I am very sorry for my English. Thank you.

--
View this message in context: http://r.789695.n4.nabble.com/How-to-build-a-Amalgamation-Schedule-help-tp4319741p4319741.html
Sent from the R help mailing list archive at Nabble.com.
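The "fill the clusters with data" step described above can be sketched with base R's cutree(), which cuts an hclust tree at a chosen linkage distance and returns one cluster label per case. This is only a sketch under assumptions: the data are synthetic (61 random cases standing in for test.csv), the default "complete" linkage is used, and the cut height h_cut is an arbitrary choice for illustration.

```r
# Hypothetical stand-in for the poster's test.csv: 61 cases, 2 variables.
set.seed(1)
x <- data.frame(v1 = rnorm(61), v2 = rnorm(61))

d  <- dist(x, method = "euclidean")   # pairwise distances between cases
hc <- hclust(d, method = "complete")  # the linkage method is chosen here

# Cut the tree at a chosen linkage distance (assumed: the median merge height).
h_cut <- median(hc$height)
labels <- cutree(hc, h = h_cut)       # one cluster label per original case

# Attach the labels to the source data and group the rows by cluster.
x$cluster  <- labels
by_cluster <- split(x, x$cluster)
```

Each element of by_cluster is then the subset of the original rows that fall into one cluster at that linkage distance; changing h_cut (or using cutree(hc, k = ...) for a fixed number of clusters) gives the grouping at another level.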
womak
2012-Jan-25 23:35 UTC
[R] Clusters: How to build a "Amalgamation Schedule" (sequence of jointing )?
Below is the function hclust:

    hclust
    function (d, method = "complete", members = NULL)
    {
        METHODS <- c("ward", "single", "complete", "average", "mcquitty",
            "median", "centroid")
        method <- pmatch(method, METHODS)
        if (is.na(method))
            stop("invalid clustering method")
        if (method == -1)
            stop("ambiguous clustering method")
        n <- as.integer(attr(d, "Size"))
        if (is.null(n))
            stop("invalid dissimilarities")
        if (n < 2)
            stop("must have n >= 2 objects to cluster")
        len <- as.integer(n * (n - 1)/2)
        if (length(d) != len)
            (if (length(d) < len)
                stop
            else warning)("dissimilarities of improper length")
        if (is.null(members))
            members <- rep(1, n)
        else if (length(members) != n)
            stop("invalid length of members")
        hcl <- .Fortran(C_hclust, n = n, len = len, method = as.integer(method),
            ia = integer(n), ib = integer(n), crit = double(n),
            members = as.double(members), nn = integer(n), disnn = double(n),
            flag = logical(n), diss = as.double(d), PACKAGE = "stats")
        hcass <- .Fortran(C_hcass2, n = as.integer(n), ia = as.integer(hcl$ia),
            ib = as.integer(hcl$ib), order = integer(n), iia = integer(n),
            iib = integer(n), PACKAGE = "stats")
        tree <- list(merge = cbind(hcass$iia[1L:(n - 1)], hcass$iib[1L:(n - 1)]),
            height = hcl$crit[1L:(n - 1)], order = hcass$order,
            labels = attr(d, "Labels"), method = METHODS[method],
            call = match.call(), dist.method = attr(d, "method"))
        class(tree) <- "hclust"
        tree
    }

Since the function contains no loops, it seems to me (as an amateur) that all the work of building the cluster tree is done in these two calls:

    hcl <- .Fortran(C_hclust, n = n, len = len, method = as.integer(method),
        ia = integer(n), ib = integer(n), crit = double(n),
        members = as.double(members), nn = integer(n), disnn = double(n),
        flag = logical(n), diss = as.double(d), PACKAGE = "stats")
    hcass <- .Fortran(C_hcass2, n = as.integer(n), ia = as.integer(hcl$ia),
        ib = as.integer(hcl$ib), order = integer(n), iia = integer(n),
        iib = integer(n), PACKAGE = "stats")

Where can I see the procedures C_hclust and C_hcass2? Or are they present in R only in compiled form? If it is possible to see them, then there is hope of saving the cluster merges to a file and getting the "Amalgamation Schedule" (sequence of joining).
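The object returned by hclust() already carries the merge sequence in its merge and height components, so a schedule can be assembled without touching the compiled code. The following is a minimal sketch under assumptions: the data are synthetic (10 random cases, not the poster's file), the column names and the output file name are invented for illustration, and the layout is not claimed to match Statistica's table.

```r
# Hypothetical stand-in data: 10 cases, 2 variables.
set.seed(42)
x  <- matrix(rnorm(20), ncol = 2)
hc <- hclust(dist(x))   # default "complete" linkage

# One row per merge step. In hc$merge a negative entry -i means original
# case i; a positive entry k means the cluster formed at step k.
schedule <- data.frame(step   = seq_len(nrow(hc$merge)),
                       item1  = hc$merge[, 1],
                       item2  = hc$merge[, 2],
                       height = hc$height)

# Expand every step into the original cases contained in the new cluster.
members <- vector("list", nrow(hc$merge))
for (k in seq_len(nrow(hc$merge))) {
  parts <- lapply(hc$merge[k, ], function(j) if (j < 0) -j else members[[j]])
  members[[k]] <- sort(unlist(parts))
}
schedule$cases <- vapply(members, paste, character(1), collapse = " ")

# Save the schedule; the file name and location are arbitrary.
out_file <- file.path(tempdir(), "amalgamation_schedule.csv")
write.csv(schedule, out_file, row.names = FALSE)
```

Reading the rows of schedule from top to bottom gives the order of joining, the linkage distance of each join, and the cases in the cluster it creates. The last step always contains every case.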
David Winsemius
2012-Jan-26 02:58 UTC
[R] Clusters: How to build a "Amalgamation Schedule" (sequence of jointing )?
On Jan 25, 2012, at 6:35 PM, womak wrote:

> Below is the function *hclust*
> hclust
> function (d, method = "complete", members = NULL)
> {

snipped the function body. (That was really very unnecessary, womak.)

> If a function has no cycles, then, in my view amateur, all the
> action on the
> formation of the cluster tree are made in:
>
> /*hcl */<- *.Fortran*(C_hclust, n = n, len = len, method
>

Right.

http://cran.r-project.org/
http://cran.r-project.org/src/base/R-2/R-2.14.1.tar.gz
http://cran.r-project.org/doc/manuals/R-lang.html

> where can I see some procedures /*C_hclust*/ ? */C_hcass2/* or they
> are
> present in the *R* already compiled? If it possible , then there is
> hope to
> save the merger of clusters in the file and get the "Amalgamation
> Schedule"
> (sequence of jointing).

I have no idea what you are asking here. In all too typical posting behavior for Nabble users, you have not included any context. Most of us are reading this with our mail clients, NOT through the nabble website.

--
David Winsemius, MD
West Hartford, CT