Dear all,
I need to run a cluster analysis on a large data set (two or three variables
for 6,000 cases). In the end I need to assign the source data to the
resulting clusters, so I need to trace the sequence in which the data are
merged. With that sequence I can fill any cluster (at any level of linkage
distance) with its data.
This procedure is implemented in the Statistica package, but that package
cannot handle large amounts of data.
I attach an example for a small sample. The figure is the cluster tree, and
the text file and the Excel file contain the "Amalgamation Schedule"
(joining matrix):
http://r.789695.n4.nabble.com/file/n4319741/Tree_Diagram_for_61_Cases.jpg
http://r.789695.n4.nabble.com/file/n4319741/Amalgamation_Schedule_(test).txt
http://r.789695.n4.nabble.com/file/n4319741/Amalgamation_Schedule.xls
My code:
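x <- read.table('test.csv', sep = ',', header = TRUE)
x <- x[-1]  # drop the first column
# "ward" is a clustering method for hclust(), not a distance measure for dist()
d <- dist(x, method = "euclidean", diag = FALSE, upper = FALSE)
hc <- hclust(d, method = "ward")  # named "ward.D" in R >= 3.1.0
plot(hc)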
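To fill the clusters with the source data at a chosen linkage distance, one option is cutree() from the stats package. The following is only a minimal sketch under stated assumptions: the data frame is randomly generated as a hypothetical stand-in for test.csv, the "complete" method is used for illustration, and the cut height of 3 is an arbitrary value, not one taken from the original data.

```r
# Hypothetical stand-in for the poster's test.csv: 61 cases, 3 variables.
set.seed(1)
x <- data.frame(v1 = rnorm(61), v2 = rnorm(61), v3 = rnorm(61))

d  <- dist(x, method = "euclidean")
hc <- hclust(d, method = "complete")

# Cut the tree at a chosen linkage distance (arbitrary illustrative value).
cut_height <- 3
cluster_id <- cutree(hc, h = cut_height)

# Attach the cluster label to every source row, then group rows by cluster.
x$cluster  <- cluster_id
by_cluster <- split(x, x$cluster)
```

Each element of by_cluster is then a data frame holding the source rows of one cluster; changing cut_height gives the grouping at a different level of the tree.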
My apologies for my English.
Thank you.
--
View this message in context:
http://r.789695.n4.nabble.com/How-to-build-a-Amalgamation-Schedule-help-tp4319741p4319741.html
Sent from the R help mailing list archive at Nabble.com.
womak
2012-Jan-25 23:35 UTC
[R] Clusters: How to build a "Amalgamation Schedule" (sequence of jointing )?
Below is the function *hclust*
hclust
function (d, method = "complete", members = NULL)
{
    METHODS <- c("ward", "single", "complete", "average", "mcquitty",
        "median", "centroid")
    method <- pmatch(method, METHODS)
    if (is.na(method))
        stop("invalid clustering method")
    if (method == -1)
        stop("ambiguous clustering method")
    n <- as.integer(attr(d, "Size"))
    if (is.null(n))
        stop("invalid dissimilarities")
    if (n < 2)
        stop("must have n >= 2 objects to cluster")
    len <- as.integer(n * (n - 1)/2)
    if (length(d) != len)
        (if (length(d) < len)
            stop
        else warning)("dissimilarities of improper length")
    if (is.null(members))
        members <- rep(1, n)
    else if (length(members) != n)
        stop("invalid length of members")
    hcl <- .Fortran(C_hclust, n = n, len = len, method = as.integer(method),
        ia = integer(n), ib = integer(n), crit = double(n),
        members = as.double(members), nn = integer(n), disnn = double(n),
        flag = logical(n), diss = as.double(d), PACKAGE = "stats")
    hcass <- .Fortran(C_hcass2, n = as.integer(n), ia = as.integer(hcl$ia),
        ib = as.integer(hcl$ib), order = integer(n), iia = integer(n),
        iib = integer(n), PACKAGE = "stats")
    tree <- list(merge = cbind(hcass$iia[1L:(n - 1)], hcass$iib[1L:(n - 1)]),
        height = hcl$crit[1L:(n - 1)], order = hcass$order,
        labels = attr(d, "Labels"), method = METHODS[method],
        call = match.call(), dist.method = attr(d, "method"))
    class(tree) <- "hclust"
    tree
}
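The list returned by the function above already carries the merge sequence: row i of merge describes step i, and height holds the corresponding linkage distance. In the merge matrix a negative entry -j denotes the single observation j, and a positive entry k denotes the cluster formed at the earlier step k. A rough sketch of turning this into a readable schedule follows; the helper name amalgamation_schedule, the "cluster@step" label format and the toy data are all hypothetical choices made for illustration.

```r
# Build a readable schedule from an hclust object (uses hc$merge, hc$height).
amalgamation_schedule <- function(hc) {
  labs <- hc$labels
  if (is.null(labs)) labs <- as.character(seq_along(hc$order))
  describe <- function(i) {
    # Negative entry: a single observation; positive: cluster from step i.
    if (i < 0) labs[-i] else paste0("cluster@step", i)
  }
  data.frame(
    step   = seq_len(nrow(hc$merge)),
    height = hc$height,
    item1  = vapply(hc$merge[, 1], describe, character(1)),
    item2  = vapply(hc$merge[, 2], describe, character(1)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical toy data: five labelled points on a line.
pts <- matrix(c(0, 1, 5, 7, 20), ncol = 1,
              dimnames = list(c("a", "b", "c", "d", "e"), NULL))
hc  <- hclust(dist(pts), method = "single")
schedule <- amalgamation_schedule(hc)
print(schedule)
```

The resulting data frame has one row per merge step, so for n cases it has n - 1 rows.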
If the function contains no loops, then, in my amateur view, all the work of
building the cluster tree is done in:
hcl <- .Fortran(C_hclust, n = n, len = len, method = as.integer(method),
    ia = integer(n), ib = integer(n), crit = double(n),
    members = as.double(members), nn = integer(n), disnn = double(n),
    flag = logical(n), diss = as.double(d), PACKAGE = "stats")
hcass <- .Fortran(C_hcass2, n = as.integer(n), ia = as.integer(hcl$ia),
    ib = as.integer(hcl$ib), order = integer(n), iia = integer(n),
    iib = integer(n), PACKAGE = "stats")
Where can I see the procedures C_hclust and C_hcass2? Or are they present in
R only in compiled form? If it is possible to see them, then there is hope of
saving the sequence of cluster merges to a file and obtaining the
"Amalgamation Schedule" (sequence of joining).
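Saving the sequence of merges to a file may not require the compiled routines at all, since the merge and height components returned by hclust can be walked in plain R. The sketch below is only an illustration under stated assumptions: the helper name cluster_members, the random data, the "average" method and the temporary output file are hypothetical. It lists which original cases belong to the cluster created at each step and writes the result as a tab-separated text file.

```r
# List the original cases contained in the cluster created at each merge step.
cluster_members <- function(hc) {
  n_steps <- nrow(hc$merge)
  members <- vector("list", n_steps)
  for (s in seq_len(n_steps)) {
    parts <- lapply(hc$merge[s, ], function(i) {
      # Negative entry: one case; positive: all cases of an earlier cluster.
      if (i < 0) -i else members[[i]]
    })
    members[[s]] <- sort(unlist(parts))
  }
  members
}

set.seed(42)
x  <- matrix(rnorm(40), ncol = 2)   # hypothetical data: 20 cases, 2 variables
hc <- hclust(dist(x), method = "average")
members <- cluster_members(hc)

schedule <- data.frame(
  step   = seq_along(members),
  height = hc$height,
  size   = lengths(members),
  cases  = vapply(members, paste, character(1), collapse = " ")
)

out_file <- tempfile(fileext = ".txt")   # hypothetical output location
write.table(schedule, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The last row of the schedule always contains every case, because the final merge joins the two remaining clusters into one.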
David Winsemius
2012-Jan-26 02:58 UTC
[R] Clusters: How to build a "Amalgamation Schedule" (sequence of jointing )?
On Jan 25, 2012, at 6:35 PM, womak wrote:

> Below is the function *hclust*
> hclust
> function (d, method = "complete", members = NULL)
> {

snipped the function body. (That was really very unnecessary, womack.)

> If a function has no cycles, then, in my view amateur, all the
> action on the
> formation of the cluster tree are made in:
>
> /*hcl */<- *.Fortran*(C_hclust, n = n, len = len, method
>

Right.

http://cran.r-project.org/
http://cran.r-project.org/src/base/R-2/R-2.14.1.tar.gz
http://cran.r-project.org/doc/manuals/R-lang.html

>
> where can I see some procedures /*C_hclust*/ ? */C_hcass2/* or they
> are
> present in the *R* already compiled? If it possible , then there is
> hope to
> save the merger of clusters in the file and get the "Amalgamation
> Schedule"
> (sequence of jointing).

I have no idea what you are asking here. In all too typical posting
behavior for Nabble users, you have not included any context. Most of us are
reading this with our mail clients, NOT through the nabble website.

--
David Winsemius, MD
West Hartford, CT