Displaying 20 results from an estimated 36 matches for "agglomerative".

2005 Jan 25
4
agglomerative coefficient in agnes (cluster)
I haven't read the book, but could anyone explain more about this parameter? help(agnes) says that ac measures the amount of clustering structure found. From the definition given in help(agnes.object), however, it seems that as long as the dissimilarity of the merger in the final step of the algorithm is large enough, the ac value will be close to 1. So what does ac really mean? Thank
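For reference, the definition the post cites from help(agnes.object) can be written out as below. This is a sketch in notation that is not in the original post: n is the number of observations, d_i is the dissimilarity of observation i to the first cluster it is merged with, and d_final is the dissimilarity of the merger in the final step.

```latex
m(i) = \frac{d_i}{d_{\mathrm{final}}}, \qquad
\mathrm{AC} = \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - m(i)\bigr)
```

Under this reading, a d_final that is large relative to the typical d_i pushes every m(i) toward 0 and AC toward 1, which is the behaviour the post describes.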
2016 Mar 09
3
Introduction and Doubts
...ferent is different from what is taught in theory. I am also working on R&D on "Hybrid Techniques for Intrusion Detection using Data Mining and Clustering on Newer Datasets". Taking an initial look at the docsim folder in xapian-core, these are my insights: the clustering used is single-link agglomerative hierarchical clustering. Its time complexity is O(n^2) for n = number of documents. At first, choosing K-means seems to be a viable solution, as K-means has O(n) time complexity. But it has various shortcomings: 1) The learning algorithm requires a priori specification of the number of cluster centers. 2)D...
2008 Feb 11
1
Dendrogram for agglomerative hierarchical clustering result
Hey group, I have a problem drawing a dendrogram from the result of my program written in C. My algorithm is an approximation algorithm for the single linkage method. As a result I will get the following data: [Average distance] [cluster A] [cluster B]. For example: 42.593141 1 26; 42.593141 4 6; 42.593141 123 124; 42.593141 4 113; 74.244206 1 123; 74.244206 4 133; 74.244206 1 36. So far I have used C to
2011 Jun 09
1
k-nn hierarchical clustering
Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour...
2001 Aug 21
1
difference between trees in R?
Hi. I am wondering if anybody has studied and/or written code in R to calculate the distance between 2 "trees". For example, if one does a hierarchical agglomerative clustering and, say, a hierarchical divisive clustering (represented as trees) and wishes to compute a metric on them. I am thinking of something like the symmetric difference as mentioned in Margush and McMorris (1982). My application is actually a bit different than that above so I'll describ...
2016 Mar 10
2
Introduction and Doubts
...o xapian project. Sorry if that was against the rules. The algorithm was not developed by me; I found it after much research on various clustering techniques. There is a new algorithm called CLUBS (Clustering Using Binary Splitting) which gives better results than kmeans++ and hierarchical agglomerative clustering. It is faster and produces good results based on various metrics of cluster quality. The algorithm works in the following way: the first phase of the algorithm is divisive, as the original data set (in this case, the set of search documents to cluster) is split recursively into miniclusters thro...
2003 Dec 03
3
non-uniqueness in cluster analysis
Hi, I'm clustering objects defined by categorical variables with a hierarchical algorithm (average linkage). My distance matrix (general dissimilarity coefficient) includes several distances with exactly the same value. As far as I can see, a standard agglomerative procedure ignores this problem, simply selecting, among equal distances, the one that comes first. For this reason the output of the analysis depends strongly on the ordering of the objects within the raw data matrix. Is there a standard procedure to deal with this? Thanks Bruno
2009 Mar 12
2
Time-Ordered Clustering
...orms constraint-based clusters? Ideally the package could perform "Time-Ordered Clustering", a technique applied in a recent journal article by Runger, Nelson, Harnish (using MS Excel). Quote, "in our specific implementation of constrained clustering, the clustering algorithm remains agglomerative and hierarchical, but observations or clusters are constrained to only join if they are adjacent in time." CRAN searches using variants of "cluster" and/or "constraint" and/or "time" etc. didn't yield anything I could recognize. Thank you, Paul Paul Prew E...
2006 Feb 28
1
creating dendrogram from cluster hierarchy
Dear R users, I have created data for hierarchical agglomerative cluster analysis which consist of the merging pairs and the agglomeration heights, e.g. something like my.merge <- matrix(c(-1,-2,-3,1), ncol=2, byrow=TRUE) my.height <- c(0.5, 1) I'd like to plot a corresponding dendrogram but I don't know how to convert my data to achieve this....
2007 Jul 23
1
Cluster prediction from factor/numeric datasets
Hi all, I have a dataset with numeric and factor columns of data which I developed a Gower Dissimilarity Matrix for (Daisy) and used Agglomerative Nesting (Agnes) to develop 20 clusters. I would like to use the 20 clusters to determine cluster membership for a new dataset (using predict) but cannot find a way to do this (no way to "predict" in the cluster package). I know I can use "predict" in cclust, kcca, and flexclus...
2008 Mar 20
2
How to plot the dendrogram or tree for kmeans ?
Hi, How to plot the dendrogram or tree for kmeans, like we do for hclust ?
2011 May 11
2
hierarchical clustering within a size limit
Hello List, I am trying to implement hierarchical clustering using hclust with the agglomerative single linkage method, with a small wrinkle. I would like to cluster a set of numbers on a number line only if they are within a distance of 500. I would then like to print out the members of this list. So far I can put a vector: > x<-c(2,10,200,300,600,700) into a distance matrix: > dist(...
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs. I would like to propose how I plan to go about improving and getting a system that can be integrated into Xapian in this GSoC for the clustering branch. I have identified three areas of work which were not touched last time. 1) Automated Performance Analysis I had roughly implemented 2 evaluation techniques previously (Distance b/w document and centroids within clusters and
2007 Oct 16
0
doubts about Silhouette
...to try to explain myself. I have fitted a spline to my data and filled in the missing data by replicating the spline coefficients associated with the last node. I obtained a number of dendrograms from different combinations of distance and link method by calling DIST and AGNES. The agglomerative coefficient is very high (~ 0.99) for some combinations, and is generally around 0.5 for the remaining cases. As recommended, I ran SILHOUETTE at different cuts (CUTREE) for some of the cases. Regardless of the AC value, the highest silhouette width I get is ~ 0.4 or lower, which is too low...
2004 Feb 04
1
Clustering with 'agnes'
...was wondering if anyone knew how I can identify cluster points after running the agnes function. For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points and I get the ordering and height and the agglomerative coefficient. But nowhere do I see the three actual points listed. Although agnes clusters until there is one main cluster, it is clear that at three clusters, each of the clusters consists of points around the three main points. I was wondering if there was any way in which I can have R give me the...
2005 Nov 02
1
x/y coordinates of dendrogram branches
Dear R-users, I need some help concerning the plotting of dendrograms for hierarchical agglomerative clustering. The agglomeration level of each step should be displayed at the branches of the dendrogram. For this I need the x/y coordinates of the branch-agglomerations of the dendrogram. The y-values are known (the heights of the agglomeration), but how can I get the x-values? > mydata &l...
2011 Jan 27
3
agnes clustering and NAs
Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : > m <- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) > agnes(m) Call: agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. Median Mean 3rd Qu. Max. 1.155 1.247 1.339 1.339 1.431 1.524 Available components: [1] "order" "height" "ac" "merge" "diss" "ca...
2008 Dec 22
3
Error: cannot allocate vector of size 1.8 Gb
> dim(data) [1] 22283 19 > dm=dist(data, method = "euclidean", diag = FALSE, upper = FALSE, p = 2) Error: cannot allocate vector of size 1.8 Gb Hi Guys, thank you in advance for helping. :-D Recently I ran into the "cannot allocate vector of size 1.8GB" error. I am pretty sure this is not a hardware limitation because it happens no matter I ran the R code in a
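The size in the error message matches what a full set of pairwise distances between 22283 rows would need. A rough check, assuming dist() stores the n(n-1)/2 lower-triangle distances as 8-byte doubles and that R reports the size in units of 2^30 bytes:

```latex
\begin{aligned}
\frac{n(n-1)}{2} &= \frac{22283 \times 22282}{2} = 248{,}254{,}903 \ \text{distances} \\
248{,}254{,}903 \times 8 \ \text{bytes} &= 1{,}986{,}039{,}224 \ \text{bytes} \approx 1.85 \times 2^{30} \ \text{bytes}
\end{aligned}
```

So the allocation that fails appears to be the distance vector itself, whose size grows quadratically with the number of rows.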
2002 Apr 29
2
cluster analyses
...ather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the row names as labels, but I can't cut the tree. Alternatively, I use hclust() on the dissimilarity matrix from daisy(). This allows me to cut the dendrogram with cutree(), but I lose the labels, so that isn't much use. I can...
2007 Nov 14
0
Question about AGNES by Rousseeuw et al. in the package "cluster": How many clusters?
...I have difficulties understanding the output from AGNES. My question is: how to interpret the output, especially how do I know which cluster solution is the best? In SPSS, an Agglomeration Schedule table is produced and I used to look at the biggest jump between the error coefficients for each agglomerative step in order to find where to stop clustering. But with the Agnes output I don't know what I should be looking at. Thanks so much for your help, Aude Aude Plontz Refugee Health Research Centre, Melbourne, Australia Phone: at home: +61 (0)3 9917 2134 at La Trobe University, Bundoo...