Displaying 9 results from an estimated 9 matches for "calinski".

2002 Feb 20
2
Clustering and Calinski's index
I have to solve a clustering problem. My first step is to determine the number of clusters, which is why I'm using the Calinski index ( [tr(b)/(k-1)]/[tr(w)/(n-k)] ), which I try to maximize to find the best number of clusters. A function is already implemented in R to calculate this index: clustIndex(cl,x, index="calinski"), where cl is the result of a clustering method, for instance: cclust(x,k,itermax,ver...
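The index mentioned in this snippet can be illustrated with a minimal sketch. It assumes the standard Calinski-Harabasz definition, [tr(B)/(k-1)]/[tr(W)/(n-k)], where tr(B) is the between-cluster sum of squares, tr(W) the within-cluster sum of squares, n the number of points and k the number of clusters. The function name `calinski_harabasz` and the toy data are hypothetical; this is not the `clustIndex` implementation from the thread.

```python
def calinski_harabasz(points, labels):
    """Return [tr(B)/(k-1)] / [tr(W)/(n-k)] for points with cluster labels."""
    n = len(points)
    dim = len(points[0])

    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index is only defined for 2 <= k < n")

    # Grand mean of all points, one value per coordinate.
    grand = [sum(p[j] for p in points) / n for j in range(dim)]

    tr_b = 0.0  # between-cluster sum of squares
    tr_w = 0.0  # within-cluster sum of squares
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[j] for p in members) / size for j in range(dim)]
        tr_b += size * sum((c - g) ** 2 for c, g in zip(centroid, grand))
        for p in members:
            tr_w += sum((x - c) ** 2 for x, c in zip(p, centroid))

    if tr_w == 0.0:
        return float("inf")  # every point sits exactly on its centroid
    return (tr_b / (k - 1)) / (tr_w / (n - k))


# Two well-separated pairs of points: a good labelling scores far higher.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(calinski_harabasz(pts, [0, 0, 1, 1]))  # → 200.0
```

To choose the number of clusters, one would compute this value for the clustering obtained at each candidate k and keep the k with the largest index.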
2008 Apr 13
0
Calinski and Harabasz Index for Cluster Determination with Diana in R
Hello all, I have a set of data points for which I have pairwise distances. I managed to create a dendrogram for this data set using diana() in R; however, this only gives me the tree and not the clusters themselves. I am trying to determine clusters using the Calinski and Harabasz Index (CH Index). However, I cannot find how to accomplish this in R. Is there anyone who could help me with this? I
2004 Mar 09
1
Package cclust error
...wrote: > > I have to solve a clustering problem. > My first step is to determine the number of clusters, that's why I'm using ...snip... >... number of clusters. > A function is already implemented in R to calculate this index : > > clustIndex(cl,x, index="calinski") Where is that from? It's not part of R -- package cclust, perhaps? > where cl is the result of a clustering method , for instance: > > cclust(x,k,itermax,verbose=TRUE,method="kmeans") > > My problem is that I can't calculate the Calinski's index whe...
2009 Jun 26
1
50993 point distance matrix, too big to as.matrix, looking for another way to calculate point-level summary
...abun.dist <- dist(abun.mat[1:50993,1:235]) test <- rowMeans(as.matrix(abun.dist)) Error in matrix(0, size, size) : too many elements specified I've been able to run an hclust() clustering procedure, because hclust() makes a call to Fortran code, but I'd like to be able to generate a Calinski index for each of the clusters to assess their validity. Unfortunately, all the validation routines I have found are native R code and usually call as.matrix, resulting in the same error I receive above. What I'd like to figure out is how to just go through, one point at a time, and calculat...
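The "one point at a time" idea in this snippet can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and assumes the goal is to reproduce the row means of the full n-by-n distance matrix (zero diagonal included) without ever storing that matrix. The function name `distance_row_means` is hypothetical and the sketch is not taken from the thread.

```python
import math


def distance_row_means(rows):
    """Mean distance from each row to all rows, using O(n) extra memory.

    Matches the row means of the full symmetric distance matrix, whose
    diagonal is zero, so each sum is divided by n rather than n - 1.
    """
    n = len(rows)
    means = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                # Distances are recomputed on the fly and never stored.
                total += math.dist(rows[i], rows[j])
        means.append(total / n)
    return means


# Three collinear points spaced 5 units apart.
sample = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(distance_row_means(sample))
```

The trade-off is time for memory: the loop still performs on the order of n squared distance evaluations, but only one running total is held at a time instead of the n-by-n matrix that as.matrix would allocate.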
2005 Sep 21
0
Problem with SAGx Library
Dear All: I am a newbie to R and as such I am posting this request for help. I am trying to use R to compute the "Calinski Harabasz (CH) Index". The CH Index is available in the library SAGx. The version of R I am running is 2.2.1. I have my data in CSV format, which I read into R using the read.table() command. After the data has been read I am loading in the "SAGx" and "MASS" librarie...
2010 Apr 05
0
Agnes in Cluster Package and index.G1 in the clusterSim package questions
...d Harabasz (1974) stopping rule recommended in the Monte Carlo simulation by Milligan and Cooper (1985). 2. How do I obtain the clustering rules? (I.e., how do I assign my observations to the clusters from my final solution?) I was able to find the package ‘clusterSim’, which calculates the Calinski-Harabasz pseudo F-statistic, but I am having some difficulty substituting in the cl argument (a vector of integers indicating the cluster to which each object is allocated) in the formula: index.G1(x,cl,d=NULL,centrotypes="centroids") Thanks in advance for any help, Sincerely, Panc...
2012 Apr 15
2
Cluster Analysis
Hi, I was wondering what the best equivalent to SAS's FASTCLUS and PROC CLUSTER would be. I need to be able to test the significance of the clusters by comparing the probability of obtaining an equal or greater pseudo F to the Bonferroni-corrected level. I will also need to plot R-squared against the number of clusters. Thanks so much, Taisa
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs. I would like to propose how I plan to improve the clustering branch during this GSoC and arrive at a system that can be integrated into Xapian. I have identified three areas of work which were not touched last time. 1) Automated Performance Analysis I had roughly implemented 2 evaluation techniques previously (Distance between document and centroids within clusters and
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all, I am using the clustering functions in R to work with large masses of binary time series data; however, the clustering functions do not seem able to handle a practical problem of this size. Library 'hclust' is good (though it may be subpar for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of