thr3ads.net - similar to: "cluster validation"

2006 May 02

Cluster validation methods

Hi All, Apart from the "Rand Index", "Dunn Index" and "Silhouette width", are there other cluster validation methods in R? Could you please also specify the function? Thanks!

plot and validation in clustering

2006 Mar 20

Hi there, I use the functions "kmeans" and "clara" to cluster one flow cytometry dataset. Using the function "plot", the clusters obtained from "clara" can be graphed, but those from "kmeans" cannot. How can I plot the clusters from "kmeans"? And, I hope to compare the two methods "kmeans" and "clara", or in other words, I
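One possible way to graph a "kmeans" result is to colour a scatter plot by the cluster assignment. The following is only a minimal sketch on synthetic data (the matrix x and the choice of two clusters are assumptions, not the poster's data); it projects the observations onto the first two principal components so that data with more than two columns can still be drawn.

```r
set.seed(42)

# Synthetic stand-in for the real measurements (assumption): 100 rows, 4 columns
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 4),
           matrix(rnorm(200, mean = 3), ncol = 4))

# k-means with several random starts
km <- kmeans(x, centers = 2, nstart = 10)

# Project onto the first two principal components for plotting
pc <- prcomp(x)$x[, 1:2]

# Colour each point by its k-means cluster
plot(pc, col = km$cluster, pch = 19,
     main = "kmeans clusters (first two principal components)")
```

The same colouring works directly with plot(x[, 1:2], col = km$cluster) when only two variables are of interest.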

Cluster validation statistics in fpc

2004 Apr 26

Hi, this is to announce a new version (1.1-2) of my package fpc. Apart from the stuff already present in the older version (methods for fixed point clustering and clusterwise regression, somewhat bug-cleaned and with faster examples) there is now a function cluster.stats, which computes some distance-based statistics often used for cluster validation, description and decision about the number of

Clustering and Rand Index

2006 Jan 07

Dear WizaRds, I am trying to compute the (adjusted) Rand Index in order to comprehend the variable selection heuristic (VS-KM) according to Brusco/ Cradit 2001 (Psychometrika 66 No.2 p.249-270, 2001). Unfortunately, I am unable to correctly use cl_ensemble and cl_agreement (package: clue). Here is what I am trying to do: library(clue) ## Let p1..p4 be four partitions of the kind

clustering

2006 Feb 27

Hi there, Sorry for the double email. Does R have packages for the following clustering methods? And if it does, what are the commands for them? 1. SOM (self-organizing map) 2. Graph partitioning 3. Neural network 4. Probability Binning Thank you very much! Linda

cluster/distance large matrix (fwd)

2010 Feb 11

On Thu, 11 Feb 2010, Christian Hennig wrote: > It is well known that hierarchical methods are problematic with too large > dissimilarity matrices; even if you resolve the memory problem, the number of > operations required is enormous. There is at least one exception to this. Single-linkage hierarchical clustering with a convex distance such as Euclidean distance is feasible for quite

about pam

2006 Mar 16

Hi there, In the description of the command "pam", it mentions "For datasets larger than (say) 200 observations". Now my dataset is a "54732 by 5" data frame named "test". When I try to run pam(test, 4), it shows "cannot allocate vector of length 1497768547". Is it because there are too many rows for it to handle? Thank you!
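The length in that error message can be checked with a little arithmetic: a full dissimilarity structure over n observations holds n(n-1)/2 pairwise distances. The sketch below does the calculation for n = 54732 and then, as one possible workaround, runs "clara" from the cluster package, which works on subsamples instead of the full dissimilarity matrix. The data here are synthetic (an assumption), and the samples value is an arbitrary illustration.

```r
library(cluster)  # provides pam() and clara()

n <- 54732

# Number of pairwise dissimilarities among n observations
n_pairs <- n * (n - 1) / 2
n_pairs                 # 1497768546, one less than the length in the error message

# Approximate memory for that many doubles, in GiB
n_pairs * 8 / 2^30      # roughly 11.2

# Synthetic stand-in for the 54732 x 5 data frame "test" (assumption)
set.seed(1)
x <- matrix(rnorm(n * 5), ncol = 5)

# clara() clusters subsamples, so it never builds the full dissimilarity matrix
fit <- clara(x, k = 4, samples = 20)
table(fit$clustering)
```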

Survey - Cluster Sampling

2005 Jun 16

Dear WizaRds, I am struggling to compute correctly a cluster sampling design. I want to do one stage clustering with different parametric changes: Let M be the total number of clusters in the population, and m the number sampled. Let N be the total of elements in the population and n the number sampled. y are the values sampled. This is my example data: clus1 <-

clusterwise regression from fpc (fixed point clustering) package

2008 Jan 14

Hi there, whenever I try the clusterwise regression from the fpc package, the following problem occurs: the first cluster is always formed in such a way that when I run a normal linear regression on the independent variables to describe the dependent variable (only on those respondents from the first cluster), the regression uses only one independent variable that describes the whole

Partition data into clusters

2008 Mar 18

Greetings R-users, I have been using the fpc package in R to cluster my data. Specifically I am using kmeansruns clustering. I would like to know how I use R to partition data into clusters. What I am doing is as follows. # Use csv file as input ##################### wholeset = read.csv("Spellman800genesImputed.csv") # exclude first col (gene names) ##########################

reference paper about SOM

2006 Apr 01

Hi All, I'm looking for a reference paper about the SOM (self-organizing map) algorithm. I tried the paper which is mentioned in the help page of function "som (package:som)": http://www.cis.hut.fi/research/papers/som_tr96.ps.Z But I can't open it for some reason. Could you please help me with it? Thanks a lot!

use "factor" for categorical covariate in Cox PH model

2006 Jul 17

Hi All, I'm learning the R code for Cox PH modeling. Could I ask what "factor" does in modeling? Thank you! When dealing with categorical covariates (for example 3 groups), the results differ depending on whether or not we put "factor" in front of the categorical covariate: if we don't add "factor", there is only one
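The difference can be seen by counting coefficients. The sketch below uses synthetic survival data (an assumption, not the poster's data) with a covariate coded 1, 2, 3: without "factor" the code is treated as a single numeric variable, while with "factor" each non-reference group gets its own coefficient.

```r
library(survival)  # provides Surv() and coxph()

set.seed(1)

# Synthetic data (assumption): 3 groups coded as the numbers 1, 2, 3
d <- data.frame(
  time   = rexp(90),
  status = rbinom(90, 1, 0.8),
  group  = rep(1:3, each = 30)
)

# Numeric coding: one slope, i.e. a linear trend across the codes 1, 2, 3
fit_num <- coxph(Surv(time, status) ~ group, data = d)

# Factor coding: separate log hazard ratios for groups 2 and 3 versus group 1
fit_fac <- coxph(Surv(time, status) ~ factor(group), data = d)

length(coef(fit_num))   # 1
length(coef(fit_fac))   # 2
names(coef(fit_fac))
```

The numeric version assumes the step from group 1 to 2 has the same effect as the step from 2 to 3, which is rarely what is meant for unordered categories.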

cluster.stats

2008 Jun 13

Dear list, I just tried to use the function cluster.stats in the package fpc. I just have a couple of questions about the syntax: cluster.stats(d,clustering,alt.clustering=NULL, silhouette=TRUE,G2=FALSE,G3=FALSE) 1) the distance object (d) is an object obtained by the function dist() on my own original matrix? 2) clustering is the clusters vector as result of one of the many clustering methods?
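A minimal sketch of how those arguments are typically supplied, assuming the fpc package is installed: d comes from dist() on the data matrix, clustering is an integer vector of cluster labels, and alt.clustering is an optional second labelling to compare against. The USArrests data and the choice of 3 clusters are only illustrative assumptions.

```r
library(fpc)  # provides cluster.stats()

set.seed(1)
x <- scale(USArrests)            # built-in example data, standardised

d  <- dist(x)                    # distance object from the original matrix
km <- kmeans(x, centers = 3, nstart = 10)
hc <- cutree(hclust(d, method = "average"), k = 3)

# clustering: integer label vector; alt.clustering: a second labelling
st <- cluster.stats(d, clustering = km$cluster, alt.clustering = hc)

st$avg.silwidth     # average silhouette width
st$dunn             # Dunn index
st$corrected.rand   # adjusted Rand index between the two clusterings
```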

"partitioning cluster function"

2006 Apr 05

Hi All, For the function "bclust" (e1071), the argument "base.method" is explained as "must be the name of a partitioning cluster function returning a list with the same components as the return value of 'kmeans'". In my understanding, there are three partitioning cluster functions in R, which are "clara, pam, fanny". Then I check each of them to

bug (?!) in "pam()" clustering from fpc package ?

2008 Dec 17

Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps starting the algorithm from the same starting points (medoids). For my

speed of the cluster.stats function

2005 Jan 03

Hello list (happy new yeaR), Here's a copy of a message I just sent to Christian Hennig (who wrote the fpc package). That may interest some of you, and maybe someone could have a better solution than mine. Romain. Mister Hennig, [[[ I'm writing in English because I don't know German

Clustering and Rand Index - VS-KM

2006 Jan 08

Dear WizaRds, I have been trying to compute the adjusted Rand index as by Hubert/ Arabie, and could not correctly approach how to define a partition object as in my last request yesterday. With package fpc I try to work around the problem, using my original data: mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=T )
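For reference, the adjusted Rand index of Hubert and Arabie can also be computed directly from the contingency table of two label vectors, without a partition object. The function below is a minimal base-R sketch (the name adjusted_rand and the toy partitions are hypothetical, not taken from the post); it is undefined when both partitions consist of a single cluster, because the denominator is then zero.

```r
# Adjusted Rand index from two label vectors of equal length
adjusted_rand <- function(a, b) {
  tab <- table(a, b)                       # contingency table n_ij
  n   <- sum(tab)

  sum_ij <- sum(choose(tab, 2))            # pairs together in both partitions
  sum_a  <- sum(choose(rowSums(tab), 2))   # pairs together in partition a
  sum_b  <- sum(choose(colSums(tab), 2))   # pairs together in partition b

  expected  <- sum_a * sum_b / choose(n, 2)  # expected index under chance
  max_index <- (sum_a + sum_b) / 2           # maximum attainable index

  (sum_ij - expected) / (max_index - expected)
}

# Toy partitions (assumption): p2 is p1 with the labels swapped
p1 <- c(1, 1, 1, 2, 2, 2)
p2 <- c(2, 2, 2, 1, 1, 1)
p3 <- c(1, 2, 1, 2, 1, 2)

adjusted_rand(p1, p2)   # 1: identical partitions up to relabelling
adjusted_rand(p1, p3)   # below 1: the partitions disagree
```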

selecting optimal cluster validation score

2013 Nov 16

Hi: I have calculated the Silhouette score and Dunn score after hierarchical clustering for 3 clusters: #Distance measure d <- dist(USArrests, method = "euclidean") #Hierarchical clustering hc <- hclust(dist(USArrests), "ave") #calculating silhouette value for 3 clusters sil<- silhouette(cutree(hc, k=3), d) #calculating Dunn index for 3 clusters clus <- cutree(hc,
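To compare scores across several numbers of clusters, the same two quantities can be computed in a loop. This is only a sketch under assumptions: it uses silhouette() from the cluster package, and it computes the Dunn index by hand as the smallest between-cluster distance divided by the largest within-cluster distance, rather than through whatever function the original post used. The range k = 2..6 is arbitrary.

```r
library(cluster)  # provides silhouette()

d  <- dist(USArrests, method = "euclidean")
hc <- hclust(d, method = "average")
dm <- as.matrix(d)

scores <- t(sapply(2:6, function(k) {
  cl   <- cutree(hc, k = k)
  sil  <- silhouette(cl, d)
  same <- outer(cl, cl, "==")   # TRUE where two points share a cluster
  c(k       = k,
    avg_sil = mean(sil[, "sil_width"]),
    # Dunn index: minimum separation / maximum cluster diameter
    dunn    = min(dm[!same]) / max(dm[same]))
}))

scores
```

For both scores larger values indicate more compact, better separated clusters, so one common heuristic is to prefer the k where they peak.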

cluster in R

2006 Oct 17

Hi, is there a good summary of clustering methods in R? It seems there are many packages involving it. And I have two questions on clustering here: 1. Is there a way of evaluating the effectiveness (or separation) of a clustering (rather than by visualization)? 2. Is there a search method (like genetic search) which can help find the best subset of attributes which gives the best separation? Thanks,