similar to: clustering on scaled dataset or not?

Displaying 20 results from an estimated 8000 matches similar to: "clustering on scaled dataset or not?"

2004 Jun 29
1
PAM clustering: using my own dissimilarity matrix
Hello, I would like to use my own dissimilarity matrix in a PAM clustering with method "pam" (cluster package) instead of a dissimilarity matrix created by daisy. I read data from a file containing the dissimilarity values using "read.csv". This creates a matrix (alternatively: an array or vector) which is not accepted by "pam": A call
2001 Jan 09
2
PAM clustering (using triangular matrix)
Hi, I'm trying to use a similarity matrix (triangular) as input for the pam() or fanny() clustering algorithms. The problem is that these algorithms can only accept a dissimilarity matrix, normally generated by daisy(). However, daisy only accepts 'data matrix or dataframe. Dissimilarities will be computed between the rows of x'. Is there any way to say to that your data are already a
2004 Feb 06
2
Converting a Dissimilarity Matrix
Hi all, I'm trying to perform a hierarchical clustering on some dissimilarity data that I have, but the data matrix I have already contains the dissimilarity values. These values are calculated using a separate program. The dissimilarity matrix is complete with no missing values, but the hclust and agnes routines require it in the form produced by daisy or dist. Is there any way of converting
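A minimal sketch of the kind of conversion asked about above, assuming the precomputed values are available as a full, symmetric square matrix. The matrix `m` and its labels are made-up illustration data, not taken from the thread. Base R's `as.dist()` wraps such a matrix in the "dist" class that `hclust()` expects.

```r
# Hypothetical 3 x 3 dissimilarity matrix computed elsewhere (illustration only)
ids <- c("a", "b", "c")
m <- matrix(c(0, 2, 5,
              2, 0, 3,
              5, 3, 0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# as.dist() keeps the lower triangle and returns an object of class "dist"
d <- as.dist(m)

# The "dist" object can be passed directly to hierarchical clustering
hc <- hclust(d, method = "complete")
groups <- cutree(hc, k = 2)
```

If the values arrive via `read.csv()`, the resulting data frame would first need `as.matrix()` before `as.dist()`; whether row and column labels survive depends on how the file was written.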
2013 Dec 08
3
Why daisy() in cluster library failed to exclude NA when computing dissimilarity
Hi, According to the documentation of the daisy function in the cluster package, it can compute dissimilarities when NA (missing) values are present. http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/daisy.html But why, when I tried this code: library(cluster) x <- c(1.115,NA,NA,0.971,NA) y <- c(NA,1.006,NA,NA,0.645) df <- as.data.frame(rbind(x,y)) daisy(df,metric="gower") It gave this
2007 Jul 23
1
Cluster prediction from factor/numeric datasets
Hi all, I have a dataset with numeric and factor columns of data which I developed a Gower Dissimilarity Matrix for (Daisy) and used Agglomerative Nesting (Agnes) to develop 20 clusters. I would like to use the 20 clusters to determine cluster membership for a new dataset (using predict) but cannot find a way to do this (no way to "predict" in the cluster package). I know I can use
2004 May 28
6
distance in the function kmeans
Hi, I want to know which distance is used in the function kmeans and whether we can change this distance. Indeed, in the function pam, we can pass a distance matrix as a parameter (with the line "pam<-pam(dist(matrixdata),k=7)") but we can't do it in the function kmeans, where we have to pass the data matrix directly ... Thanks in advance, Nicolas BOUGET
2002 May 20
1
R bug in cluster package (PR#1580)
I have apparently found an error in the "pam" function of the "cluster" library package. Please pardon me if this error has been pointed out or if this e-mail should be directed to someone else. The problem only started occurring with R version 1.5.0, which I started using about a week ago. The problem occurs when you try to use "pam" with the input being a
2006 Apr 07
1
fuzzy classification and dissimilarity matrix
Hello, I want to make a fuzzy classification from a dissimilarity matrix (calculated with daisy from package 'cluster'). I have tried to use fanny (package cluster) but I have the same problems as described in a previous message (http://tolstoy.newcastle.edu.au/R/help/05/05/4546.html), i.e. it always gives me two clusters in the results (even if k is different from 2) with the same
2011 Jun 16
1
Specify ID variable in daisy{cluster}
Hi All - I am using the daisy function from the cluster library to create a dissimilarity matrix. I'm going to use that matrix to run a cluster analysis. My participants are identified with the variable, hhid. However, when I try to keep hhid in the dataset that I use to create the dissimilarity matrix, daisy uses it to create the matrix rather than ignoring it as an ID variable. I need to
2013 Jul 22
1
about mix type clust algorithm
Hi: I have tried to find the appropriate clustering algorithm for mixed-type data. The suggested way I see is: 1. use daisy to get the dissimilarity matrix 2. use PAM/hclust by providing the dissimilarity matrix, to get the clusters. But by following this, when the data set grows bigger, say 10,000 rows of data, the dissimilarity matrix will be O(n^2), and an out-of-memory error will occur. I am
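The two-step workflow described above can be sketched as follows. This is only an illustration on a tiny made-up data frame (`df`, `height` and `colour` are hypothetical names), using `daisy()` and `pam()` from the cluster package and base `hclust()`. It also shows why memory grows quadratically: the dissimilarity object stores n(n-1)/2 values.

```r
library(cluster)

# Hypothetical mixed-type data: one numeric and one factor column
df <- data.frame(
  height = c(1.2, 3.4, 1.1, 3.6, 1.3, 3.5),
  colour = factor(c("a", "b", "a", "b", "a", "b"))
)

# Step 1: Gower dissimilarities handle mixed variable types
d <- daisy(df, metric = "gower")

# Step 2a: partitioning around medoids on the dissimilarities
fit <- pam(d, k = 2, diss = TRUE)

# Step 2b: alternatively, hierarchical clustering on the same object
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)

# Storage cost: n(n-1)/2 pairwise values, which is O(n^2)
n <- nrow(df)
n_pairs <- n * (n - 1) / 2
```

For 10,000 rows this count is roughly 50 million doubles. The cluster package also provides `clara()`, which works on subsamples, though it takes the data matrix rather than a precomputed dissimilarity, so it is not a drop-in replacement for the Gower workflow.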
2002 Apr 29
2
cluster analyses
I'm clustering rather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the
2010 Aug 09
1
Need help on heatmap, K-means and hhierarchical clustering methods
Hi folks, I am new to the R software. I have been going through different materials to learn more about R. I have the R software installed on my Windows machine. I would like to know the R source code for the following problems on the iris flower data set. I need to do the cluster analysis project with the iris data set. The goal is to cluster the flowers according to their Sepal.Length, Sepal.Width,
2007 Nov 28
2
Clustering
Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is, if I choose for example method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity
2011 Jan 27
3
agnes clustering and NAs
Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : > m <- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) > agnes(m) Call: agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. Median Mean 3rd
2002 Jan 28
1
Cluster package broken in 1.4.0?
Greetings, I am reasonably experienced with R but I recently tried to do some clustering using the "cluster" package, in order to see if it would help. I only tried this once with the 1.3.1 version and it worked (I don't quite remember which method I used). Now, I tried with the 1.4.0 version and no clustering function seems to work with matrices that contain NAs, even though
2008 Nov 06
1
nls: Fitting two models at once?
Hello, I'm still a newbie user and struggling to automate some analyses from SigmaPlot using R. R has been a great help for me so far! But the following problem is driving me nuts. I have two spectra, both of which have to be fitted to reference data. Problem: both spectra are connected in some way: the stoichiometry of coefficients "cytf.v"/"cytb.v" is 1/2. {{In the SigmaPlot
2002 Dec 13
1
clustering dissimilarities
Hello. I know my dissimilarity matrix but not my original data. Is there any way I could use the clustering function Mclust or EMclust with this dissimilarity matrix, or at least some equivalent of these functions? As this is model-based clustering, I don't know if it is actually possible to do it without the original data. Thanks in advance for your help.
2008 Mar 19
1
one/multi-dimensional scaling with incomplete dissimilarity matrix
Dear David, you asked this question a while ago on the R mailing list and got no answer. I have the same problem and was wondering if you had found a solution. Cheers, Loic. Loic Thibaut, PhD candidate, ARC Centre of Excellence for Coral Reef Studies, School of Marine Biology, James Cook University, Townsville, Qld, 4811, Australia. Tel + 61 747 815 735, Fax: + 61 747 251 570, email:
2003 May 21
1
cluster- binary data.
Hi! I am trying to calculate a dissimilarity matrix using daisy. The matrix vectver is binary, as I test with: > levels(as.factor(vectver)) [1] "0" "1" But the call to daisy gives me the following error message: > dfl1 <- daisy(vectver, type = list(asymm = c(1:length(vectver[,1])))) Error in daisy(vectver, type = list(asymm = c(1:length(vectver[, 1])))) : at least
2017 Aug 17
0
PAM Clustering
Sorry, I never use pam. In the help, you can see that pam requires a data frame OR a dissimilarity matrix. If diss=FALSE then "euclidean" is used. So I interpret that a dissimilarity matrix is generated automatically. The problem may be in your data. Indeed, pam(ruspini, 4)$diss returns a dissimilarity matrix while pam(MYdata,10)$diss returns NULL 2017-08-17 16:03 GMT+02:00 Sema Atasever
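One possible explanation for the difference noted above, offered as a hedged sketch and not a diagnosis of the poster's data: the `keep.diss` argument of `pam()` in the cluster package defaults to keeping the dissimilarities only for fewer than 100 observations. `ruspini` has 75 rows. The matrix `big` below is made-up data standing in for a larger data set.

```r
library(cluster)
data(ruspini)

# Small data set (75 rows): dissimilarities are kept by default
small_fit <- pam(ruspini, 4)
has_diss_small <- !is.null(small_fit$diss)

# Hypothetical larger data set (200 rows): $diss is dropped by default
set.seed(42)
big <- matrix(rnorm(400), ncol = 2)
big_fit <- pam(big, 2)
has_diss_big <- !is.null(big_fit$diss)

# Requesting the dissimilarities explicitly keeps them regardless of size
big_fit_kept <- pam(big, 2, keep.diss = TRUE)
has_diss_kept <- !is.null(big_fit_kept$diss)
```

If this is the cause, a NULL `$diss` would not indicate a problem with the input, only that the result object was kept small.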