thr3ads.net - similar to: "R bug in cluster package (PR#1580)"

Displaying 20 results from an estimated 3000 matches similar to: "R bug in cluster package (PR#1580)"

2017 Aug 17

PAM Clustering

Sorry, I never use pam. In the help, you can see that pam requires a data frame OR a dissimilarity matrix. If diss=FALSE, then "euclidean" is used. So I interpret that a dissimilarity matrix is generated automatically. The problem may be in your data. Indeed, pam(ruspini, 4)$diss returns a dissimilarity matrix while pam(MYdata,10)$diss returns NULL. 2017-08-17 16:03 GMT+02:00 Sema Atasever
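One possible explanation for the NULL, offered as a sketch rather than a diagnosis of the poster's data: pam() has a keep.diss argument whose documented default keeps the dissimilarities only when raw data with fewer than 100 observations is supplied. ruspini has 75 rows, so $diss is populated; a larger data set returns NULL unless keep.diss = TRUE is passed. The iris measurements stand in here for the unknown MYdata, which is an assumption.

```r
library(cluster)
data(ruspini)

# 75 observations: dissimilarities are kept by default
small <- pam(ruspini, 4)
is.null(small$diss)

# 150 observations (stand-in for MYdata): dissimilarities are dropped by default
big <- pam(iris[, 1:4], 3)
is.null(big$diss)

# Asking for them explicitly keeps them regardless of size
big_kept <- pam(iris[, 1:4], 3, keep.diss = TRUE)
class(big_kept$diss)
```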

2004 Jun 29

PAM clustering: using my own dissimilarity matrix

Hello, I would like to use my own dissimilarity matrix in a PAM clustering with method "pam" (cluster package) instead of a dissimilarity matrix created by daisy. I read data from a file containing the dissimilarity values using "read.csv". This creates a matrix (alternatively: an array or vector) which is not accepted by "pam": A call
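A common way around this, sketched under assumptions: convert the square matrix read from the file into a "dist" object with as.dist() and pass that to pam(). The temporary CSV file below is built from ruspini purely for illustration; the poster's file layout is not known, so the row.names = 1 argument is an assumption about how the file was written.

```r
library(cluster)
data(ruspini)

# Illustration only: write a full square dissimilarity matrix to a CSV file
tmp <- tempfile(fileext = ".csv")
write.csv(as.matrix(dist(ruspini)), tmp)

# Read it back; the first column holds the row labels in this hypothetical file
raw <- read.csv(tmp, row.names = 1, check.names = FALSE)

# Convert the data frame to a numeric matrix, then to a "dist" object
d <- as.dist(as.matrix(raw))

# pam() treats a "dist" object as dissimilarities; diss = TRUE makes it explicit
fit <- pam(d, k = 4, diss = TRUE)
table(fit$clustering)
```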

2017 Aug 17

PAM Clustering

Dear Germano, Thank you for your fast reply. In the above code, MYData is the actual data set. Don't we need to convert MYData to a dissimilarity matrix using the pam(as.dist(MYData), k = 10, diss = TRUE) code line? Regards. On Thu, Aug 17, 2017 at 2:58 PM, Germano Rossi <germano.rossi at gmail.com> wrote: > try this > > MYdata <-

2008 Jun 13

Output of silhouette (cluster package)

Dear R users, I am mailing you about the graphical output of silhouette (cluster package). From the example of silhouette in help(silhouette):
> ar <- agnes(ruspini)
> si3 <- silhouette(cutree(ar, k = 5), # k = 4 gave the same as pam() above
+                   daisy(ruspini))
> plot(si3, nmax = 80, cex.names = 0.5)
from which one may conclude that group 1 is composed by

2011 Aug 25

question on silhouette colours

I'm fairly new to the silhouette functionality in the cluster package, so I apologize if I'm asking something naive. If I run the 'agnes(ruspini)' example from the silhouette section of the cluster package vignette, and assign colours to clusters, two clusters have what appear to be incorrect colours in the silhouette plot.
library(cluster)
data(ruspini)
ar <- agnes(ruspini)

2002 Jan 28

Cluster package broken in 1.4.0?

Greetings, I am reasonably experienced with R but I recently tried to do some clustering using the "cluster" package, in order to see if it would help. I only tried this once with the 1.3.1 version and it worked (I don't quite remember which method I used). Now, I tried with the 1.4.0 version and no clustering function seems to work with matrices that contain NAs, even though

2008 Sep 02

cluster a distance(analogue)-object using agnes(cluster)

I am trying to perform a clustering using an existing dissimilarity matrix that I calculated using distance (analogue). I tried two different things. One of them worked and one did not, and I don't understand why. Here is the code:
not working example
library(cluster)
library(analogue)
iris2 <- as.data.frame(iris)
str(iris2)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7

2013 Dec 08

Why daisy() in cluster library failed to exclude NA when computing dissimilarity

Hi, According to the documentation of the daisy function from the cluster package, it can compute dissimilarities when NA (missing) values are present. http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/daisy.html But when I tried this code:
library(cluster)
x <- c(1.115,NA,NA,0.971,NA)
y <- c(NA,1.006,NA,NA,0.645)
df <- as.data.frame(rbind(x,y))
daisy(df,metric="gower")
It gave this

2001 Jan 09

PAM clustering (using triangular matrix)

Hi, I'm trying to use a similarity matrix (triangular) as input for the pam() or fanny() clustering algorithms. The problem is that these algorithms can only accept a dissimilarity matrix, normally generated by daisy(). However, daisy only accepts a 'data matrix or dataframe. Dissimilarities will be computed between the rows of x'. Is there any way to say to that your data are already a
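A minimal sketch of one way to feed a triangular similarity matrix to pam(), assuming the similarities lie in [0, 1] so that 1 - s is a sensible dissimilarity. The five-object matrix is invented for illustration; as.dist() reads the lower triangle by default, so only that half needs to be filled.

```r
library(cluster)

# Hypothetical lower-triangular similarity matrix for five objects
sim <- matrix(0, 5, 5)
sim[lower.tri(sim)] <- c(0.90, 0.80, 0.10, 0.20,   # object 1 vs 2..5
                         0.85, 0.15, 0.10,         # object 2 vs 3..5
                         0.20, 0.10,               # object 3 vs 4..5
                         0.90)                     # object 4 vs 5

# Turn similarities into dissimilarities and wrap them as a "dist" object
d <- as.dist(1 - sim)

# pam() and fanny() both accept a "dist" object directly
fit <- pam(d, k = 2)
fit$clustering
```

Other transformations, such as sqrt(1 - s), are equally possible; which one suits depends on how the similarities were defined.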

2004 Feb 06

Converting a Dissimilarity Matrix

Hi all, I'm trying to perform a hierarchical clustering on some dissimilarity data that I have, but the data matrix I have already contains the dissimilarity values. These values are calculated using a separate program. The dissimilarity matrix is complete with no missing values, but the hclust and agnes routines require it in the form produced by daisy or dist. Is there any way of converting

2006 Apr 07

fuzzy classification and dissimilarity matrix

Hello, I want to make a fuzzy classification from a dissimilarity matrix (calculated with daisy from package 'cluster'). I have tried to use fanny (package cluster) but I have the same problems as described in a previous message (http://tolstoy.newcastle.edu.au/R/help/05/05/4546.html), i.e. it always gives me two clusters in the results (even if k is different from 2) with the same

2011 Jun 16

Specify ID variable in daisy{cluster}

Hi All - I am using the daisy function from the cluster library to create a dissimilarity matrix. I'm going to use that matrix to run a cluster analysis. My participants are identified with the variable, hhid. However, when I try to keep hhid in the dataset that I use to create the dissimilarity matrix, daisy uses it to create the matrix rather than ignoring it as an ID variable. I need to
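One way to keep the identifier without letting it influence the dissimilarities, sketched with an invented data frame: move hhid into the row names and drop the column before calling daisy(). The column names other than hhid are hypothetical.

```r
library(cluster)

# Hypothetical household data; only hhid comes from the question
households <- data.frame(
  hhid   = c(101, 102, 103, 104),
  income = c(20, 35, 50, 80),
  tenure = factor(c("own", "rent", "rent", "own"))
)

# Carry the ID as row names, then exclude the column from the computation
rownames(households) <- households$hhid
features <- households[, setdiff(names(households), "hhid")]

# Mixed variable types, so Gower's coefficient is used
d <- daisy(features, metric = "gower")

# The row names travel with the result as labels
attr(d, "Labels")
```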

2002 Apr 29

cluster analyses

I'm clustering rather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the

2011 Jan 27

agnes clustering and NAs

Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like:
> m <- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE)
> agnes(m)
Call: agnes(x = m)
Agglomerative coefficient: 0.1614168
Order of objects:
[1] 1 2 3
Height (summary):
Min. 1st Qu. Median Mean 3rd

2013 Jul 22

about mix type clust algorithm

Hi: I have tried to find an appropriate clustering algorithm for mixed-type data. The suggested way I see is:
1. use daisy to get the dissimilarity matrix
2. use PAM/hclust, providing the dissimilarity matrix, to get the clusters
But following this, when the data set grows bigger, say 10,000 rows of data, the dissimilarity matrix will be O(n^2), and an out-of-memory error will occur. I am
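To put a number on the O(n^2) concern: a "dist"-style object stores n(n-1)/2 values, each an 8-byte double. The helper below is a hypothetical name introduced for this back-of-the-envelope estimate, and it ignores attribute overhead and any temporary copies made during computation.

```r
# Approximate storage for the lower triangle of an n x n dissimilarity matrix
dist_bytes <- function(n) {
  n * (n - 1) / 2 * 8   # number of pairs times 8 bytes per double
}

# 10,000 rows: roughly 400 MB for the stored values alone
dist_bytes(10000) / 1e6

# Doubling the rows roughly quadruples the storage
dist_bytes(20000) / dist_bytes(10000)
```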

2003 May 21

cluster- binary data.

Hi! I am trying to calculate a dissimilarity matrix using daisy. The matrix vectver is binary, as I tested with:
> levels(as.factor(vectver))
[1] "0" "1"
But the call to daisy gives me the following error message:
> dfl1 <- daisy(vectver, type = list(asymm = c(1:length(vectver[,1]))))
Error in daisy(vectver, type = list(asymm = c(1:length(vectver[, 1])))) : at least

2007 Jul 23

Cluster prediction from factor/numeric datasets

Hi all, I have a dataset with numeric and factor columns of data which I developed a Gower Dissimilarity Matrix for (Daisy) and used Agglomerative Nesting (Agnes) to develop 20 clusters. I would like to use the 20 clusters to determine cluster membership for a new dataset (using predict) but cannot find a way to do this (no way to "predict" in the cluster package). I know I can use

2010 Oct 28

clustering on scaled dataset or not?

Hi, just a general question: when we do hierarchical clustering, should we compute the dissimilarity matrix based on the scaled dataset or the non-scaled dataset? daisy() in the cluster package allows standardizing the variables before calculating the dissimilarity matrix, but dist() doesn't have that option at all. I would appreciate it if you could share your thoughts. Thanks, John
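Although dist() has no standardizing option, the data can be scaled before it is passed in. The sketch below shows both routes on ruspini, chosen only for illustration. Note that the two are not identical: scale() divides by the standard deviation, whereas daisy(stand = TRUE) is documented to divide by the mean absolute deviation.

```r
library(cluster)
data(ruspini)

# Route 1: standardize by hand (mean 0, sd 1), then use dist()
d_sd <- dist(scale(ruspini))

# Route 2: let daisy() standardize (mean 0, mean absolute deviation 1)
d_mad <- daisy(ruspini, metric = "euclidean", stand = TRUE)

# Both can be passed to hclust() or agnes()
hc_sd  <- hclust(d_sd, method = "average")
hc_mad <- hclust(d_mad, method = "average")
```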

2005 May 30

How to access to sum of dissimilarities in CLARA

Dear All, Since dissimilarity is one of the quality measures in clustering, I'm trying to access the sum of dissimilarities as an overall measure. But after running my data using CLARA I obtain:
1128 dissimilarities, summarized :
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.033155 0.934630 2.257000 2.941600 4.876600 8.943700
But I cannot find the sum of dissimilarities. How can I
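A hedged sketch of two sums that can be derived from a clara object, using the xclara data set shipped with the cluster package in place of the poster's data. It assumes the "dissimilarities, summarized" in the printout are the $diss component (pairwise dissimilarities within the best sample) and that $clusinfo holds per-cluster sizes and average dissimilarities to the medoid, as described in the package documentation.

```r
library(cluster)
data(xclara)

fit <- clara(xclara, k = 3)

# Sum of the pairwise dissimilarities within the best sample
sample_sum <- sum(fit$diss)

# Sum of dissimilarities between every observation and its cluster medoid,
# rebuilt from cluster sizes and average dissimilarities
info <- fit$clusinfo
total_to_medoids <- sum(info[, "size"] * info[, "av_diss"])

sample_sum
total_to_medoids
```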

2007 Nov 28

Clustering

Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is, if I choose for example method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity
