similar to: Correct use of the cluster::daisy function

Displaying 20 results from an estimated 3000 matches similar to: "Correct use of the cluster::daisy function"

Why daisy() in cluster library failed to exclude NA when computing dissimilarity

2013 Dec 08

3

Hi, According to the documentation for the daisy function in the cluster package, it can compute dissimilarities when NA (missing) values are present: http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/daisy.html But when I tried this code:
library(cluster)
x <- c(1.115,NA,NA,0.971,NA)
y <- c(NA,1.006,NA,NA,0.645)
df <- as.data.frame(rbind(x,y))
daisy(df,metric="gower")
It gave this

Specify ID variable in daisy{cluster}

2011 Jun 16

1

Hi All - I am using the daisy function from the cluster library to create a dissimilarity matrix. I'm going to use that matrix to run a cluster analysis. My participants are identified by the variable hhid. However, when I try to keep hhid in the dataset that I use to create the dissimilarity matrix, daisy uses it to create the matrix rather than ignoring it as an ID variable. I need to

more on the daisy function

2006 Jan 05

0

Dear R-helpers, First of all, a happy new year to everyone! I successfully used the daisy function (from package cluster) to find which two rows from a dataframe differ by only one value, and I now want to come up with a simpler way to find _which_ value makes the difference between any such pair of rows. Consider a very small example (the actual data contains thousands of rows): input

PAM clustering: using my own dissimilarity matrix

2004 Jun 29

1

Hello, I would like to use my own dissimilarity matrix in a PAM clustering with method "pam" (cluster package) instead of a dissimilarity matrix created by daisy. I read data from a file containing the dissimilarity values using "read.csv". This creates a matrix (alternatively: an array or vector) which is not accepted by "pam": A call

variable type assignment in daisy

2010 Nov 06

0

Dear Rhelp, I ran daisy on 5 lifestyle variables, 3 of which were nominal and 2 ordinal, and assigned the types “nominal” and “ordinal” to the variables, respectively. I got an output indicating their types as “I” for interval(?). Running it on the example data set “flower” gave the same types in the output as the types they were assigned. Why is this so? Below are the code and output.

type in daisy

2006 Mar 20

1

Hi, I'm a PhD student and I want to use the function 'daisy' from the package 'cluster' to compute dissimilarities. My variables are of mixed types so I use the argument 'stand' in daisy to define the type of my variables. I get the following error message: Warning message: binary variable(s) 13, 16, 17, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,

PAM clustering (using triangular matrix)

2001 Jan 09

2

Hi, I'm trying to use a similarity matrix (triangular) as input for the pam() or fanny() clustering algorithms. The problem is that these algorithms can only accept a dissimilarity matrix, normally generated by daisy(). However, daisy only accepts a 'data matrix or dataframe. Dissimilarities will be computed between the rows of x'. Is there any way to say that your data are already a

daisy(): space allocation issue

2010 Aug 26

1

Hi, I'm trying to apply the function daisy() to a 10000x10 data.frame but I do not have enough space (error message: cannot allocate vector of length 1476173280). I did not expect to be unable to work with a matrix of just 10000 observations... I have set --max-mem-size=2G in Rgui (I'm not able to set more space). How can I solve this issue? Separating observations depending on

daisy function in cluster- coerced NAs

2007 Feb 22

0

I am currently using the function daisy in package cluster to create a dissimilarity matrix because my multivariate dataset contains missing data and variables of various types, including factors, symmetric and asymmetric binary, and quantitative. This is a step prior to using pco within ecodist. There is a warning that appears twice: "NAs introduced by coercion". I've used

calculating dissimilarities in R

2006 Sep 26

0

Dear All, I've got a statistical question on calculating dissimilarities in R. I want to calculate the different types of dissimilarities on the 'flower' dataset found in the package 'cluster'. Flower is a data frame with 18 observations on 8 variables. Variable 1 and 2 are binary, variable 3 is asymmetric binary, variable 4 is nominal, variable 5 and 6 are ordered and variable 7 and 8 are

All possible combinations of functions within a function

2009 Nov 10

2

Dear All, I wrote a function for cluster analysis to compute cophenetic correlations between dissimilarity matrices (using the VEGAN library) and cluster analyses of every possible clustering algorithm (see attached cor.coef.R: http://old.nabble.com/file/p26288610/cor.coef.R). As it is now, it is extremely long, and for the future I was hoping to find a more efficient way of doing this sort of

Cluster procedure using geographical neighborhood

2010 May 07

0

Dear Dario Sacco,
>>>>> "DS" == Dario Sacco <dario.sacco at unito.it>
>>>>> on Thu, 06 May 2010 17:45:30 +0200 writes:
DS> Dear Dr. Maechler,
DS> I am an agronomist and a researcher at the University of Turin. I am
DS> also teaching "Applied statistics", then I have some knowledge in
DS> Statistics, but not

fuzzy classification and dissimilarity matrix

2006 Apr 07

1

Hello, I want to make a fuzzy classification from a dissimilarity matrix (calculated with daisy from package 'cluster'). I have tried to use fanny (package cluster) but I have the same problems as described in a previous message (http://tolstoy.newcastle.edu.au/R/help/05/05/4546.html), i.e. it always gives me two clusters in the results (even if k is different from 2) with the same

Converting a Dissimilarity Matrix

2004 Feb 06

2

Hi all, I'm trying to perform a hierarchical clustering on some dissimilarity data that I have, but the data matrix I have already contains the dissimilarity values. These values are calculated using a separate program. The dissimilarity matrix is complete with no missing values, but the hclust and agnes routines require it in the form produced by daisy or dist. Is there any way of converting

Gower distance between an individual and a population

2008 Oct 13

1

Hi list, I need to compute the Gower distance between a specific individual and all the other individuals. The function daisy from package cluster computes all the pairwise dissimilarities of a population. If the population is N individuals, that is around N^2 distances to compute. I need to compute the distance between a specific individual and all the other individuals, that is only N

cluster analyses

2002 Apr 29

2

I'm clustering rather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the

cluster- binary data.

2003 May 21

1

Hi! I am trying to calculate a dissimilarity matrix using daisy. The matrix vectver is binary, as I tested with:
> levels(as.factor(vectver))
[1] "0" "1"
But the call to daisy gives me the following error message:
> dfl1 <- daisy(vectver, type = list(asymm = c(1:length(vectver[,1]))))
Error in daisy(vectver, type = list(asymm = c(1:length(vectver[, 1])))) : at least

R bug in cluster package (PR#1580)

2002 May 20

1

I have apparently found an error in the "pam" function of the "cluster" library package. Please pardon me if this error has been pointed out or if this e-mail should be directed to someone else. The problem only started occurring with R version 1.5.0, which I started using about a week ago. The problem occurs when you try to use "pam" with the input being a

correlation as distance/dissimilarity

2005 Sep 14

0

I've been asked (privately)
>>>>> "CarlosJ" == jaramilloc <jaramilloc at si.edu>
>>>>> on Wed, 14 Sep 2005 09:40:22 -0400 writes:
..........
CarlosJ> In Kaufman & Rousseeuw 2000 book on Cluster Analysis, it says that
CarlosJ> Daisy can compute Pearson correlation between variables and then
CarlosJ> transform

error using daisy() in library(cluster). Bug?

2004 Aug 12

2

Hi, I'm using the cluster library to examine multivariate data. The data come from a connection to a postgres database, and I wrote a short R script to do the analysis. With the cluster version included in R 1.8.0, daisy worked well for my data, but now, when I call daisy, I obtain the following messages:
---------
Error in if (any(sx == 0)) { : missing value where TRUE/FALSE needed
In