Hello,

I would like to use my own dissimilarity matrix in a PAM clustering with
method "pam" (cluster package) instead of a dissimilarity matrix created
by daisy.

I read data from a file containing the dissimilarity values using
"read.csv". This creates a matrix (alternatively: an array or vector)
which is not accepted by "pam": A call

    p<-pam(d,k=2,diss=TRUE)

yields an error message "Error in pam(d, k = 2, diss = TRUE) : x is not
of class dissimilarity and can not be converted to this class." How can
I convert the matrix d into a dissimilarity matrix suitable for "pam"?

I'm aware of a response by Friedrich Leisch to a similar question posed
by Jose Quesada (quoted below). But as I understood the answer, the
dissimilarity matrix there is calculated on the basis of (random) data.

Thank you in advance.
Hans

__________________________________

>>>>> On Tue, 09 Jan 2001 15:42:30 -0700,
>>>>> Jose Quesada (JQ) wrote:

  > Hi,
  > I'm trying to use a similarity matrix (triangular) as input for pam() or
  > fanny() clustering algorithms.
  > The problem is that these algorithms can only accept a dissimilarity
  > matrix, normally generated by daisy().

  > However, daisy only accepts 'data matrix or dataframe. Dissimilarities
  > will be computed between the rows of x'.
  > Is there any way to say that your data are already a similarity
  > matrix (triangular)?
  > In Kaufman and Rousseeuw's FORTRAN implementation (1990), they showed an
  > option like this one:

  > "Maybe you already have correlation coefficients between variables.
  > Your input data consist of a lower triangular matrix of pairwise
  > correlations. You wish to calculate dissimilarities between the
  > variables."

  > But I couldn't find this alternative in the R implementation.

  > I can not use foo <- as.dist(foo), nor daisy(foo...) because
  > "Dissimilarities will be computed between the rows of x", and this is
  > not what I mean.

  > You can easily transform your similarities into dissimilarities like
  > this (also recommended in Kaufman and Rousseeuw, 1990):

  > foo <- (1 - abs(foo)) # where foo are similarities

  > But then pam() will complain like this:

  > " x is not of class dissimilarity and can not be converted to this
  > class."

  > Can anyone help me? I also appreciate any advice about other clustering
  > algorithms that can accept this type of input.

Hmm, I don't understand your problem, because proceeding as the docs
describe it works for me ...

If foo is a similarity matrix (with 1 meaning identical objects), then

bar <- as.dist(1 - abs(foo))
fanny(bar, ...)

works for me:

## create a random 12x12 similarity matrix, make it symmetric and set the
## diagonal to 1
> x <- matrix(runif(144), nc=12)
> x <- x+t(x)
> diag(x) <- 1

## now proceed as described in the docs
> y <- as.dist(1-x)
> fanny(y, 3)
iterations  objective
 42.000000   3.303235
Membership coefficients:
       [,1]      [,2]      [,3]
1 0.3333333 0.3333333 0.3333333
2 0.3333333 0.3333333 0.3333333
3 0.3333334 0.3333333 0.3333333
4 0.3333333 0.3333333 0.3333333
...
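
For what it's worth, the same recipe applied to pam() rather than fanny() might look like the sketch below. It is only an illustration under a few assumptions that are not in the quoted message: the seed, the names s, d and fit, and the averaging step (s + t(s)) / 2, which is swapped in for x + t(x) so that the made-up similarities stay between 0 and 1.

```r
library(cluster)  # provides pam() and fanny()

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Hypothetical similarity matrix: symmetric, entries in [0, 1], diagonal 1.
n <- 12
s <- matrix(runif(n * n), ncol = n)
s <- (s + t(s)) / 2   # average with the transpose (assumption, see above)
diag(s) <- 1

# Turn similarities into dissimilarities and wrap them in a "dist" object.
d <- as.dist(1 - abs(s))

# pam() treats an object of class "dist" as dissimilarities.
fit <- pam(d, k = 3)
fit$medoids      # the three medoids
fit$clustering   # cluster number for each of the 12 objects
```

The point is only that as.dist() produces an object that pam() accepts; random similarities like these carry no real cluster structure, so the resulting grouping is not meaningful in itself.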
Hi!

If x is your symmetric matrix containing the distances, then cast it to a
dist object using as.dist. See ?as.dist.

Sincerely
Eryk

*********** REPLY SEPARATOR ***********

On 29.06.2004 at 18:28 Hans Körber wrote:

>How can I convert the matrix d into a dissimilarity matrix suitable
>for "pam"?

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
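
For dissimilarities that come from a CSV file, as in the original question, the as.dist() suggestion could be sketched roughly as follows. Note that read.csv() returns a data frame, so an as.matrix() step is needed before as.dist(). The 4x4 matrix, the labels a to d, the temporary file and the variable names are all made up for illustration; the layout of the poster's real file is not known and may need different read.csv() arguments.

```r
library(cluster)  # provides pam()

# Hypothetical symmetric dissimilarity matrix, written to a temporary CSV
# file that stands in for the poster's own file.
m <- matrix(c(0, 1, 5, 6,
              1, 0, 4, 5,
              5, 4, 0, 1,
              6, 5, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))
f <- tempfile(fileext = ".csv")
write.csv(m, f)

# read.csv() gives a data frame; here the first column holds the row labels.
df <- read.csv(f, row.names = 1)

# Convert to a numeric matrix, then to a "dist" object.
d <- as.dist(as.matrix(df))

# With a "dist" object, the call from the question is accepted.
p <- pam(d, k = 2, diss = TRUE)
p$clustering
```

If the file has no label column, dropping row.names = 1 (and possibly setting header = FALSE) would be the obvious adjustment, but that depends on the file.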