Hi,

I'm trying to use a (triangular) similarity matrix as input for the pam() or fanny() clustering algorithms. The problem is that these algorithms only accept a dissimilarity matrix, normally generated by daisy(). However, daisy() only accepts a 'data matrix or dataframe. Dissimilarities will be computed between the rows of x'.

Is there any way to tell it that your data are already a (triangular) similarity matrix? In their FORTRAN implementation, Kaufman and Rousseeuw (1990) showed an option like this one:

"Maybe you already have correlation coefficients between variables. Your input data consist of a lower triangular matrix of pairwise correlations. You wish to calculate dissimilarities between the variables."

But I couldn't find this alternative in the R implementation. I cannot use foo <- as.dist(foo), nor daisy(foo...), because "Dissimilarities will be computed between the rows of x", and this is not what I mean.

You can easily transform your similarities into dissimilarities like this (also recommended in Kaufman and Rousseeuw, 1990):

    foo <- (1 - abs(foo))  # where foo are similarities

But then pam() will complain like this:

    "x is not of class dissimilarity and can not be converted to this class."

Can anyone help me? I would also appreciate any advice about other clustering algorithms that can accept this type of input.

Thanks a lot in advance,

Jose Quesada
Dept. of Experimental Psychology, University of Granada, Spain.
Visiting researcher at the Institute of Cognitive Science, University of Colorado, Boulder, US.
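For the lower-triangular input described in the question, a minimal sketch in base R might look like the following. The matrix `sim`, its labels, and its values are made up for illustration and are not from the original message. The sketch relies on as.dist() reading only the lower triangle of a square matrix and returning an object of class "dist", which current versions of pam() and fanny() accept as dissimilarities.

```r
# Hypothetical 4x4 lower-triangular similarity (correlation) matrix;
# the labels and values are invented for illustration only.
sim <- matrix(0, nrow = 4, ncol = 4,
              dimnames = list(letters[1:4], letters[1:4]))
sim[lower.tri(sim)] <- c(0.9, -0.2, 0.1, -0.3, 0.2, 0.8)  # filled column by column
diag(sim) <- 1

# Turn similarities into dissimilarities as in the message above.
# as.dist() uses only the lower triangle, so the empty upper triangle
# does not have to be filled in first.
d <- as.dist(1 - abs(sim))

print(class(d))
print(d)
```

The resulting `d` can then be passed as the first argument to pam() or fanny() from the cluster package, together with the desired number of clusters.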
>>>>> On Tue, 09 Jan 2001 15:42:30 -0700,
>>>>> Jose Quesada (JQ) wrote:

> Hi,
> I'm trying to use a similarity matrix (triangular) as input for pam() or
> fanny() clustering algorithms.
> The problem is that these algorithms can only accept a dissimilarity
> matrix, normally generated by daisy().
> However, daisy() only accepts a 'data matrix or dataframe. Dissimilarities
> will be computed between the rows of x'.
> Is there any way to tell it that your data are already a similarity
> matrix (triangular)?
> In Kaufman and Rousseeuw's FORTRAN implementation (1990), they showed an
> option like this one:
> "Maybe you already have correlation coefficients between variables.
> Your input data consist of a lower triangular matrix of pairwise
> correlations. You wish to calculate dissimilarities between the
> variables."
> But I couldn't find this alternative in the R implementation.
> I cannot use foo <- as.dist(foo), nor daisy(foo...), because
> "Dissimilarities will be computed between the rows of x", and this is
> not what I mean.
> You can easily transform your similarities into dissimilarities like
> this (also recommended in Kaufman and Rousseeuw, 1990):
> foo <- (1 - abs(foo)) # where foo are similarities
> But then pam() will complain like this:
> "x is not of class dissimilarity and can not be converted to this
> class."
> Can anyone help me? I would also appreciate any advice about other
> clustering algorithms that can accept this type of input.

Hmm, I don't understand your problem, because proceeding as the docs describe works for me ...

If foo is a similarity matrix (with 1 meaning identical objects), then

    bar <- as.dist(1 - abs(foo))
    fanny(bar, ...)

works for me:

    ## create a random 12x12 similarity matrix, make it symmetric and set the
    ## diagonal to 1
    > x <- matrix(runif(144), nc=12)
    > x <- x+t(x)
    > diag(x) <- 1

    ## now proceed as described in the docs
    > y <- as.dist(1-x)
    > fanny(y, 3)
    iterations  objective
     42.000000   3.303235
    Membership coefficients:
           [,1]      [,2]      [,3]
    1 0.3333333 0.3333333 0.3333333
    2 0.3333333 0.3333333 0.3333333
    3 0.3333334 0.3333333 0.3333333
    4 0.3333333 0.3333333 0.3333333
    ...

--
-------------------------------------------------------------------
                        Friedrich Leisch
Institut für Statistik                     Tel: (+43 1) 58801 10715
Technische Universität Wien                Fax: (+43 1) 58801 10798
Wiedner Hauptstraße 8-10/1071   Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria                        ci.tuwien.ac.at/~leisch
-------------------------------------------------------------------

r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
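A reproducible variation on the session above can be sketched as follows. Several things here are assumptions that are not in the original reply: the seed, the use of pam() with k = 3 instead of fanny(), and averaging x and t(x) rather than summing them. The averaging is chosen because a sum of two uniform draws can exceed 1, which would make 1 - x negative; with the average, similarities stay within [0, 1].

```r
library(cluster)  # provides pam() and fanny()

set.seed(1)                          # arbitrary seed, only for reproducibility
n <- 12
x <- matrix(runif(n * n), ncol = n)  # random values in [0, 1]
x <- (x + t(x)) / 2                  # symmetric, values still within [0, 1]
diag(x) <- 1                         # each object is identical to itself

d <- as.dist(1 - x)                  # dissimilarities within [0, 1]
fit <- pam(d, k = 3)                 # d has class "dist", so it is used as dissimilarities

print(fit$medoids)                   # the three medoid objects
print(table(fit$clustering))         # cluster sizes
```

Because the data are random, the clusters themselves carry no meaning; the point is only that a "dist" object built from a similarity matrix is accepted directly.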