Dylan Beaudette
2006-Apr-10 02:28 UTC
[R] passing known medoids to clara() in the cluster package
Greetings,

I have had good success using the clara() function to perform a simple cluster analysis on a large dataset (1 million+ records with 9 variables). Since the clara function is a wrapper to pam(), which will accept known medoid data - I am wondering if this too is possible with clara() ... The documentation does not suggest that this is possible.

Essentially I am trying to implement a "supervised classification" of numerous geographic data layers. The "unsupervised" approach using clara() works well, but I feel the output classes would be more meaningful if I were able to let clara() know about the classes that I have in mind.

Is this at all feasible, or am I trying to accomplish something that is not possible?

Cheers,

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
Martin Maechler
2006-Apr-10 06:46 UTC
[R] passing known medoids to clara() in the cluster package
>>>>> "DylanB" == Dylan Beaudette <dylan.beaudette at gmail.com>
>>>>>     on Sun, 9 Apr 2006 19:28:44 -0700 writes:

    DylanB> Greetings, I have had good success using the clara()
    DylanB> function to perform a simple cluster analysis on a
    DylanB> large dataset (1 million+ records with 9 variables).
    DylanB> Since the clara function is a wrapper to pam(),
    DylanB> which will accept known medoid data - I am wondering
    DylanB> if this too is possible with clara() ... The
    DylanB> documentation does not suggest that this is
    DylanB> possible.

Indeed, it doesn't -- because it's not yet possible.

I (as maintainer of "cluster") added the "known medoid" option to pam() last June (for cluster version 1.10.0), and left a note in my TODO file to do the same for clara(). Unfortunately, it's not true that clara() is a wrapper to pam(), as you state above.

Given your wish and clear "use case" situation, I'm more motivated to approach this particular 'TODO' item!

Martin Maechler, ETH Zurich

    DylanB> Essentially I am trying to implement a "supervised
    DylanB> classification" of numerous geographic data
    DylanB> layers. The "unsupervised" approach using clara()
    DylanB> works well, but I feel the output classes would be
    DylanB> more meaningful if I were able to let clara() know
    DylanB> about the classes that I have in mind.

    DylanB> Is this at all feasible, or am I trying to
    DylanB> accomplish something that is not possible?
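For readers following the thread, here is a minimal sketch of the "known medoid" option of pam() mentioned above. It assumes a version of cluster in which pam() has the `medoids` argument (1.10.0 or later, per the message). The toy data and the chosen row indices are made up for illustration and are not from the original posts.

```r
library(cluster)

set.seed(1)
# Toy data (hypothetical): three well-separated groups of 20 points each
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# Hypothetical "known" medoids: row indices of observations picked by the
# analyst, one from each group (rows 1-20, 21-40, 41-60)
known <- c(1L, 21L, 41L)

# medoids = supplies the initial medoids in place of the build phase
fit <- pam(x, k = 3, medoids = known)

fit$id.med             # row indices of the final medoids
table(fit$clustering)  # cluster sizes
```

Note that `medoids` takes row indices into the data, not coordinates, so a "known" class centre has to correspond to an actual observation.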
Hello Martin,

I am not sure that PAM works with pre-defined medoids. The avg.width is always the same, and when I check the medoids that were used for the calculation, they are the same as before, i.e. before defining my own medoids.

Please advise.

BAHKO
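One possible explanation, offered as a sketch and not as a confirmed diagnosis: in pam() the `medoids` argument only sets the initial medoids, and the swap phase still runs afterwards, so the final medoids (and hence avg.width) can end up identical to a default run. The example below assumes a version of cluster in which pam() has the `do.swap` argument; the data and starting indices are made up for illustration.

```r
library(cluster)

set.seed(42)
# Toy data (hypothetical): two well-separated groups of 30 points each
x <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2)
)

# Deliberately poor starting medoids: both rows come from the first group
start <- c(1L, 2L)

default_fit <- pam(x, k = 2)                                    # build + swap
swapped_fit <- pam(x, k = 2, medoids = start)                   # given start + swap
fixed_fit   <- pam(x, k = 2, medoids = start, do.swap = FALSE)  # given start, no swap

default_fit$id.med
swapped_fit$id.med   # the swap phase is free to move away from `start`
fixed_fit$id.med     # stays at the supplied rows

# Average silhouette width of each fit, for comparison
c(default = default_fit$silinfo$avg.width,
  swapped = swapped_fit$silinfo$avg.width,
  fixed   = fixed_fit$silinfo$avg.width)
```

If the supplied medoids must be kept as they are, `do.swap = FALSE` appears to be the relevant switch; otherwise pam() treats them only as a starting point.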