Bruno L. Giordano
2008-Dec-26 18:58 UTC
[R] Package: clue; Function: consensus; Question: element weights for criterion functions
Dear R community and Kurt,

I am using consensus.r from clue to compute hard consensus partitions (method: "hard/euclidean"; the same considerations apply to "hard/manhattan"). The function returns a hard consensus partition that minimizes a criterion measure Crit, namely the sum of the weighted dissimilarities between the input partitions and the consensus partition.

I would like to have an exact value for Crit (consensus.r only shows approximate values when verbose=TRUE). For this purpose, I compute Crit manually using the cl_dissimilarity function from the same package.

The problem is that the Crit I compute manually differs widely from the one computed by consensus.r (compare "Minimum" with "Manually computed criterion" after running the sample code). I figure the difference arises because my criterion measure does not weight the cases. However, I wonder:

1. Why does the consensus partition search algorithm have to weight the cases? Is the weighting part of the heuristic?
2. How can I recover these weights? (Hacking consensus.r is not really straightforward, since it includes many subfunctions.)
3. Is there something wrong in my R code (pasted below)?

Thank you for any suggestion, and for this very helpful package.

Best,
    Bruno

P.S. In the code, you can set nnruns to 1 if you do not want to load your machine.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bruno L. Giordano, Ph.D.
Music Perception and Cognition Laboratory
CIRMMT http://www.cirmmt.mcgill.ca/
Schulich School of Music, McGill University
555 Sherbrooke Street West
Montréal, QC H3A 1E3
Canada
Office: +1 514 398 4535 ext. 00900
http://www.music.mcgill.ca/~bruno/

###################### CODE #############################
library(clue)

##### partitions
p1 = c(1,2,3,4,5,6,7,8,9,1,1,2,2,3,3,4,4,1,1,2,2,3,3,4,4)
p2 = c(1,1,2,2,3,3,4,4,1,1,2,2,3,3,4,4,1,1,2,2,3,3,4,4,1)
p3 = c(1,1,2,2,3,3,4,4,1,1,2,2,3,3,4,4,1,1,2,2,3,3,4,5,5)

##### partitions ensemble
p = as.cl_ensemble(list(as.cl_partition(p1),
                        as.cl_partition(p2),
                        as.cl_partition(p3)))

##### control parameters
nclasses = 9
nnruns = 50
contr = list(verbose = TRUE, nruns = nnruns, k = nclasses)
meth = "hard/euclidean"

##### fit consensus partition
pcons = cl_consensus(p, method = meth, control = contr)
print(pcons)

##### compute unweighted criterion measure
##### (sum of the dissimilarities between the consensus and each input partition)
Crit = rep(0, length(p))
for (i in 1:length(p)) {
  tmp = cl_dissimilarity(pcons, p[[i]], method = "euclidean")
  Crit[i] = tmp[[1]]
}
cat("Manually computed criterion: ")
cat(as.character(sum(Crit)))
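
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: the help page for cl_consensus describes criterion functions of the form L(x) = sum_b w_b * d(x_b, x)^p, where the "cases" being weighted are the elements of the ensemble (the weights argument, which defaults to 1), with p = 2 for the least squares Euclidean methods and p = 1 for the Manhattan ones. If that is the form used here, the manual sum would differ because the dissimilarities are not squared, not because weights are missing. The helper names weighted_criterion and manual_criterion below are hypothetical and not part of clue; only cl_dissimilarity is taken from the package.

```r
# Hypothetical helper (not part of clue): evaluates sum_b w_b * d_b^p,
# where d_b is the dissimilarity between ensemble element b and the
# consensus partition, and w_b is the case weight of element b.
weighted_criterion <- function(d, w = 1, p = 2) {
  w <- rep_len(w, length(d))  # recycle weights over the ensemble elements
  sum(w * d^p)
}

# Hypothetical wrapper: collects the dissimilarities with
# clue::cl_dissimilarity (as in the loop above) and applies the helper.
# Assumption: p = 2 matches the Euclidean methods, p = 1 the Manhattan ones.
manual_criterion <- function(consensus, ensemble, w = 1, p = 2,
                             method = "euclidean") {
  d <- vapply(seq_along(ensemble), function(i) {
    clue::cl_dissimilarity(consensus, ensemble[[i]], method = method)[[1]]
  }, numeric(1))
  weighted_criterion(d, w = w, p = p)
}
```

With the objects from the code above, manual_criterion(pcons, p) could be compared against the "Minimum" reported when verbose=TRUE, and manual_criterion(pcons, p, p = 1) reproduces the unsquared sum computed in the loop.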