alessandro.semeria@cramont.it
2003-Apr-28 16:53 UTC
[R] plot(pam.object) error with R-1.7.0 on Red-Hat 8.0 i686
I don't know whether this is a compilation fault or a bug in the new
R-1.7.0 version:

    cl.pam.2 <- pam(as.dist(1-cor(mel.data)),2)
    plot(cl.pam.2)

On the old R-1.6.2 this produces a correct partitioning and silhouette
plot, whereas on the new R-1.7.0 the output is

    "Error in clusplot.default(x$diss,...... ; x is not numeric"

Same platform: RH8.0 i686. Any suggestions?

A.S.
----------------------------
Alessandro Semeria
Models and Simulations Laboratory
The Environment Research Center - Montecatini (Edison Group),
Via Ciro Menotti 48, 48023 Marina di Ravenna (RA), Italy
Tel. +39 544 536811
Fax. +39 544 538663
E-mail: asemeria at cramont.it
alessandro.semeria@cramont.it
2003-Apr-29 13:13 UTC
[R] plot(pam.object) error with R-1.7.0 on Red-Hat 8.0 i686
Hello Martin!

Here is the script that creates mel.data from the freely available data
set attached to "Molecular Classification of Cutaneous Malignant Melanoma
by Gene Expression Profiling" (Bittner et al., 2000).
(melanoma.csv is an ASCII comma-separated file containing the data, with
the condition names in the first row.)

    library(mva)
    cond <- scan(file="melanoma.csv", what=character(38), sep=",", nlines=1)
    mel <- matrix(scan("melanoma.csv", sep=",", skip=1), ncol=38, byrow=T)
    dim(mel)
    mel.bittner <- mel[1:3613,1:31]
    cond.bittner <- cond[1:31]
    mel.bittner[mel.bittner<0.02] <- 0.02
    mel.bittner[mel.bittner>50] <- 50
    mel.bittner <- log2(mel.bittner)
    mel.bittner.median <- apply(mel.bittner,2,median)
    mel.bittner.mean <- apply(mel.bittner,2,mean)
    mel.bittner.sd <- apply(mel.bittner,2,sd)
    mel.data <- sweep(data.matrix(mel.bittner),2,mel.bittner.median)

Thanks!

A.S.
alessandro.semeria@cramont.it
2003-Apr-29 14:15 UTC
[R] plot(pam.object) error with R-1.7.0 on Red-Hat 8.0 i686
From here

    http://www.nature.com/nature/journal/v406/n6795/extref/406536ai3.xls

I copied cells 2F to 8069AQ (values and strings only, no formatting) of
the CUTANEOUSMELANOMA sheet to a new file with Excel 97 on WinXP, then I
saved it as melanoma.csv (comma-separated values).

Now I'll try setting the verbosity option to the maximum to see whether
anything shows up.

Thanks again!

A.S.
Martin Maechler
2003-Apr-29 16:39 UTC
[R] plot(pam.object) error with R-1.7.0 on Red-Hat 8.0 i686
>>>>> "alessandro" == alessandro semeria <alessandro.semeria at cramont.it>
>>>>>     on Mon, 28 Apr 2003 18:53:11 +0200 writes:

    alessandro> I don't know if there is some fault in compiling
    alessandro> or a bug of the new R-1.7.0 version:

    > cl.pam.2 <- pam(as.dist(1-cor(mel.data)),2)
    > plot(cl.pam.2)

    alessandro> perform a right partitioning and silhouette plot
    alessandro> on the old R-1.6.2 instead "Error in
    alessandro> clusplot.default(x$diss,...... ; x is not
    alessandro> numeric" is the output on the new R-1.7.0. Same
    alessandro> platform: RH8.0 i686. Some suggestions?

yes. This is a bug in the cluster package I'm maintaining.
(I'll file a bug report myself).

One workaround is to say

    cl.pam.2 <- pam(as.dist(1-cor(mel.data)), k=2, keep.diss = TRUE)
    ##                                            ^^^^^^^^^^^^^^^^^^
    plot(cl.pam.2)

{this is a workaround and unnecessary and *not* recommended for
 the next release of the cluster package !}

Note that `keep.diss' is a new argument to pam() {and agnes() and
others}, which is no longer always TRUE by default -- for a good
reason: this allows quite a bit larger datasets than previously, and
also improves speed (for memory allocation!!) in these cases.
BUT as you found above, I overlooked adapting the plot() method for
pam() results to this case.

PS: I think I was able to entirely reproduce what you did, but
    the clustering (of the 31 variables) is pretty "bad":
    30 variables in the 1st cluster, and variable "M93.007" alone in
    the 2nd cluster. If you look at

        silhouette(cl.pam.2)

    {which is implicitly used in the above plot() statement}
    you'll realize that the silhouette width is really not useful
    for this case {all widths are 0}.

Martin Maechler <maechler at stat.math.ethz.ch>  http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16     Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich     SWITZERLAND
phone: x-41-1-632-3408          fax: ...-1228   <><
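For readers without melanoma.csv, the workaround and the silhouette check from the PS can be tried on synthetic data. This is only a minimal sketch: the matrix `toy` is a hypothetical stand-in for mel.data (two groups of three correlated columns), and it assumes a version of the cluster package in which pam() accepts `keep.diss`.

```r
library(cluster)  # provides pam() and silhouette()

set.seed(42)
n <- 200

# Hypothetical stand-in for mel.data: two groups of three correlated columns
base1 <- rnorm(n)
base2 <- rnorm(n)
toy <- cbind(
  sapply(1:3, function(i) base1 + rnorm(n, sd = 0.3)),
  sapply(1:3, function(i) base2 + rnorm(n, sd = 0.3))
)
colnames(toy) <- paste0("S", 1:6)

# Same dissimilarity as in the thread: 1 - correlation between columns
d <- as.dist(1 - cor(toy))

# keep.diss = TRUE stores the dissimilarities in the result (fit$diss)
fit <- pam(d, k = 2, keep.diss = TRUE)

# Inspect the partition and the silhouette widths before trusting a plot
sil <- silhouette(fit)
print(fit$clustering)
print(summary(sil)$clus.avg.widths)

# Only draw when a graphics device makes sense
if (interactive()) plot(fit)
```

If the silhouette widths come out all near zero, as reported for the melanoma data, the partition itself deserves a second look regardless of whether plot() succeeds.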
alessandro.semeria@cramont.it
2003-Apr-30 12:17 UTC
[R] plot(pam.object) error with R-1.7.0 on Red-Hat 8.0 i686
Hello Martin!

I modified the script with your suggestion

    "cl.pam.2 <- pam(as.dist(1-cor(mel.data)), k=2, keep.diss = TRUE)"

and obtained the same "bad" clustering as yours. Then I ran pam without
setting "keep.diss", and looking at cl.pam.2$cluster I found the same
"good" partition as in R-1.6.2.
So I think your workaround is good for plotting something but "destroys"
the job of pam!

A.S.
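One way to check a claim like this on any data set is to fit pam() twice on the same dissimilarities, with and without `keep.diss`, and compare the cluster assignments directly. The sketch below uses a hypothetical synthetic matrix `toy` rather than the melanoma data, so it shows the comparison itself and says nothing about what R-1.7.0 produced.

```r
library(cluster)  # provides pam()

set.seed(42)
n <- 200

# Hypothetical stand-in for mel.data: two groups of three correlated columns
base1 <- rnorm(n)
base2 <- rnorm(n)
toy <- cbind(
  sapply(1:3, function(i) base1 + rnorm(n, sd = 0.3)),
  sapply(1:3, function(i) base2 + rnorm(n, sd = 0.3))
)
colnames(toy) <- paste0("S", 1:6)

d <- as.dist(1 - cor(toy))

# Fit once with the default and once with the dissimilarities kept
fit_default <- pam(d, k = 2)
fit_keep    <- pam(d, k = 2, keep.diss = TRUE)

# TRUE means both calls assign every object to the same cluster
same_partition <- identical(fit_default$clustering, fit_keep$clustering)
print(same_partition)

# Cross-tabulate the two partitions to see where they differ, if at all
tab <- table(without = fit_default$clustering, with = fit_keep$clustering)
print(tab)
```

A cross-tabulation with counts off the diagonal pattern would show exactly which objects moved between the two fits.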