similar to: [cluster package question] What is the "sum of the dissimilarities" in the pam command ?

2008 Dec 17
1
bug (?!) in "pam()" clustering from fpc package ?
Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet when I tried to run the algorithm from different starting points, I found that pam ignores them and keeps starting from the same starting points (medoids). For my
2006 Apr 10
2
passing known medoids to clara() in the cluster package
Greetings, I have had good success using the clara() function to perform a simple cluster analysis on a large dataset (1 million+ records with 9 variables). Since the clara function is a wrapper to pam(), which will accept known medoid data - I am wondering if this too is possible with clara() ... The documentation does not suggest that this is possible. Essentially I am trying to
2008 Feb 22
2
Looping and Pasting
Hello R-community: Much of the time I want to use loops to look at graphs, etc. For example, I have 25 plots, for which the names are m.1$medoids, m.2$medoids, ..., m.25$medoids. I want to index the object number (1:25) as below (just to show concept). for (i in 1:25){ plot(m.i$medoids) } I've tried the following, with negative results for ...
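For the looping question above, one common base-R approach is to build each object name with paste0() and look it up with get(). The following is a minimal sketch, not the thread's accepted answer: the three toy objects stand in for the poster's m.1, ..., m.25 (hypothetical data), and the plot() call is left as a comment so the snippet runs without a graphics device.

```r
# Toy stand-ins for the poster's objects m.1, m.2, ... (hypothetical data).
set.seed(42)
for (i in 1:3) {
  assign(paste0("m.", i), list(medoids = matrix(rnorm(4), ncol = 2)))
}

# get() looks an object up by its name given as a character string.
medoid_dims <- list()
for (i in 1:3) {
  obj <- get(paste0("m.", i))
  # plot(obj$medoids) would go here in an interactive session
  medoid_dims[[i]] <- dim(obj$medoids)
}

# Often simpler: gather the fits into one named list and index that instead.
fits <- mget(paste0("m.", 1:3))
names(fits)
```

Keeping the fits in a list from the start (for example via lapply) usually avoids the need for get() and assign() altogether.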
2015 Apr 29
2
cantidad de datos (amount of data)
Hello. Instead of using cluster analyses that involve distances, I would try k-means or pam (partitioning around medoids), but using samples; the clara function in the cluster library can help you. I am pasting the Details section of the 'clara' help page: Details clara is fully described in chapter 3 of Kaufman and Rousseeuw (1990). Compared to other partitioning methods such as pam,
2011 May 16
1
pam() clustering for large data sets
Hello everyone, I need to do k-medoids clustering for data which consists of 50,000 observations. I have computed distances between the observations separately and tried to use those with pam(). I got the "cannot allocate vector of length" error and I realize this job is too memory intensive. I am at a bit of a loss on what to do at this point. I can't use clara(), because I
2010 Dec 27
1
Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)
Hello all, I'm now working with hclust objects and was hoping to perform some basic editing on them like: - Joining = the merging of two hclust objects (so they will share one root) - Splicing = So to cut/extract a branch out of an hclust object - that by itself will be an hclust object. I noticed I could extract one element of an hclust object by turning it into a dendrogram,
2015 Apr 29
2
cantidad de datos (amount of data)
The drawback of k-means is that the number of segments has to be defined in advance, and that is something I don't have. Javier's solution seems to me to be the only option. Regards, Ricardo Alva Valiente -----Original message----- From: R-help-es [mailto:r-help-es-bounces en r-project.org] On behalf of javier.ruben.marcuzzi en gmail.com Sent: Wednesday, 29 April
2004 Jun 29
1
give PAM my own medoids
Hello, When using PAM (partitioning around medoids), I would like to skip the build step and give the function my own medoids. Do you know if it is possible, and how? Thank you very much. Isabel
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it only supports two distance metrics: Euclidean and Manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or if there was an API for
2005 Jun 07
1
Specifying medoids in PAM?
I am using the PAM algorithm in the CLUSTER library. When I allow PAM to seed the medoids using the default __build__ algorithm things work well: > pam(stats.table, metric="euclidean", stand=TRUE, k=5) But I have some clusters from a hierarchical analysis that I would like to use as seeds for the PAM algorithm. I can't figure out what the medoids argument wants. When I put in the
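In current versions of the cluster package, the medoids argument of pam() takes a vector of k observation indices (row numbers), not coordinates or cluster labels. The sketch below illustrates one way to turn hierarchical clusters into such seeds; the toy data and the rule for picking a representative per group are assumptions made for illustration, not something taken from the thread.

```r
library(cluster)  # recommended package shipped with standard R installations

set.seed(1)
# Toy data: three loose groups in two dimensions (hypothetical).
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 4), ncol = 2),
           matrix(rnorm(40, mean = 8), ncol = 2))

# Seeds from a hierarchical clustering: cut the tree into k groups, then pick
# in each group the observation with the smallest total distance to the rest.
k <- 3
d <- dist(x)
groups <- cutree(hclust(d), k = k)
dm <- as.matrix(d)
seeds <- sapply(seq_len(k), function(g) {
  idx <- which(groups == g)
  idx[which.min(rowSums(dm[idx, idx, drop = FALSE]))]
})

# `medoids` takes k row indices; they replace the build phase as the start.
fit <- pam(x, k = k, medoids = seeds)
fit$id.med  # indices of the final medoids
```

By default the swap phase still runs from these seeds, so the final medoids in fit$id.med may differ from the ones supplied.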
2015 Apr 29
2
cantidad de datos (amount of data)
Good contribution, excellent!! Regards, Ricardo Alva Valiente From: Jose Luis Cañadas Reche [mailto:canadasreche en gmail.com] Sent: Wednesday, 29 April 2015 12:51 PM To: Alva Valiente, Ricardo (RIAV); 'javier.ruben.marcuzzi en gmail.com'; R-help-es en r-project.org Subject: Re: [R-es] cantidad de datos You could run several k-means with different numbers of clusters and check how
2005 May 30
2
How to access to sum of dissimilarities in CLARA
Dear all, since dissimilarity is one of the quality measures in clustering, I'm trying to access the sum of dissimilarities as an overall measure. But after running my data through CLARA I obtain: 1128 dissimilarities, summarized : Min. 1st Qu. Median Mean 3rd Qu. Max. 0.033155 0.934630 2.257000 2.941600 4.876600 8.943700 But I cannot find the sum of dissimilarities. How can I
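One way to obtain a sum of dissimilarities without relying on any particular summary component is to compute it directly from the fitted medoids and cluster assignments. This is a minimal sketch under stated assumptions: toy data, the default Euclidean metric, and the assumption that row i of the medoids component corresponds to cluster label i.

```r
library(cluster)

set.seed(1)
# Toy data: two groups in two dimensions (hypothetical).
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))
fit <- clara(x, k = 2)   # default metric: euclidean

# Euclidean distance from every observation to the medoid of its cluster.
assigned <- fit$medoids[fit$clustering, , drop = FALSE]
d_to_medoid <- sqrt(rowSums((x - assigned)^2))

# Overall and per-cluster sums of dissimilarities to the medoids.
total_diss <- sum(d_to_medoid)
per_cluster <- tapply(d_to_medoid, fit$clustering, sum)
total_diss
per_cluster
```

The fitted object also carries objective and clusinfo components; how they relate to this total is best checked against the package documentation for clara.object.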
2007 Nov 28
2
Clustering
Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is, if I choose for example method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity
2009 Jul 26
1
Is there an R implementation for the "Barnard's exact test" (a substitute for fisher.test) ?
Hello R help members. Today I came across an article on Barnard's exact test (http://www.cytel.com/Papers/twobinomials.pdf), which is supposed to be a more powerful alternative to fisher.test because it doesn't assume that the row and column totals are known in advance. Any pointers to such a function? Thanks, Tal -- ---------------------------------------------- My contact information:
2009 Jul 01
1
Are there any bloggers among us going to useR 2009 ?
(Note: This is an R community question, not a statistical or coding question. Since this is my first time writing such a post, I hope no one will take offence at it.) Hello all, I will be attending useR 2009 next week, and was wondering if any of you are bloggers intending to participate and report on useR 2009? If so, I would love to know your blog's URL so as to follow you.
2009 Feb 21
1
variable/model selection (step/stepAIC) for biglm ?
Hello dear R mailing list members. I have recently become curious about the possibility of applying model selection algorithms (even as simple as AIC) to regressions on large datasets. I searched as best as I could, but couldn't find any reference or wrapper for using step or stepAIC with packages such as biglm. Any ideas or directions on how to implement such a concept? Best, Tal --
2009 Aug 10
3
Bug in "seq" (or a "feature") ?
(I use R 2.9.1 with win XP) If I run this code: seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <= 0.5] I get this output: [1] -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 Why is 0.50 not in the results ? (It seems that it gives a slightly bigger number than 0.5 but I don't understand why it does that) Whereas if I try: seq(-0.1,.9, by = .05)[seq(-0.1,.9, by = .05) <=
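The behaviour described above is a floating-point representation effect: 0.05 and 0.1 have no exact binary representation, so the element that prints as 0.50 can be stored as a value marginally above 0.5. A minimal sketch of two common workarounds follows; the tolerance value is an arbitrary choice made for illustration.

```r
x <- seq(-0.1, 0.9, by = 0.05)

# Exact comparison can drop the element that prints as 0.50.
exact <- x[x <= 0.5]

# Option 1: compare with a small tolerance (1e-8 is an arbitrary choice).
tol <- 1e-8
with_tol <- x[x <= 0.5 + tol]

# Option 2: round to the precision of the step before comparing.
rounded <- round(x, 2)
with_round <- rounded[rounded <= 0.5]

length(with_tol)
length(with_round)
```

For equality tests, isTRUE(all.equal(a, b)) is the usual base-R idiom for comparing doubles up to a tolerance.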
2009 Feb 18
0
Index-G1 error
I am using some functions from package clusterSim to evaluate the best cluster layout. Here is the feature vector I am using to cluster 12 signals: > alpha.vec [1] 0.8540039 0.8558350 0.8006592 0.8066406 0.8322754 0.8991699 0.8212891 [8] 0.8815918 0.9050293 0.9174194 0.8613281 0.8425293 In the following I pasted an excerpt of my program:
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users, When I run the following code, R crashes: require(cclust) x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE) cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans") While this works: cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans") I'm posting this here because I am not sure if it is a bug. I've been searching
2009 Aug 20
4
simple randomization question: How to perform "sample" in chunks
Hello dear R-help group. My task looks simple, but I can't seem to find a "smart" (e.g: non loop) solution to it. Task: I wish to randomize a data.frame by one column, while keeping the inner-order in the second column as is. So for example, let's say I have the following data.frame: xx <-data.frame(a= c(1,2,2,3,3,3,4,4,4,4) , b =