thr3ads.net - similar to: "Specifying medoids in PAM?"

Displaying 20 results from an estimated 400 matches similar to: "Specifying medoids in PAM?"

2005 May 04

Calculate median from counts and values

I am tangled with a syntax question. I want to calculate basic statistics for a large dataset provided in weights and values and I can't figure out an elegant way to expand the data. For example here are the counts:

> counts
  n4 n3 n2 n1 p0 p1 p2 p3 p4
1  0  0  0  1  1  3 16 55 24
2  0  0  0  0  2  8 28 47 15
3  1 17 17 13  4  5 12 24  8
...

and the values: > values
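One possible way to expand the data, sketched under assumptions: the three count rows are copied from the post, but the values vector is cut off in the snippet, so `values <- -4:4` below is a hypothetical stand-in guessed from the column names. Each row is expanded with rep() and summarised with median().

```r
# Count rows as shown in the post
counts <- rbind(c(0,  0,  0,  1, 1, 3, 16, 55, 24),
                c(0,  0,  0,  0, 2, 8, 28, 47, 15),
                c(1, 17, 17, 13, 4, 5, 12, 24,  8))
colnames(counts) <- c("n4", "n3", "n2", "n1", "p0", "p1", "p2", "p3", "p4")

# Hypothetical values, one per column (the real ones are truncated in the post)
values <- -4:4

# Repeat each value by its count, then take the median of the expanded vector
row_medians <- apply(counts, 1, function(n) median(rep(values, times = n)))
print(row_medians)
```

For very large counts, expanding with rep() can use a lot of memory; working from the cumulative counts is an alternative.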

give PAM my own medoids

2004 Jun 29

Hello, When using PAM (partitioning around medoids), I would like to skip the build step and give the function my own medoids. Do you know if it is possible, and how? Thank you very much. Isabel
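A minimal sketch of one way to do this, assuming a version of the cluster package whose pam() accepts the `medoids` and `do.swap` arguments (current releases do; this may not have been available when the question was asked). The toy data and the chosen row indices are made up for illustration.

```r
library(cluster)  # pam() is in the recommended 'cluster' package

set.seed(1)
# Hypothetical toy data: two well-separated groups of 10 points each
x <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# Row indices of the observations to start from (illustrative choice);
# supplying them replaces the build phase
init <- c(1, 11)

# The swap phase still runs, so the final medoids may differ from 'init'
fit <- pam(x, k = 2, medoids = init)

# With do.swap = FALSE the supplied medoids are kept as they are
fit_fixed <- pam(x, k = 2, medoids = init, do.swap = FALSE)

print(fit$id.med)
print(fit_fixed$id.med)
```

Whether to keep the swap phase depends on whether the supplied medoids are meant as a starting point or as the final answer.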

Getting individual co-ordinate points in k medoids cluster

2024 Sep 17

Hello I am using k medoids in R to generate sets of clusters for datasets through time. I can plot the individual clusters OK but what I cannot find is a way of pulling out the co-ordinates of the individual points in the cluster diagrams - none of the kmed$... info sets seems to be this. Beneath is an example of a k medoid prog using the built in US arrests dataset - this is not the data I am

passing known medoids to clara() in the cluster package

2006 Apr 10

Greetings, I have had good success using the clara() function to perform a simple cluster analysis on a large dataset (1 million+ records with 9 variables). Since the clara function is a wrapper to pam(), which will accept known medoid data - I am wondering if this too is possible with clara() ... The documentation does not suggest that this is possible. Essentially I am trying to

Looping and Pasting

2008 Feb 22

Hello R-community: Much of the time I want to use loops to look at graphs, etc. For example, I have 25 plots, for which the names are m.1$medoids, m.2$medoids, ..., m.25$medoids. I want to index the object number (1:25) as below (just to show concept). for (i in 1:25){ plot(m.i$medoids) } I've tried the following, with negative results for ...
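One common approach to this kind of indexing is to build the object name with paste0() and look it up with get(). The sketch below uses three hypothetical stand-in objects instead of the poster's 25 fits, and leaves the plot call as a comment so the example runs without a graphics device.

```r
# Hypothetical stand-ins for the poster's objects m.1, m.2, m.3
for (i in 1:3) {
  assign(paste0("m.", i), list(medoids = matrix(runif(4), ncol = 2)))
}

# Build each name as a string, fetch the object, then use its $medoids
for (i in 1:3) {
  m <- get(paste0("m.", i))
  # plot(m$medoids)  # the poster's plotting call would go here
  print(dim(m$medoids))
}

# Alternative: collect the objects into a named list and index that instead
fits <- mget(paste0("m.", 1:3))
print(names(fits))
```

Storing the fits in a list from the start usually avoids the need for get() altogether.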

bug (?!) in "pam()" clustering from fpc package ?

2008 Dec 17

Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps starting the algorithm from the same starting points (medoids). For my

[cluster package question] What is the "sum of the dissimilarities" in the pam command ?

2009 Mar 29

Hello Martin Maechler and All, A simple question (I hope): How can I compute the "sum of the dissimilarities" that appears in the pam command (from the cluster package) ? Is it the "manhattan" distance (such as the one implemented by "dist") ? I am asking since I am running clustering on a dataset. I found 7 medoids with the pam command, and from it I have the

Naive knn question

2009 Jun 29

Dear list, I have two dissimilarity matrices, one for a training data set which I then clustered using PAM. The second is a diss matrix for a validation data set (an independent field sample). I have been trying to use knn to distinguish distances between the validation data set and the 6 medoids of the training data defined by using PAM. I continue to get error messages in regards to either the

Exporting data to a text file

2008 Aug 01

Hi R users, With the clara function I get a data frame (maybe this is not the exact word, I'm new to R) with the following variables:

> names(myclara)
[1] "sample" "medoids" "i.med" "clustering" "objective"
[6] "clusinfo" "diss" "call" "silinfo" "data"

I want to

pam() clustering for large data sets

2011 May 16

Hello everyone, I need to do k-medoids clustering for data which consists of 50,000 observations. I have computed distances between the observations separately and tried to use those with pam(). I got the "cannot allocate vector of length" error and I realize this job is too memory intensive. I am at a bit of a loss on what to do at this point. I can't use clara(), because I

Clustering Large Applications..sort of

2011 Aug 10

Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of

re-vertical conversion of data entries

2010 Oct 25

Dear R user, Can you please help me. How do I convert part of a cluster analysis output under the heading “Clustering vector” as shown below, showing the clusters to which each respondent belongs to:

[1] 1 1 2 2 1 2 1 2 1 1 2 2 1 2 2 2 2 1 1 1 1 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 2
[38] 2 1 1 2 2 2 2 2 1 2 1 2 2 2 2 1 2 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2
[75] 2 1 2 2 2 2 2 2 2 1 1 2

cantidad de datos (amount of data)

2015 Apr 29

Hello. Instead of using cluster analyses that involve distances, I would try k-means or PAM (partitioning around medoids), but using samples; the clara function from the cluster package can help you. I am pasting the Details section of the help for 'clara': Details clara is fully described in chapter 3 of Kaufman and Rousseeuw (1990). Compared to other partitioning methods such as pam,

Cluster analysis, factor variables, large data set

2011 Mar 31

Dear R helpers, I have a large data set with 36 variables and about 50.000 cases. The variables represent labour market status during 36 months, there are 8 different variable values (e.g. Full-time Employment, Student,...) Only cases with at least one change in labour market status are included in the data set. To analyse sub sets of the data, I have used daisy in the cluster-package to create

Index-G1 error

2009 Feb 18

I am using some functions from package clusterSim to evaluate the best clusters layout. Here is the features vector I am using to cluster 12 signals:

> alpha.vec
[1] 0.8540039 0.8558350 0.8006592 0.8066406 0.8322754 0.8991699 0.8212891
[8] 0.8815918 0.9050293 0.9174194 0.8613281 0.8425293

In the following I pasted an excerpt of my program:

cantidad de datos (amount of data)

2015 Apr 29

The drawback of k-means is that the number of segments has to be defined in advance, and that is something I do not have. Javier's solution seems to me to be the only option. Regards, Ricardo Alva Valiente -----Original message----- From: R-help-es [mailto:r-help-es-bounces en r-project.org] On behalf of javier.ruben.marcuzzi en gmail.com Sent: Wednesday, April 29,

Rose diagrams in R?

2001 Nov 23

I am looking for a function (or package) to plot histograms of directional data such as wind direction. I believe these are called rose diagrams. Is there an R script for this? If not, can it be constructed in a function calling primitive graphic calls (lines, circles, boxes or polygons)? The stars function is not quite right. -- David Finlayson Geomorphologist and GIS Specialist NearPRISM -

classification algorithms with distance matrix

2010 Jun 07

Dear all, I have a problem when using some classification functions (Kmeans, PAM, FANNY...) with a distance matrix, and I would like to understand how it proceeds for the positioning of centroids after one execution step. In fact, in the classical formulation of the algorithm, after each step, to re-position the center, it calculates the distance between any elements of the old cluster and its

cantidad de datos (amount of data)

2015 Apr 29

Good contribution, excellent!! Regards, Ricardo Alva Valiente From: Jose Luis Cañadas Reche [mailto:canadasreche en gmail.com] Sent: Wednesday, April 29, 2015 12:51 PM To: Alva Valiente, Ricardo (RIAV); 'javier.ruben.marcuzzi en gmail.com'; R-help-es en r-project.org Subject: Re: [R-es] cantidad de datos You could run several k-means fits with different numbers of clusters and check how

"partitioning cluster function"

2006 Apr 05

Hi All, For the function "bclust" (e1071), the argument "base.method" is explained as "must be the name of a partitioning cluster function returning a list with the same components as the return value of 'kmeans'". In my understanding, there are three partitioning cluster functions in R, which are "clara, pam, fanny". Then I check each of them to
