thr3ads.net - similar to: "passing known medoids to clara() in the cluster package"

Displaying 20 results from an estimated 1000 matches similar to: "passing known medoids to clara() in the cluster package"

Getting individual co-ordinate points in k medoids cluster

2024 Sep 17

Hello, I am using k-medoids in R to generate sets of clusters for datasets through time. I can plot the individual clusters OK, but I cannot find a way of pulling out the co-ordinates of the individual points in the cluster diagrams; none of the kmed$... components seems to hold them. Below is an example of a k-medoids program using the built-in USArrests dataset - this is not the data I am

Using pam, agnes or clara as prediction models?

2004 Jan 14

Hello list, I am new to R, so if the question is rather silly, please ignore it. I was wondering whether it would be possible to use the models generated by pam, clara and the like as predictors? Scanning through the available documentation shed no light (for me) upon the subject. Regards, Renald

Amount of data (cantidad de datos)

2015 Apr 29

Hello. Instead of using a cluster analysis that involves distances, I would try k-means or pam (partitioning around medoids), but using samples; the clara function in the cluster library can help you. I am pasting the Details section of the help for 'clara': Details clara is fully described in chapter 3 of Kaufman and Rousseeuw (1990). Compared to other partitioning methods such as pam,
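
The sampling approach suggested in this reply can be sketched as follows. This is a minimal illustration, assuming the cluster package is installed; the simulated data, k = 2, and the values passed to the `samples` and `sampsize` arguments of clara() are made up for the example and do not come from the thread.

```r
library(cluster)

# Illustrative data only: two simulated groups of 1000 points each
set.seed(1)
x <- rbind(matrix(rnorm(2000, mean = 0), ncol = 2),
           matrix(rnorm(2000, mean = 5), ncol = 2))

# clara() runs a pam-style search on subsamples instead of the full data:
# here 20 subsamples of 100 observations each (arbitrary values)
fit <- clara(x, k = 2, samples = 20, sampsize = 100)

# Every observation is assigned to a cluster, not only the sampled ones
table(fit$clustering)

# Medoid coordinates of the best subsample
fit$medoids
```

The number of clusters k still has to be chosen up front here, as with k-means.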

pam() clustering for large data sets

2011 May 16

Hello everyone, I need to do k-medoids clustering for data which consists of 50,000 observations. I have computed distances between the observations separately and tried to use those with pam(). I got the "cannot allocate vector of length" error and I realize this job is too memory intensive. I am at a bit of a loss on what to do at this point. I can't use clara(), because I

Amount of data (cantidad de datos)

2015 Apr 29

The drawback of k-means is that the number of segments has to be predefined, and that is something I do not have. Javier's solution seems to me to be the only option. Regards, Ricardo Alva Valiente -----Original message----- From: R-help-es [mailto:r-help-es-bounces en r-project.org] On behalf of javier.ruben.marcuzzi en gmail.com Sent: Wednesday, 29 April

give PAM my own medoids

2004 Jun 29

Hello, When using PAM (partitioning around medoids), I would like to skip the build step and give the function my own medoids. Do you know if it is possible, and how? Thank you very much. Isabel

Specifying medoids in PAM?

2005 Jun 07

I am using the PAM algorithm in the CLUSTER library. When I allow PAM to seed the medoids using the default "build" algorithm, things work well: > pam(stats.table, metric="euclidean", stand=TRUE, k=5) But I have some clusters from a hierarchical analysis that I would like to use as seeds for the PAM algorithm. I can't figure out what the medoids argument wants. When I put in the
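
For the two questions above about seeding PAM, here is a minimal sketch. It assumes the `medoids` argument of pam() in the cluster package takes a length-k vector of observation indices (row numbers), as its help page describes. The USArrests data and the indices 1, 20 and 40 are purely illustrative and are not taken from either post.

```r
library(cluster)

# Illustrative data: the built-in USArrests data set, standardized
x <- scale(USArrests)

# Hypothetical seeds: row indices of the observations to start from.
# In practice these might come from a previous hierarchical analysis.
initial_medoids <- c(1, 20, 40)

# Supplying `medoids` replaces the default build phase; the swap phase
# may still move the medoids away from these starting points
fit <- pam(x, k = 3, medoids = initial_medoids)

fit$id.med             # indices of the final medoids
table(fit$clustering)  # cluster sizes
```

The length of the index vector has to match k.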

[cluster package question] What is the "sum of the dissimilarities" in the pam command ?

2009 Mar 29

Hello Martin Maechler and All, A simple question (I hope): How can I compute the "sum of the dissimilarities" that appears in the pam command (from the cluster package)? Is it the "manhattan" distance (such as the one implemented by "dist")? I am asking since I am running clustering on a dataset. I found 7 medoids with the pam command, and from it I have the

Amount of data (cantidad de datos)

2015 Apr 29

Great contribution, excellent!! Regards, Ricardo Alva Valiente From: Jose Luis Cañadas Reche [mailto:canadasreche en gmail.com] Sent: Wednesday, 29 April 2015 12:51 PM To: Alva Valiente, Ricardo (RIAV); 'javier.ruben.marcuzzi en gmail.com'; R-help-es en r-project.org Subject: Re: [R-es] cantidad de datos You could run several k-means with different numbers of clusters and check how

memory problem

2006 Dec 01

Hi to all, I am frustrated by this error. Today I bought a 1 GB memory module for my laptop; it now has 1.28 GB instead of the old 512, but I get the same error :-( Damn! Damn! What can I do? I repeat: for a small area (about 20x20 km and res=20m) it works fine! Do you have any suggestions? Is there a method to check whether this error depends on my RAM or on something else? Thanks for any suggestion! I need your help.

DICE Coefficient of similarity measure

2010 Apr 24

Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that

Cluster analysis, factor variables, large data set

2011 Mar 31

Dear R helpers, I have a large data set with 36 variables and about 50,000 cases. The variables represent labour market status during 36 months; there are 8 different variable values (e.g. Full-time Employment, Student,...). Only cases with at least one change in labour market status are included in the data set. To analyse subsets of the data, I have used daisy in the cluster package to create

Looping and Pasting

2008 Feb 22

Hello R-community: Much of the time I want to use loops to look at graphs, etc. For example, I have 25 plots, for which the names are m.1$medoids, m.2$medoids, ..., m.25$medoids. I want to index the object number (1:25) as below (just to show concept). for (i in 1:25){ plot(m.i$medoids) } I've tried the following, with negative results for ...

Exporting data to a text file

2008 Aug 01

Hi R users, With the clara function I get a data frame (maybe this is not the exact word, I'm new to R) with the following variables: > names(myclara) [1] "sample" "medoids" "i.med" "clustering" "objective" [6] "clusinfo" "diss" "call" "silinfo" "data" I want to

Enormous Datasets

2004 Nov 18

Dear List, I have some projects where I use enormous datasets. For instance, the 5% PUMS microdata from the Census Bureau. After deleting cases I may have a dataset with 7 million+ rows and 50+ columns. Will R handle a datafile of this size? If so, how? Thank you in advance, Tom Volscho ************************************ Thomas W. Volscho Graduate Student Dept. of Sociology U-2068

Clustering Large Applications..sort of

2011 Aug 10

Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of

classification algorithms with distance matrix

2010 Jun 07

Dear all, I have a problem when using some classification functions (Kmeans, PAM, FANNY...) with a distance matrix, and I would like to understand how the positioning of centroids proceeds after one execution step. In fact, in the classical formulation of the algorithm, after each step, to re-position the center, it calculates the distance between any elements of the old cluster and its

clara - memory limit

2005 Aug 03

Dear all, I'm trying to estimate clusters from a very large dataset using clara but the program stops with a memory error. The (very simple) code and the error: mydata<-read.dbf(file="fnorsel_4px.dbf") my.clara.7k<-clara(mydata,k=7) >Error: cannot allocate vector of size 465108 Kb The dataset contains >3,000,000 rows and 15 columns. I'm using a windows computer

CLARA and determining the right number of clusters

2008 Sep 30

Hi everyone I have a question about clustering. I've managed using CLARA to get a clustering analysis of a large data set. But now I want to find which is the right number of clusters. The clara.object gives some information like the ratio between maximal and minimal dissimilarity that says (maybe if lower than 1??) if a cluster is well-separated from the other. I've also read something

CLARA

2003 Nov 17

I need information about the clara routine. The on-line doc says that the argument stand is a logical, indicating if the measurements in x are standardized before calculating the dissimilarities. Measurements are standardized for each variable (column), by subtracting the variable's mean value and dividing by the variable's mean absolute deviation. If we set STAND = TRUE, I suppose that
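
The standardization quoted from the documentation can be sketched in base R. This is only an illustration of the description above, not clara's internal code; `standardize_mad` is a hypothetical helper name and the toy matrix is made up.

```r
# Toy data: two variables (columns) on different scales
x <- cbind(a = c(1, 2, 3, 4),
           b = c(10, 20, 30, 60))

# Hypothetical helper: per column, subtract the mean, then divide by
# the mean absolute deviation (the mean of |value - column mean|)
standardize_mad <- function(x) {
  centered <- sweep(x, 2, colMeans(x))         # subtract column means
  mean_abs_dev <- colMeans(abs(centered))      # mean absolute deviation
  sweep(centered, 2, mean_abs_dev, "/")        # scale each column
}

z <- standardize_mad(x)

# After standardization, each column has mean 0 and
# mean absolute deviation 1
colMeans(z)
colMeans(abs(z))
```

Note that this divides by the mean absolute deviation, not by the standard deviation that scale() uses by default.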
