similar to: Using pam, agnes or clara as prediction models?

Displaying 20 results from an estimated 2000 matches similar to: "Using pam, agnes or clara as prediction models?"

2006 Apr 10
passing known medoids to clara() in the cluster package
Greetings, I have had good success using the clara() function to perform a simple cluster analysis on a large dataset (1 million+ records with 9 variables). Since the clara function is a wrapper to pam(), which will accept known medoid data - I am wondering if this too is possible with clara() ... The documentation does not suggest that this is possible. Essentially I am trying to
2006 Jan 26
cluster analysis: "error in vector("double", length): given vector size is too big {Fehler in vector("double", length) : angegebene Vektorgröße ist zu groß}
Dear R Specialists, when trying to cluster a data.frame with about 80,000 rows and 25 columns I get the above error message. I tried hclust (using dist), agnes (entering the data.frame directly) and pam (entering the data.frame directly). What I actually do not want to do is generate a random sample from the data. The machine I run R on is a Windows 2000 Server (Pentium 4) with 2 GB of
2008 Aug 01
Exporting data to a text file
Hi R users, with the clara function I get a data frame (maybe this is not the exact word, I'm new to R) with the following variables: > names(myclara) [1] "sample" "medoids" "" "clustering" "objective" [6] "clusinfo" "diss" "call" "silinfo" "data" I want to
2011 Aug 10
Clustering Large Applications..sort of
Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of
2011 May 16
pam() clustering for large data sets
Hello everyone, I need to do k-medoids clustering for data which consists of 50,000 observations. I have computed distances between the observations separately and tried to use those with pam(). I got the "cannot allocate vector of length" error and I realize this job is too memory intensive. I am at a bit of a loss on what to do at this point. I can't use clara(), because I
2011 Jun 27
New to R, trying to use agnes, but can't load my distance matrix
Hi, I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a distance matrix I created from the data on my own and called it D10.dist. I loaded the cluster package. Then tried the following command... > agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, > method = "average", par.method, keep.diss = n < 1000,
2011 Jan 27
agnes clustering and NAs
Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : > m <- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) > agnes(m) Call: agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. Median Mean 3rd
2008 Feb 22
Looping and Pasting
Hello R-community: Much of the time I want to use loops to look at graphs, etc. For example, I have 25 plots, for which the names are m.1$medoids, m.2$medoids, ..., m.25$medoids. I want to index the object number (1:25) as below (just to show concept). for (i in 1:25){ plot(m.i$medoids) } I've tried the following, with negative results for ...
2015 Apr 29
Amount of data
Hello. Instead of using cluster analyses that involve distances, I would try k-means or pam (partition around medoids), but working on samples; the clara function from the cluster library can help you. I am pasting the Details section of the help page for 'clara': Details clara is fully described in chapter 3 of Kaufman and Rousseeuw (1990). Compared to other partitioning methods such as pam,
2011 Mar 31
Cluster analysis, factor variables, large data set
Dear R helpers, I have a large data set with 36 variables and about 50,000 cases. The variables represent labour market status during 36 months; there are 8 different variable values (e.g. Full-time Employment, Student,...). Only cases with at least one change in labour market status are included in the data set. To analyse subsets of the data, I have used daisy in the cluster-package to create
2010 Apr 24
DICE Coefficient of similarity measure
Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that
2015 Apr 29
Amount of data
The drawback of k-means is that the number of segments has to be defined in advance, and that is something I do not have. Javier's solution seems to me to be the only option. Regards, Ricardo Alva Valiente -----Original message----- From: R-help-es [mailto:r-help-es-bounces at] On behalf of javier.ruben.marcuzzi at Sent: Wednesday, April 29,
2008 Dec 17
bug (?!) in "pam()" clustering from fpc package ?
Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps starting the algorithm from the same starting points (medoids). For my
2004 Feb 04
Clustering with 'agnes'
Hello, I had a question regarding clustering using the agnes() function from the 'cluster' package. I was wondering if anyone knew how I can identify cluster points after running the agnes function. For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points and I get the ordering and
2006 Oct 23
Agnes Help
Hi, I'm trying to use the cluster package and I'm having some trouble... I always get the message: > myagnes <- agnes("datafile.dat") > Error: could not find function "agnes" the package cluster is listed in the library() command, and I can reach the help files from Agnes as well I know that this can be some really easy thing to fix, but right now I have no
2011 Aug 31
agnes not working
Hello! I created a distance matrix for 13 objects using daisy (see the attached file). I am trying to run a cluster analysis on it using agnes, but it's not working. What might be the problem: mydistances<-read.csv("Results of daisy.csv") mycluster<-agnes(mydistances, method="ward") I am getting: Error in agnes(mydistances, method = "ward") : NA/NaN/Inf in foreign
2003 Dec 11
cutree with agnes
Hi, this is rather a (presumed) bug report than a question because I can solve my personal statistical problem by working with hclust instead of agnes. I have done a complete linkage clustering on a dist object dm with 30 objects with agnes (R 1.8.0 on RedHat) and I want to obtain the partition that results from a cut at height=0.4. I run > cl1a <- agnes(dm, method="complete")
2010 Nov 16
Plotting an agnes tree with images instead of labels?
Hi, I'd like to plot a tree with images of molecular structures instead of labels (words). I think this is possible because someone who worked in my office before I arrived did this. However I'm not sure if this person made the image manually or plotted it only with R. Thanks in advance for your help.
2009 Jun 22
coloring agnes plots
Hi all, I am creating dendrograms using agnes and was wondering if it is possible to add color to the leaves (and just the leaves). For example, in the documentation, they have an example using the "votes.repub" data set. If I wanted to make the word "Washington" green (and only Washington), is it possible and if so how? I can apply "summary" to the object