similar to: Massive clustering job?

2011 Jun 09
1
k-nn hierarchical clustering
Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour clustering)? Best regards,
2003 Aug 11
2
cluster analysis
I'd like to do cluster analysis using Mahalanobis distance. Could you tell me how to do it?
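One possible approach, sketched here under assumptions rather than taken from the thread: Euclidean distances on data whitened with the Cholesky factor of the covariance matrix equal Mahalanobis distances, so the result can be handed to hclust. The iris measurements, average linkage, and k = 3 are illustrative choices only.

```r
# Illustrative data: the four numeric columns of the built-in iris set (an assumption).
x <- as.matrix(iris[, 1:4])

# Whiten the data: with S = t(R) %*% R (Cholesky factor R), Euclidean distances
# between rows of x %*% solve(R) equal Mahalanobis distances under S.
S <- cov(x)
R <- chol(S)
z <- x %*% solve(R)

d_mahal <- dist(z)                          # pairwise Mahalanobis distances
hc <- hclust(d_mahal, method = "average")   # linkage choice is arbitrary here
groups <- cutree(hc, k = 3)                 # k = 3 is an arbitrary cut
```

The built-in mahalanobis() function returns squared distances from a single center, which is why the whitening route is used here to obtain the full pairwise matrix.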
2003 May 07
1
k-means, hybrid clustering or similar implementations in R
Hi, I would like to know if someone knows of an extended implementation of k-means in R that finds an appropriate number of clusters for given k-dimensional data. Also, I am working on clustering for forecasting; if someone is interested or knows the implementation details, please mail me, I would appreciate it. Regards Skanda Kallur "Cogito, ergo sum" (I think, therefore I
2009 Jun 11
1
Cluster analysis, defining center seeds or number of clusters
I use kmeans to classify spectral events in high and low 1/3 octave bands:
#Do cluster analysis
CyclA<-data.frame(LlowA,LhghA)
CntrA<-matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA<-kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")
This works well when the actual data shows 1, 2 or 3 groups that are not "too close" in a cross plot.
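For readers who want to try the seeded call above, here is a minimal self-contained sketch. The data are synthetic stand-ins (an assumption, not the poster's measurements), and nstart is omitted in this sketch because the starting centers are fixed.

```r
set.seed(1)  # synthetic data below is an assumption, not the poster's measurements

# Three hypothetical groups in the (low band, high band) plane.
LlowA <- c(rnorm(30, 0.90, 0.02), rnorm(30, 0.80, 0.02), rnorm(30, 0.65, 0.02))
LhghA <- c(rnorm(30, 0.80, 0.02), rnorm(30, 0.75, 0.02), rnorm(30, 0.65, 0.02))
CyclA <- data.frame(LlowA, LhghA)

# Explicit starting centers, one row per cluster, as in the post.
CntrA <- matrix(c(0.90, 0.80,
                  0.80, 0.75,
                  0.65, 0.65), nrow = 3, ncol = 2, byrow = TRUE)

ClstA <- kmeans(CyclA, centers = CntrA, algorithm = "MacQueen")

table(ClstA$cluster)   # cluster sizes
ClstA$centers          # fitted centers
```

Because kmeans always returns as many clusters as there are rows in the centers matrix, the number of groups actually present in the data has to be judged separately.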
2004 Dec 09
1
more clustering questions
Sorry to bother you kind folks again with my questions. I am trying to learn as much as I can about all this, and I will admit that I don't have the proper background, but I hope that someone can at least point me in the correct direction. I have created a test matrix for what I want to do:
     s1  s2  s3  s4  s5
s1   10   5   0   8   7
s2    5  10   0   0   5
s3    0   0  10   0   0
s4    8   0   0  10   0
s5    7
2003 Dec 03
3
non-uniqueness in cluster analysis
Hi, I'm clustering objects defined by categorical variables with a hierarchical algorithm (average linkage). My distance matrix (general dissimilarity coefficient) includes several distances with exactly the same value. As far as I can see, a standard agglomerative procedure ignores this problem, simply selecting, among equal distances, the one that comes first. For this reason the analysis in output
2011 May 16
1
pam() clustering for large data sets
Hello everyone, I need to do k-medoids clustering for data which consists of 50,000 observations. I have computed distances between the observations separately and tried to use those with pam(). I got the "cannot allocate vector of length" error and I realize this job is too memory intensive. I am at a bit of a loss on what to do at this point. I can't use clara(), because I
2004 Oct 13
3
data(eurodist) and PCA ??
If I perform PCA on the 'eurodist' data, should I get an accurate geographic layout of the cities with biplot? (barring inversions, i.e. there is no way to define north.. but you get the idea...) I have a complex distance matrix, and I am thinking about how to cluster it and how to visualize the quality of the resulting clusters. If I could 'see' the clusters in space I could
2004 Oct 21
5
Cluster Analysis: Density-Based Method
Hi people, Does anybody know of a density-based clustering method implemented in R? Thanks, Fernando Prass
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it only supports two distance metrics, Euclidean and Manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or if there was an API for
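As a rough illustration of the cosine-distance idea, and not an extension of clara itself, a cosine dissimilarity can be computed by hand and passed to a function that accepts a precomputed dist object. The sketch below swaps in base R's hclust for that purpose; the term-count matrix is hypothetical toy data.

```r
# Toy term-count matrix: rows are documents, columns are terms (hypothetical data).
docs <- rbind(d1 = c(2, 0, 1, 0),
              d2 = c(4, 0, 2, 0),
              d3 = c(0, 3, 0, 1),
              d4 = c(0, 1, 0, 1))

# Scale each row to unit length; the cross product is then cosine similarity.
unit <- docs / sqrt(rowSums(docs^2))

# Cosine distance = 1 - similarity; pmax() clips tiny negative rounding errors.
cos_dist <- as.dist(pmax(1 - tcrossprod(unit), 0))

hc <- hclust(cos_dist, method = "average")   # linkage choice is arbitrary here
clusters <- cutree(hc, k = 2)
```

The same dist object could also be given to pam() from the cluster package, which accepts dissimilarities, although a full pairwise matrix will not scale to very large document collections.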
2008 Dec 17
1
bug (?!) in "pam()" clustering from fpc package ?
Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps starting the algorithm from the same starting points (medoids). For my
2003 Apr 24
1
estimating number of clusters ("Null or more")
Hi all, once more about the old subject :-) My data come from too many different distribution families, and for every particular experiment I just need to decide whether the data is "quite homogeneous" or has two or more clusters. I've revisited the following libraries: amap, clust, cclust, mclust, multiv, normix, survey. And I didn't find any ready-to-use general
2004 May 04
1
spdep question
Dear list, (also sent to Roger Bivand, but perhaps somebody of you can help me also) I am trying to use package spdep for fitting an SAR model with errorsarlm. However, I am not sure how to make a valid nb object out of my neighborhood. As far as I have seen, there is no documentation for nb.object. I have done the following: class(pschmid$nb) <- "nb" # pschmid is a prab object as
2005 Aug 08
2
selecting outliers
Hi everybody, I'd like to know if there's an easy way to extract outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
2003 Jan 30
2
Validation of clustering
Hi, I'm using the cluster library to cluster a set of figures (method CLARA). Could somebody who works with clustering tell me how to evaluate the clustering? Thanks very much, Francisco. Francisco Júnior, Computer Science - UFPE-Brazil "One life has more value than the whole world"
2003 Apr 23
1
clustering
Dear R-users, I have a two-dimensional data set which needs to be clustered into groups: I'm searching for groups of points which show a positive correlation (in a two-dimensional plot of the data set), but I do not have any knowledge about how many groups there might be. Do you know of a clustering algorithm in R (or in general) which can use a-priori information about the cluster's
2005 Mar 04
2
Clustering of Binary data in R
Good afternoon! I would like to ask you about similarity measures and clustering in R for binary data. Would you please kindly help me and let me know about the relevant commands in R? Thanks in advance for your kind attention. I look forward to hearing from you as soon as possible. Best regards, Sima
2008 Mar 12
4
Distances between two datasets of x and y co-ordinates
Hi all I am trying to determine the distances between two datasets of x and y points. The number of points in dataset One is very small i.e. perhaps 5-10. The number of points in dataset Two is likely to be very large i.e. 20,000-30,000. My initial approach was to append the first dataset to the second and then carry out the calculation: dists <- as.matrix(dist(gis data from 2 * datasets))
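One way to avoid building the full distance matrix over the combined data is to compute only the cross distances between the two sets. The sketch below assumes plain Euclidean distance on x and y and uses synthetic points; the names one, two, and cross_dist are hypothetical.

```r
set.seed(42)  # both point sets are synthetic stand-ins (an assumption)

one <- cbind(x = runif(5),     y = runif(5))      # small dataset
two <- cbind(x = runif(20000), y = runif(20000))  # large dataset

# 5 x 20000 matrix: entry [i, j] is the Euclidean distance from
# point i of `one` to point j of `two`.
cross_dist <- sqrt(outer(one[, "x"], two[, "x"], "-")^2 +
                   outer(one[, "y"], two[, "y"], "-")^2)

# For each point in `one`: index of, and distance to, its nearest point in `two`.
nearest_idx  <- apply(cross_dist, 1, which.min)
nearest_dist <- cross_dist[cbind(seq_len(nrow(one)), nearest_idx)]
```

This stores one number per pair across the two sets, rather than the roughly n^2 / 2 pairwise distances that dist() would compute on the appended data.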
2008 Jun 13
3
cluster.stats
Dear list, I just tried to use the function cluster.stats in the package fpc. I just have a couple of questions about the syntax: cluster.stats(d,clustering,alt.clustering=NULL, silhouette=TRUE,G2=FALSE,G3=FALSE) 1) Is the distance object (d) an object obtained by applying the function dist() to my own original matrix? 2) Is clustering the vector of cluster assignments produced by one of the many clustering methods?
2003 Jun 17
2
Clustering quality measure
Hi all, I am running a series of experiments where, after manipulating my data, I run several clustering algorithms (agnes, diana and a clustering method of my own) on the data. I wanted to determine which clustering method did the best job, so I defined my own quality measure using two criteria: compactness of the data within the clusters themselves and the amount of separation