Displaying 20 results from an estimated 10000 matches similar to: "Massive clustering job?"
2011 Jun 09
1
k-nn hierarchical clustering
Hi there,
is there any R-function for k-nearest neighbour agglomerative hierarchical
clustering?
By this I mean standard agglomerative hierarchical clustering as in hclust
or agnes, but with the k-nearest neighbour distance between clusters used
on the higher levels where there are at least k>1 distances between two
clusters (single linkage is 1-nearest neighbour clustering)?
Best regards,
2003 Aug 11
2
cluster analysis
I'd like to do cluster analysis using the Mahalanobis distance.
Could you tell me how to do it?
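One possible approach, offered as a sketch rather than a definitive answer: the Mahalanobis distance between two observations equals the Euclidean distance after the data are whitened with the covariance matrix, so a whitened copy of the data can be passed to dist() and hclust(). The iris measurements, the single overall covariance matrix, average linkage and k = 3 are assumptions chosen only for illustration.

```r
# Illustrative data (assumption): four numeric columns of the built-in iris set
X <- as.matrix(iris[, 1:4])
S <- cov(X)                  # covariance of the whole data set
U <- chol(S)                 # upper triangular factor, S = t(U) %*% U
Z <- X %*% solve(U)          # whitened data: Euclidean on Z equals Mahalanobis on X
d_mah <- dist(Z)             # pairwise Mahalanobis distances as a dist object
hc <- hclust(d_mah, method = "average")
groups <- cutree(hc, k = 3)  # k = 3 is an arbitrary choice for the example
```

Note that stats::mahalanobis() returns squared distances, so its output should be compared with the squared entries of d_mah.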
2003 May 07
1
k-means, hybrid clustering or similar implementations on R
Hi,
I would like to know if anyone knows of an extended implementation of k-means in R that finds an appropriate number of clusters for given k-dimensional data.
Also, I am working on clustering for forecasting; if anyone is interested or knows the implementation details, please mail me. I would appreciate it.
Regards
Skanda Kallur
"Cogito, ergo sum" (I think, therefore I
2009 Jun 11
1
Cluster analysis, defining center seeds or number of clusters
I use kmeans to classify spectral events in high and low 1/3 octave bands:
# Do cluster analysis
CyclA <- data.frame(LlowA, LhghA)
# Three starting centers, one per row (columns: low band, high band)
CntrA <- matrix(c(0.9, 0.8, 0.8, 0.75, 0.65, 0.65), nrow = 3, ncol = 2, byrow = TRUE)
ClstA <- kmeans(CyclA, centers = CntrA, nstart = 50, algorithm = "MacQueen")
This works well when the actual data shows 1, 2 or 3 groups that are not
"too close" in a cross plot.
2004 Dec 09
1
more clustering questions
Sorry to bother you kind folks again with my questions. I am trying to
learn as much as I can about all this, and I will admit that I don't
have the proper background, but I hope that someone can at least point
me in the correct direction.
I have created a test matrix for what I want to do:
   s1 s2 s3 s4 s5
s1 10  5  0  8  7
s2  5 10  0  0  5
s3  0  0 10  0  0
s4  8  0  0 10  0
s5  7
2003 Dec 03
3
non-uniqueness in cluster analysis
Hi,
I'm clustering objects defined by categorical variables with a hierarchical
algorithm - average linkage.
My distance matrix (general dissimilarity coefficient) includes several
distances with exactly the same values.
As far as I can see, a standard agglomerative procedure ignores this problem, simply
selecting, among equal distances, the one that comes first.
For this reason the analysis in output
2011 May 16
1
pam() clustering for large data sets
Hello everyone,
I need to do k-medoids clustering for data which consists of 50,000
observations. I have computed distances between the observations
separately and tried to use those with pam().
I got the "cannot allocate vector of length" error and I realize this
job is too memory intensive. I am at a bit of a loss on what to do at
this point.
I can't use clara(), because I
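A rough back-of-the-envelope check shows why the allocation fails. This is only arithmetic on the numbers given in the post, assuming the distances are held as a standard dist object of 8-byte doubles; it ignores any working copies pam() makes internally.

```r
n <- 50000                      # number of observations, from the post
n_pairs <- n * (n - 1) / 2      # entries in the lower triangle of a dist object
bytes <- n_pairs * 8            # each distance is stored as an 8-byte double
gib <- bytes / 2^30
cat(sprintf("%.0f pairwise distances, about %.1f GiB\n", n_pairs, gib))
```

So the distances alone need on the order of 9 GiB before any clustering starts.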
2004 Oct 13
3
data(eurodist) and PCA ??
If I perform PCA on the 'eurodist' data, should I get an accurate
geographic layout of the cities with biplot?
(barring inversions, i.e. there is no way to define north, but you get
the idea...)
I have a complex distance matrix, and I am thinking about how to cluster
it and how to visualize the quality of the resulting clusters.
If I could 'see' the clusters in space I could
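For a layout from distances alone, classical multidimensional scaling is one commonly used option, since PCA expects coordinates rather than a dist object. A minimal sketch with cmdscale() from the stats package follows; because eurodist holds road distances, the recovered map is only approximate, and the orientation is arbitrary, so the sign flip below is just a cosmetic choice.

```r
loc <- cmdscale(eurodist, k = 2)  # classical MDS: one row of 2-D coordinates per city
x <- loc[, 1]
y <- -loc[, 2]                    # flip the second axis so north tends to be at the top
plot(x, y, type = "n", asp = 1, xlab = "", ylab = "")
text(x, y, labels = rownames(loc), cex = 0.7)
```

The same call works on any dist object, so clusters found from a custom distance matrix could be overlaid on the resulting coordinates, for example by colouring the labels.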
2004 Oct 21
5
Cluster Analysis: Density-Based Method
Hi people,
Does anybody know some Density-Based Method for clustering implemented in R?
Thanks,
Fernando Prass
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R
has clustering algorithms such as clara, but it supports only two distance
metrics, Euclidean and Manhattan, which are not very useful for
clustering documents. I was wondering how easy it would be to extend the
clustering package in R to support other distance metrics, such as
cosine distance, or if there was an API for
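One workaround that needs no extension of the package, sketched here under assumptions: hclust() accepts a precomputed dissimilarity, so the cosine distance can be computed by hand and passed in as a dist object. The small term-count matrix, the average linkage and k = 2 are hypothetical and serve only as illustration.

```r
# Hypothetical term-count matrix: 4 documents (rows) x 5 terms (columns)
tf <- matrix(c(2, 1, 0, 0, 0,
               3, 2, 0, 0, 1,
               0, 0, 4, 1, 0,
               0, 0, 2, 3, 0),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("doc", 1:4), paste0("term", 1:5)))
unit <- tf / sqrt(rowSums(tf^2))   # scale every row (document) to unit length
cos_sim <- tcrossprod(unit)        # cosine similarity between all document pairs
cos_dist <- as.dist(1 - cos_sim)   # cosine distance as a dist object
hc <- hclust(cos_dist, method = "average")
groups <- cutree(hc, k = 2)        # k = 2 is an arbitrary choice for the example
```

pam() in the cluster package also accepts a dist object, whereas clara() works from the raw data, so this route does not carry over to clara() directly.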
2008 Dec 17
1
bug (?!) in "pam()" clustering from fpc package ?
Hello all.
I wish to run k-means with "manhattan" distance.
Since this is not supported by the function "kmeans", I turned to the "pam"
function in the "fpc" package.
Yet, when I tried to have the algorithm run with different starting points,
I found that pam ignores them and keeps starting the algorithm from the same
starting points (medoids).
For my
2003 Apr 24
1
estimating number of clusters ("Null or more")
Hi all,
once more about the old subj :-)
My data come from too many different distribution families, and for every
particular experiment I just need to decide whether the data are "quite
homogeneous" or have two or more clusters. I've revisited the following libraries:
amap, clust, cclust, mclust, multiv, normix, survey.
And I didn't find any ready-to-use general
2004 May 04
1
spdep question
Dear list,
(also sent to Roger Bivand, but perhaps somebody of you can help me also)
I am trying to use package spdep for fitting an SAR model with errorsarlm.
However, I am not sure how to make a valid nb object out of my
neighborhood. As far as I have seen, there is no documentation for
nb.object.
I have done the following:
class(pschmid$nb) <- "nb"
# pschmid is a prab object as
2005 Aug 08
2
selecting outliers
Hi everybody,
I'd like to know if there's an easy way of extracting
outlier records from a dataset, in order to perform
further analysis on them.
Thanks
Alessandro
2003 Jan 30
2
Validation of clustering
Hi,
I'm using the cluster library to cluster a set of figures (CLARA method).
Could somebody who works with clustering tell me what I should do to
evaluate the clustering?
Tks VM,
Francisco.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Francisco Júnior,
Computer Science - UFPE-Brazil
"One life has more value that the
world whole"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2003 Apr 23
1
clustering
Dear R-users,
I have a two-dimensional data set which needs to be clustered into
groups:
I'm searching for groups of points which show a positive
correlation (in a two-dimensional plot of the data set), but I do not have
any knowledge about how many groups there might be.
Do you know of a clustering algorithm in R (or
in general) which can use a-priori information about the cluster's
2005 Mar 04
2
Clustering of Binary data in R
Good afternoon!
I would like to ask you about similarity measures and clustering in R for binary data.
Would you please kindly help me and let me know the relevant commands in R?
Thanks in advance for your kind attention.
I look forward to hearing from you as soon as possible.
Best regards,
Sima
2008 Mar 12
4
Distances between two datasets of x and y co-ordinates
Hi all
I am trying to determine the distances between two datasets of x and y
points. The number of points in dataset One is very small i.e. perhaps
5-10. The number of points in dataset Two is likely to be very large
i.e. 20,000-30,000. My initial approach was to append the first dataset
to the second and then carry out the calculation:
dists <- as.matrix(dist(gis data from 2 * datasets))
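Appending the two sets and calling dist() also computes every distance within the large set, which is most of the work and memory. A sketch that computes only the cross distances is shown below; the coordinates are simulated and the names small, large and cross are made up for the example.

```r
set.seed(1)                                         # simulated coordinates (assumption)
small <- cbind(x = runif(8),     y = runif(8))      # dataset One: a handful of points
large <- cbind(x = runif(25000), y = runif(25000))  # dataset Two: many points
# Rows index the large set, columns index the small set
dx <- outer(large[, "x"], small[, "x"], "-")
dy <- outer(large[, "y"], small[, "y"], "-")
cross <- sqrt(dx^2 + dy^2)                          # 25000 x 8 matrix of Euclidean distances
nearest <- apply(cross, 1, which.min)               # closest small-set point for each large-set point
```

The result has only as many entries as there are pairs across the two sets, so it stays small even when dataset Two grows.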
2008 Jun 13
3
cluster.stats
Dear list,
I just tried to use the function cluster.stats in the package fpc.
I just have a couple of questions about the syntax:
cluster.stats(d,clustering,alt.clustering=NULL,
silhouette=TRUE,G2=FALSE,G3=FALSE)
1) Is the distance object (d) obtained by applying the function dist() to
my own original matrix?
2) Is clustering the vector of cluster labels resulting from one of the many
clustering methods?
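A small sketch of how the two arguments could be built, under assumptions: the iris columns, average-linkage hclust() and k = 3 are placeholders, and the fpc call is guarded because the package may not be installed.

```r
X <- scale(as.matrix(iris[, 1:4]))   # illustrative data (assumption)
d <- dist(X)                         # 1) distance object built from the original matrix
hc <- hclust(d, method = "average")
clustering <- cutree(hc, k = 3)      # 2) integer vector of cluster labels, one per row
if (requireNamespace("fpc", quietly = TRUE)) {
  st <- fpc::cluster.stats(d, clustering)
  print(st$avg.silwidth)             # average silhouette width of this clustering
}
```

Any method that yields one integer label per observation should fit the second argument in the same way.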
2003 Jun 17
2
Clustering quality measure
Hi all,
I am running a series of experiments where after manipulating my data I
run several clustering algorithms (agnes, diana and a clustering method
of my own) on the data. I wanted to determine which clustering method
did the best job, so I defined my own quality measure
using two criteria: compactness of the data within the clusters
themselves and the amount of separation