Displaying 20 results from an estimated 2000 matches similar to: "How-Understand-Kmeans-Cluster!!"
2011 Apr 06
2
Help in kmeans
Hi All,
I was using the following command for performing kmeans for Iris dataset.
Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)
This gave correct results for me. However, our application generates
the R commands dynamically, and there is a requirement that column names
be sent to the R commands instead of column indices. Hence, to
incorporate this, I tried using the R
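A minimal sketch of selecting the clustering columns by name rather than by index. It assumes the built-in iris data frame stands in for dataFrame; the column names, the seed, and the nstart value are illustrative choices, not taken from the post.

```r
# Assumption: the built-in iris data stands in for the poster's dataFrame.
dataFrame <- iris
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

set.seed(1)  # kmeans starts from random centers; fix the seed to reproduce a run

# Index by a character vector of column names instead of c(1, 2, 3, 4)
Kmeans_model <- kmeans(dataFrame[, cols], centers = 3, nstart = 10)
table(Kmeans_model$cluster)
```

Because cols is an ordinary character vector, a dynamically generated command can build it from whatever names the application supplies.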
2003 Jun 05
1
kmeans (again)
Regarding a previous question concerning the kmeans function I've tried the
same example and I also get a strange result (at least according to what is
said in the help of the function kmeans). Apparently, the function is
disregarding the initial cluster centers one gives it. According to the help
of the function:
centers: Either the number of clusters or a set of initial cluster
2013 Mar 13
1
Empty cluster / segfault using vanilla kmeans with version 2.15.2
Hello,
Here is a working reproducible example which crashes R using kmeans or
gives empty clusters using the nstart option with R 2.15.2.
library(cluster)
kmeans(ruspini,4)
kmeans(ruspini,4,nstart=2)
kmeans(ruspini,4,nstart=4)
kmeans(ruspini,4,nstart=10)
?kmeans
Either we always get empty clusters or, after some further commands,
a segfault.
regards,
Detlef Groth
------------
[R] Empty
2004 May 11
1
AW: Probleme with Kmeans...
Sorry, to solve your question I had tried:
data(faithful)
kmeans(faithful[c(1:20),1],10)
Error: empty cluster: try a better set of initial centers
But when I run this a second time it works.
It seems that kmeans has trouble finding good starting points, because the initial points are chosen at random.
With kmeans(data,k,centers=c(...)) the problem can be solved.
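One hedged way to reduce the dependence on a single random start is the nstart argument of kmeans, which repeats the fit from several random sets of initial centers and keeps the best one. The sketch below reuses the faithful example; the seed and nstart = 25 are illustrative assumptions.

```r
data(faithful)
x <- faithful[1:20, 1]

set.seed(42)  # illustrative seed so the run can be repeated

# nstart = 25 is an assumed value: kmeans tries 25 random sets of initial
# centers and keeps the fit with the lowest total within-cluster sum of squares.
fit <- kmeans(x, centers = 10, nstart = 25)
fit$size          # number of points in each of the 10 clusters
fit$tot.withinss  # total within-cluster sum of squares of the best fit
```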
2006 Jul 09
2
distance in kmeans algorithm?
Hello.
Is it possible to choose the distance in the kmeans algorithm?
I have m vectors of n components and I want to cluster them using kmeans
algorithm but I want to use the Mahalanobis distance or another distance.
How can I do it in R?
If I use kmeans, I have no option to choose the distance.
Thanks in advance,
Arnau.
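kmeans itself only works with Euclidean distance, but one workaround can be sketched under a stated assumption: if a single covariance matrix S is used for the whole dataset, transforming the data so that S becomes the identity makes Euclidean distance on the transformed rows equal to Mahalanobis distance on the original rows. The iris data and the choice of three clusters are stand-ins, not from the post.

```r
# Assumption: iris measurements stand in for the m vectors of n components.
X <- as.matrix(iris[, 1:4])

# Whiten with the sample covariance S. If R = chol(S), so t(R) %*% R = S,
# then Euclidean distance between rows of Z = X %*% solve(R) equals the
# Mahalanobis distance (with covariance S) between the same rows of X.
S <- cov(X)
Z <- X %*% solve(chol(S))

set.seed(1)
fit <- kmeans(Z, centers = 3, nstart = 10)
table(fit$cluster)
```

This uses one global covariance for all clusters; it is not a per-cluster Mahalanobis k-means, and other distances would need a different algorithm or package.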
2003 Apr 14
2
kmeans clustering
Hi,
I am using kmeans to cluster a dataset.
I test this example:
> data<-matrix(scan("data100.txt"),100,37,byrow=T)
(my dataset is 100 rows and 37 columns--clustering rows)
> c1<-kmeans(data,3,20)
> c1
$cluster
 [1] 1 1 1 1 1 1 1 3 3 3 1 3 1 3 3 1 1 1 1 3 1 3 3 1 1 1 3 3 1 1 3 1 1 1 1 3 3
[38] 3 1 1 1 3 1 1 1 1 3 3 3 1 1 1 1 1 1 3 1 3 1 1 3 1 1 1 1 3 1 1 1 1 1 1 3
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users,
When I run the following code, R crashes:
require(cclust)
x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE)
cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans")
While this works:
cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans")
I'm posting this here because I am not sure if it is a bug.
I've been searching
2012 Feb 27
2
kmeans: how to retrieve clusters
Hello,
I'd like to classify data with kmeans algorithm. In my case, I should get 2
clusters in output. Here is my data
colCandInd colCandMed
1 82 2950.5
2 83 1831.5
3 1192 2899.0
4 1193 2103.5
The first cluster is the first two lines;
the second cluster is the last two lines.
Here is the code:
x = colCandList$colCandInd
y = colCandList$colCandMed
m = matrix(c(x, y),
2009 Jul 20
2
kmeans.big.matrix
Hi,
I'm playing around with the 'bigmemory' package, and I have finally
managed to create some really big matrices. However, only now I
realize that there may not be functions made for what I want to do
with the matrices...
I would like to perform a cluster analysis based on a big.matrix.
Googling around I have found indications that a certain
kmeans.big.matrix() function should
2003 Jun 06
1
Kmeans again
Dear helpers
I'm sorry to insist but I still think there is something wrong with the function kmeans. For instance, let's try the same small example:
> dados<-matrix(c(-1,0,2,2.5,7,9,0,3,0,6,1,4),6,2)
I will choose observations 3 and 4 for initial centers and just one iteration. The results are
> A<-kmeans(dados,dados[c(3,4),],1)
> A
$cluster
[1] 1 1 1 1 2 2
$centers
2005 Jun 14
1
KMEANS output...
Using R 2.1.0 on Windows
2 questions:
1. Is there a way to parse the output from kmeans within R?
2. If the answer to 1. is convoluted or impossible, how do you save the
output from kmeans in a plain text file for further processing outside R?
Example:
> ktx<-kmeans(x,12, nstart = 200)
I would like to parse ktx within R to extract cluster sizes, sum-of-squares
values, etc., OR save ktx in
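A short sketch of both options. The object returned by kmeans is a list, so its components can be extracted by name and written to a text file. The data matrix x, the smaller k and nstart, and the output path are assumptions made to keep the example self-contained.

```r
# Assumption: x is a stand-in numeric matrix; k and nstart are smaller
# than in the post to keep the sketch quick.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
ktx <- kmeans(x, 4, nstart = 20)

# kmeans returns a list; components can be pulled out by name
ktx$size       # cluster sizes
ktx$withinss   # within-cluster sum of squares, one value per cluster
ktx$centers    # matrix of cluster centers

# Write a per-cluster summary to a plain text file
summary_df <- data.frame(cluster  = seq_along(ktx$size),
                         size     = ktx$size,
                         withinss = ktx$withinss)
out_file <- tempfile(fileext = ".txt")  # hypothetical output path
write.table(summary_df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

str(ktx) lists every component available for extraction.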
2010 May 05
2
custom metric for dist for use with hclust/kmeans
Hi guys,
I've been using the kmeans and hclust functions for some time now and
was wondering if I could specify a custom metric when passing my data
frame into hclust as a distance matrix. Actually, kmeans doesn't even
take a distance matrix; it takes the data frame directly. I was
wondering if there's a way or if there's a package that lets you
create distance matrices from
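For hclust, one possible approach is to fill a square matrix with a user-defined metric and convert it with as.dist. In this sketch my_metric is a hypothetical function (plain Manhattan distance, chosen so the result can be checked against dist), and the iris rows and k = 3 are illustrative.

```r
# Assumption: the first 20 iris rows stand in for the poster's data frame.
X <- as.matrix(iris[1:20, 1:4])

# Hypothetical custom metric; any function of two numeric vectors would do.
my_metric <- function(a, b) sum(abs(a - b))

# Fill the full pairwise matrix, then convert it to a "dist" object
n <- nrow(X)
D <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- my_metric(X[i, ], X[j, ])
  }
}
d <- as.dist(D)

hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 3)  # cluster label for each row
```

The double loop is quadratic in the number of rows, so it only suits small datasets.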
2003 Jun 03
1
kmeans
Dear helpers
I was working with kmeans from package mva and found some strange situations. When I run the kmeans algorithm several times with the same dataset I get the same partition. I simulated a little example with 6 observations and ran kmeans giving the centers and making just one iteration. I expected that the algorithm would just allocate the observations to the nearest center but think this
2001 Mar 13
1
kmeans cluster stability
I'm doing kmeans partitioning on a small (n=26) dataset that has 5
variables. I noticed that if I repeatedly run the same command, the
cluster centers change and the cluster membership changes.
Using RW1022 under Windows NT & Windows 2000
>kmeans(pottery[,1:5], 4, 20)
[...snip]
$size
[1] 7 3 9 7
[...snip]
$size
[1] 7 10 4 5
[...snip]
$size
[1] 6 10 5 5
yields a different
2005 Mar 31
2
Using kmeans given cluster centroids and data with NAs
Hello,
I have used the functions agnes and cutree to cluster my data (4977
objects x 22 variables) into 8 clusters. I would like to refine the
solution using a k-means or similar algorithm, setting the initial
cluster centres as the group means from agnes. However my data matrix
has NA's in it and the function kmeans does not appear to accept this?
> dim(centres)
[1] 8 22
> dim(data)
2003 Nov 10
1
kmeans error (bug?)
Hello,
I have been getting the following intermittent error from kmeans:
>str(cavint.p.r)
num [1:1967, 1:13] 0.691 0.123 0.388 0.268 0.485 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:1967] "6" "49" "87" "102" ...
..$ : chr [1:13] "HYD" "NEG" "POS" "OXY" ...
> set.seed(34)
>
2016 Aug 17
2
KMeans - Evaluation Results
I've gone through the link that you sent me and I currently understand how
this helps and works to some extent, but I am not too sure of how I should
start with converting the current interface to PIMPL design. I'm not used
to this design pattern so it's taking some time to sink in :)
Say I start with the Clusterer class, I create a ClustererImpl class which
is the internal class that
2010 Feb 24
1
Sparse KMeans/KDE/Nearest Neighbors?
hi,
I have a dataset (the netflix dataset) which is basically ~18k columns and
a variable number of rows, but let's assume 25 thousand for now. The
dataset is very sparse. I was wondering how to do kmeans/nearest neighbors
or kernel density estimation on it.
I tried using the spMatrix function in the "Matrix" package. I think I'm able to
create the matrix but as soon as I pass
2006 Aug 07
5
kmeans and incomplete distance matrix concern
Hi there
I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created
i.e:
mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2),
dimnames = list(levels(DF$V1), levels(DF$V2)))
mat[cbind(DF$V1, DF$V2)] <- DF$V3
This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering.
My query
2008 Jul 03
1
Optimal initial centroid in kmeans
Hello there. I am using kmeans from the base package to cluster my customers. As
the results of kmeans depend on the initial centroids, may I know:
1) how can we specify the centroids in the R function? (I don't want a random
starting point)
2) how to determine the optimal (if not, a good) centroid to start with? (I
am not after the fixed seed solution as it only ensures that the
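On the first question, a minimal sketch: the centers argument of kmeans accepts a matrix of starting centroids (one row per cluster) instead of a cluster count. The iris data and the three rows picked as starting points are arbitrary assumptions for illustration.

```r
# Assumption: iris measurements stand in for the customer data.
X <- as.matrix(iris[, 1:4])

# Arbitrary picks: rows 1, 51 and 101 serve as the starting centroids.
init <- X[c(1, 51, 101), ]

# With explicit centers there is no random start, so repeated runs agree
fit  <- kmeans(X, centers = init)
fit2 <- kmeans(X, centers = init)
identical(fit$cluster, fit2$cluster)
```

The rows of init must be distinct and have the same number of columns as the data.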