Displaying 20 results from an estimated 10000 matches similar to: "custom metric for dist for use with hclust/kmeans"
2004 May 28
6
distance in the function kmeans
Hi,
I want to know which distance is used in the function kmeans
and whether we can change this distance.
In the function pam, we can pass a distance matrix as an
argument (with the line "pam<-pam(dist(matrixdata),k=7)"), but
we can't do that in the function kmeans; we have to pass the
data matrix directly ...
Thanks in advance,
Nicolas BOUGET
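A minimal sketch of the contrast raised in this question. stats::kmeans minimises within-cluster sums of squares, which amounts to working with Euclidean distances on the raw data and leaves no argument for a distance. pam from the cluster package accepts a dist object built with any metric. The data below is simulated and purely hypothetical, and k = 3 is an arbitrary choice.

```r
library(cluster)  # provides pam()

set.seed(1)
# Hypothetical data: three groups of 20 points in 2 dimensions
matrixdata <- rbind(
  matrix(rnorm(40, mean = 0),  ncol = 2),
  matrix(rnorm(40, mean = 5),  ncol = 2),
  matrix(rnorm(40, mean = 10), ncol = 2)
)

# pam() accepts a dissimilarity object, here built with the Manhattan metric
d <- dist(matrixdata, method = "manhattan")
fit_pam <- pam(d, k = 3)

# kmeans() takes the data matrix itself and offers no distance argument
fit_km <- kmeans(matrixdata, centers = 3, nstart = 10)

# Cross-tabulate the two partitions (cluster labels are arbitrary)
table(pam = fit_pam$clustering, kmeans = fit_km$cluster)
```

If a non-Euclidean metric matters, a medoid-based method such as pam is one option, since it only needs pairwise dissimilarities.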
2006 Aug 07
5
kmeans and incomplete distance matrix concern
Hi there
I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created
i.e:
[
mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2),
dimnames = list(levels(DF$V1), levels(DF$V2)))
mat[cbind(DF$V1, DF$V2)] <- DF$V3
This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering.
My query
2001 Aug 01
2
clustering question ... hclust & kmeans
I am using R 1.3.0 on Windows 2000.
For an experiment, I am wanting to find the most diverse 400 items to
study in a possible 3200 items. Diversity here is based on a few
hundred attributes. For this, I would like to do a clustering analysis
and find 400 clusters (i.e. different from each other in some way
hopefully). From each of these 400 clusters, I will pick a
representative. I expect
2006 Jul 09
2
distance in kmeans algorithm?
Hello.
Is it possible to choose the distance in the kmeans algorithm?
I have m vectors of n components and I want to cluster them using kmeans
algorithm but I want to use the Mahalanobis distance or another distance.
How can I do it in R?
If I use kmeans, I have no option to choose the distance.
Thanks in advance,
Arnau.
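One possible workaround, sketched under a stated assumption: if a single covariance matrix S is used for all clusters, the Mahalanobis distance between two vectors equals the Euclidean distance after the data is multiplied by the inverse Cholesky factor of S. kmeans can then be run on the transformed data. The data and the choice of cov(x) as S are hypothetical; a per-cluster covariance would need a different method.

```r
set.seed(1)
# Hypothetical data: m = 100 vectors of n = 3 correlated components
mix <- matrix(c(2, 1, 0,  0, 1, 0.5,  0, 0, 1), nrow = 3)
x <- matrix(rnorm(300), ncol = 3) %*% mix

S <- cov(x)                 # assumption: one covariance matrix for all clusters
z <- x %*% solve(chol(S))   # transformed data; chol(S) is R with t(R) %*% R == S

# Check on one pair: squared Euclidean distance on z matches
# the squared Mahalanobis distance on x (mahalanobis() returns the square)
d_euc <- sum((z[1, ] - z[2, ])^2)
d_mah <- mahalanobis(x[1, ], center = x[2, ], cov = S)
same <- isTRUE(all.equal(d_euc, d_mah))

# Euclidean k-means on z then corresponds to Mahalanobis k-means on x
fit <- kmeans(z, centers = 3, nstart = 10)
```

The cluster centres in fit$centers live in the transformed space; multiplying them by chol(S) maps them back to the original coordinates.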
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users,
When I run the following code, R crashes:
require(cclust)
x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE)
cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans")
While this works:
cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans")
I'm posting this here because I am not sure if it is a bug.
I've been searching
2001 Mar 13
1
kmeans cluster stability
I'm doing kmeans partitioning on a small (n=26) dataset that has 5
variables. I noticed that if I repeatedly run the same command, the
cluster centers change and the cluster membership changes.
Using RW1022 under Windows NT & Windows 2000
>kmeans(pottery[,1:5], 4, 20)
[...snip]
$size
[1] 7 3 9 7
[...snip]
$size
[1] 7 10 4 5
[...snip]
$size
[1] 6 10 5 5
yields a different
2008 Mar 20
2
How to plot the dendrogram or tree for kmeans ?
Hi,
How to plot the dendrogram or tree for kmeans, like we do for hclust ?
1999 Oct 07
1
[Fwd: Libraries loading, but not really?] - it really IS a problem :-(
kalish at psy.uwa.edu.au wrote:
>
> I'm a newbie at R, and can't get libraries to really work.
> I did this:
> > library(help = mva)
> cancor Canonical Correlations
> cmdscale Classical (Metric) Multidimensional Scaling
> dist Distance Matrix Computation
> hclust Hierarchical Clustering
2006 Apr 03
2
about arguments in "bclust"
Hi All,
Just want to make sure: in function "bclust", do the following arguments
each have only one option?
argument "dist.method" has one option "Euclidian";
argument "hclust.method" has one option "average";
argument "base.method" has one option "kmeans".
Thank you!
2006 Mar 29
6
which function to use to do classification
Dear All,
I have a data set; suppose it is an N*M data matrix. All I want is to classify it into, let's say, 3 classes. Which method(s) do you think is (are) appropriate for this purpose? Any reference will be welcome! Thanks!
Best,
Baoqiang Cao
2004 May 10
3
Colouring hclust() trees
I have a data set with 6 variables and 251 cases.
The people who supplied me with this data set believe that it falls
naturally into three groups, and have given me a rule for determining
group number from these 6 variables.
If I do
scaled.stuff <- scale(stuff, TRUE, c(...the design ranges...))
stuff.dist <- dist(scaled.stuff)
stuff.hc <- hclust(stuff.dist)
2005 Apr 01
4
error in kmeans
I am trying to run kmeans with 10 clusters on a 165 x 165 matrix.
I do not see any errors known to me, but I get this error when running the
script
Error: empty cluster: try a better set of initial centers
the commands are
M <-matrix(scan("R_mutual",n = 165 * 165),165,165,byrow = T)
cl <- kmeans(M,centers=10,20)
len = dim(M)[1]
....
....
I ran the same script last night and
2004 Oct 11
1
plot hclust - canberra dist + median linkage
Gives strange results.
I get 'weird' dendrograms with canberra / binary distance metric and
median / centroid cluster methods.
Is this just my data?
Dan
2013 Oct 08
1
R function for Bisecting K-means algorithm
Hi All,
Can someone please tell me the R function for the Bisecting K-means algorithm? I
have used the kmeans() function but am not getting good results.
Please help.
--
Thanks and Regards,
Vivek Kumar Singh
Research Assistant,
School of Computing,
National University of Singapore
Mobile:(0065) 82721535
2008 May 12
2
k means
Hi the devel list,
I am using k-means with a non-standard distance. As far as I can see, the
function kmeans can use 4 different algorithms, but not
a user-defined distance.
In addition, kmeans is not able to deal with missing values, whereas
there are several solutions that k-means can use to deal with them; one
is using a distance that takes the missing values into account, like a
2001 Apr 25
1
problems with a large data set
Hello,
I have trouble with a data set that comprises 2136 lines of 20 columns.
I would like to do a hierarchical clustering and I tried the following:
ages.hclust <- hclust(dist(ages, method="euclidean"), "ward")
but I get the following error message:
Error: cannot allocate vector of size 17797 Kb
When I try to do the dist() alone first without the hclust(), I get the
2016 Jul 26
3
K MEANS clustering
Hello,
I've been working on the KMeans clustering algorithm recently and for the
past week, I have been stuck on a problem that I'm not able to find a
solution to.
Since we are representing documents as Tf-idf vectors, they are really
sparse vectors (a usual corpus can have around 5000 terms). So it gets
really difficult to represent these sparse vectors in a way that would be
2011 Apr 06
2
Help in kmeans
Hi All,
I was using the following command for performing kmeans for Iris dataset.
Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)
This was giving proper results for me. But in my application we generate
the R commands dynamically, and there was a requirement that column names
be sent to the R commands instead of column indices. Hence, to
incorporate this, I tried using the R
2008 Aug 21
1
[dist] how to analyse a large matrix?
Hi all,
I have a matrix of about 100.000 x 4 that I need to classify using the
Euclidean metric. For that I am using the dist or daisy functions, but I
am afraid that the message: Error in vector("double", length) : vector
size specified is too large, means too many lines.
Can anyone suggest how I should analyse this matrix?
Thanks in advance,
Diogo André Alagador
MNCN,CSIC, Madrid, Spain
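A rough sketch of why the error appears and of one way around it, assuming the matrix has 100000 rows (reading "100.000" with a thousands separator). A full distance object stores n(n-1)/2 doubles, which for this n is about 37 GiB. kmeans works on the data matrix directly and never builds that object. The simulated data and the choice of 3 clusters are hypothetical.

```r
n <- 1e5

# Memory needed by dist(): n * (n - 1) / 2 doubles of 8 bytes each, in GiB
dist_gib <- n * (n - 1) / 2 * 8 / 2^30

set.seed(1)
# Hypothetical data: n rows, 4 columns, three well-separated groups
shift <- sample(c(0, 10, 20), n, replace = TRUE)
x <- matrix(rnorm(n * 4), ncol = 4) + rep(shift, 4)

# kmeans() uses Euclidean distances to the centres only,
# so memory use grows with n rather than with n^2
fit <- kmeans(x, centers = 3, nstart = 3, iter.max = 100)
table(fit$cluster)
```

If a medoid-based method is preferred, clara from the cluster package follows the same idea by working on subsamples rather than on the full distance matrix.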
2003 Jun 05
1
kmeans (again)
Regarding a previous question concerning the kmeans function I've tried the
same example and I also get a strange result (at least according to what is
said in the help of the function kmeans). Apparently, the function is
disregarding the initial cluster centers one gives it. According to the help
of the function:
centers: Either the number of clusters or a set of initial cluster