thr3ads.net - similar to: "custom metric for dist for use with hclust/kmeans"

Displaying 20 results from an estimated 10000 matches similar to: "custom metric for dist for use with hclust/kmeans"

2004 May 28

distance in the function kmeans

Hi, I want to know which distance is used in the function kmeans and whether we can change it. In the function pam, we can pass a distance matrix as a parameter (with the line "pam<-pam(dist(matrixdata),k=7)"), but we can't do that in the function kmeans; we have to pass the data matrix directly ... Thanks in advance, Nicolas BOUGET
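
A minimal sketch of the contrast described in this post (not taken from the thread): stats::kmeans works on the data matrix and minimises within-cluster sums of squares, i.e. squared Euclidean distance, while cluster::pam accepts a precomputed dissimilarity. The simulated data and the choice of Manhattan distance are assumptions made for illustration.

```r
library(cluster)  # recommended package that provides pam()

set.seed(1)
# Illustrative data: two well-separated groups of 20 points in 2 dimensions
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

# pam() accepts any dissimilarity, here a Manhattan distance matrix
d <- dist(x, method = "manhattan")
fit_pam <- pam(d, k = 2)

# kmeans() takes the data matrix itself and has no distance argument
fit_km <- kmeans(x, centers = 2)

# Cross-tabulate the two partitions
table(pam = fit_pam$clustering, kmeans = fit_km$cluster)
```

When a non-Euclidean distance matters, a medoid-based method such as pam is one commonly used substitute for kmeans.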

2006 Aug 07

kmeans and incomplete distance matrix concern

Hi there, I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created, i.e.:

    mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2), dimnames = list(levels(DF$V1), levels(DF$V2)))
    mat[cbind(DF$V1, DF$V2)] <- DF$V3

This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering. My query

2001 Aug 01

clustering question ... hclust & kmeans

I am using R 1.3.0 on Windows 2000. For an experiment, I am wanting to find the most diverse 400 items to study in a possible 3200 items. Diversity here is based on a few hundred attributes. For this, I would like to do a clustering analysis and find 400 clusters (i.e. different from each other in some way hopefully). From each of these 400 clusters, I will pick a representative. I expect
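
One possible way to pick a representative per cluster, sketched under assumptions: the data below are simulated, the sizes (200 items, 20 clusters) are scaled down from the 3200 and 400 mentioned in the post, and "representative" is taken to mean the medoid, i.e. the member with the smallest total distance to the rest of its cluster. The post does not say which criterion was intended.

```r
set.seed(1)
# Illustrative data: 200 items described by 5 attributes
x <- matrix(rnorm(200 * 5), ncol = 5)

d <- as.matrix(dist(x))  # full distance matrix, kept for medoid lookup
groups <- cutree(hclust(as.dist(d), method = "average"), k = 20)

# For each cluster, take the row index of its medoid
reps <- sapply(split(seq_len(nrow(x)), groups), function(idx) {
  idx[which.min(rowSums(d[idx, idx, drop = FALSE]))]
})

reps  # one row index per cluster
```

cluster::pam(d, k) returns medoids directly and would be an alternative, at a higher computational cost for large k.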

2006 Jul 09

distance in kmeans algorithm?

Hello. Is it possible to choose the distance in the kmeans algorithm? I have m vectors of n components and I want to cluster them using kmeans algorithm but I want to use the Mahalanobis distance or another distance. How can I do it in R? If I use kmeans, I have no option to choose the distance. Thanks in advance, Arnau.
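
A hedged sketch of one workaround for the Mahalanobis case: if a single covariance matrix S is used for all points (an assumption; the post does not say which covariance is meant), the data can be transformed so that ordinary Euclidean distance on the transformed rows equals Mahalanobis distance on the original rows, and kmeans can then be run unchanged. The data here are simulated.

```r
set.seed(1)
# Illustrative correlated data: 100 vectors with 3 components
x <- matrix(rnorm(300), ncol = 3) %*%
  matrix(c(2, 1, 0, 0, 1, 0.5, 0, 0, 1), 3)

S <- cov(x)
R <- chol(S)          # upper triangular factor, S = t(R) %*% R
z <- x %*% solve(R)   # whitened data

# Euclidean kmeans on z corresponds to Mahalanobis distance on x
fit <- kmeans(z, centers = 3, nstart = 10)

# Check the equivalence on one pair of rows (both values are squared distances)
d2_whitened <- sum((z[1, ] - z[2, ])^2)
d2_mahal <- mahalanobis(x[1, ], x[2, ], S)
all.equal(d2_whitened, as.numeric(d2_mahal))
```

This trick only covers distances that reduce to Euclidean distance after a linear transformation; for an arbitrary distance a method that accepts a dissimilarity matrix is needed instead.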

2006 Apr 07

cclust causes R to crash when using manhattan kmeans

Dear R users, When I run the following code, R crashes:

    require(cclust)
    x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE)
    cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans")

While this works:

    cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans")

I'm posting this here because I am not sure if it is a bug. I've been searching

2001 Mar 13

kmeans cluster stability

I'm doing kmeans partitioning on a small (n=26) dataset that has 5 variables. I noticed that if I repeatedly run the same command, the cluster centers change and the cluster membership changes. Using RW1022 under Windows NT & Windows 2000:

    > kmeans(pottery[,1:5], 4, 20)
    [...snip]
    $size
    [1] 7 3 9 7
    [...snip]
    $size
    [1]  7 10  4  5
    [...snip]
    $size
    [1]  6 10  5  5

yields a different
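
The behaviour described here is what one would expect when kmeans is given only a number of clusters, because the starting centers are then chosen at random. A minimal sketch in current R, using simulated data of the same shape as in the post (the pottery data are not available here) and the nstart argument, which may not have existed in the R version used in 2001:

```r
# Illustrative stand-in for the 26 x 5 data set in the post
set.seed(123)
x <- matrix(rnorm(26 * 5), ncol = 5)

# Fixing the random seed makes a run reproducible
set.seed(42)
a <- kmeans(x, centers = 4, iter.max = 20, nstart = 25)
set.seed(42)
b <- kmeans(x, centers = 4, iter.max = 20, nstart = 25)

identical(a$cluster, b$cluster)  # same seed, same partition
```

Using several random starts (nstart) keeps the best of the attempted solutions, which usually reduces run-to-run variation but does not guarantee a global optimum.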

2008 Mar 20

How to plot the dendrogram or tree for kmeans ?

Hi, how can I plot a dendrogram or tree for kmeans, like we do for hclust?
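
kmeans produces a flat partition rather than a hierarchy, so there is no dendrogram of the kmeans result itself. One workaround, sketched here as an assumption about what might be useful rather than an answer from the thread, is to cluster the kmeans centers hierarchically and plot that tree. The iris data and the choice of 6 clusters are illustrative.

```r
set.seed(1)
x <- as.matrix(iris[, 1:4])

# Flat partition into 6 clusters
km <- kmeans(x, centers = 6, nstart = 10)

# Hierarchical clustering of the 6 cluster centers
hc <- hclust(dist(km$centers), method = "average")

# Draw the tree of centers when running interactively
if (interactive()) plot(hc, main = "hclust on kmeans centers")
```

The resulting tree shows how the kmeans clusters relate to each other, not how individual observations merge.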

1999 Oct 07

[Fwd: Libraries loading, but not really?] - it really IS a problem :-(

kalish at psy.uwa.edu.au wrote:
>
> I'm a newbie at R, and can't get libraries to really work.
> I did this:
>
> library(help = mva)
> cancor     Canonical Correlations
> cmdscale   Classical (Metric) Multidimensional Scaling
> dist       Distance Matrix Computation
> hclust     Hierarchical Clustering

2006 Apr 03

about arguments in "bclust"

Hi All, Just want to make sure: in the function "bclust", do the following arguments each have only one option? Argument "dist.method" has one option, "Euclidian"; argument "hclust.method" has one option, "average"; argument "base.method" has one option, "kmeans". Thank you!

2006 Mar 29

which function to use to do classification

Dear All, I have data, say an N*M matrix. All I want is to classify it into, let's say, 3 classes. Which method(s) do you think is (are) appropriate for this purpose? Any reference will be welcome! Thanks! Best, Baoqiang Cao

2004 May 10

Colouring hclust() trees

I have a data set with 6 variables and 251 cases. The people who supplied me with this data set believe that it falls naturally into three groups, and have given me a rule for determining group number from these 6 variables. If I do

    scaled.stuff <- scale(stuff, TRUE, c(...the design ranges...))
    stuff.dist <- dist(scaled.stuff)
    stuff.hc <- hclust(stuff.dist)

2005 Apr 01

error in kmeans

I am trying to run kmeans with 10 clusters on a 165 x 165 matrix. I do not see any errors known to me, but I get this error on running the script:

    Error: empty cluster: try a better set of initial centers

The commands are:

    M <- matrix(scan("R_mutual", n = 165 * 165), 165, 165, byrow = T)
    cl <- kmeans(M, centers=10, 20)
    len = dim(M)[1]
    ....
    ....

I ran the same script last night and

2004 Oct 11

plot hclust - canberra dist + median linkage

Gives strange results. I get 'weird' dendrograms with canberra / binary distance metric and median / centroid cluster methods. Is this just my data? Dan

2013 Oct 08

R function for Bisecting K-means algorithm

Hi All, Can someone please tell me the R function for the Bisecting K-means algorithm? I have used the kmeans() function but am not getting good results. Please help. -- Thanks and Regards, Vivek Kumar Singh, Research Assistant, School of Computing, National University of Singapore

2008 May 12

k means

Hi the devel list, I am using K means with a non-standard distance. As far as I can see, the function kmeans is able to deal with 4 different algorithms, but not with a user-defined distance. In addition, kmeans is not able to deal with missing values, whereas there are several solutions that k-means can use to deal with them; one is using a distance that takes the missing values into account, like a

2001 Apr 25

problems with a large data set

Hello, I have trouble with a data set that comprises 2136 lines of 20 columns. I would like to do a hierarchical clustering and I tried the following:

    ages.hclust <- hclust(dist(ages, method="euclidean"), "ward")

but I get the following error message:

    Error: cannot allocate vector of size 17797 Kb

When I try to do the dist() alone first without the hclust(), I get the
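
For context on the size in the error message: dist() stores the lower triangle of the distance matrix, that is n * (n - 1) / 2 double values of 8 bytes each. The helper name dist_kb below is hypothetical, introduced only for this back-of-the-envelope sketch.

```r
# Approximate size in Kb of the values held by a dist object for n rows
dist_kb <- function(n) n * (n - 1) / 2 * 8 / 1024

dist_kb(2136)  # about 17814 Kb
dist_kb(2135)  # about 17797 Kb

# A dist object for 10 rows holds 10 * 9 / 2 = 45 values
length(dist(matrix(0, 10, 2)))
```

The reported 17797 Kb matches n = 2135, which would be consistent with one of the 2136 lines being a header, though the post does not say so. By current standards this is a small allocation; the failure presumably reflects the memory limits of R installations at the time.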

2016 Jul 26

K MEANS clustering

Hello, I've been working on the KMeans clustering algorithm recently and since the past week, I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as Tf-idf vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be

2011 Apr 06

Help in kmeans

Hi All, I was using the following command to perform kmeans on the Iris dataset: Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3) This was giving proper results for me. But in my application we generate the R commands dynamically, and there was a requirement that column names be sent instead of column indices to the R commands. Hence, to incorporate this, I tried using the R
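
The post is cut off before it shows what was tried, so the following is only a sketch of one standard approach: a data frame can be indexed with a character vector of column names in the same way as with numeric indices. The built-in iris data frame stands in for the poster's dataFrame.

```r
# Column names of the built-in iris data, replacing indices 1 to 4
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

set.seed(1)
by_index <- kmeans(iris[, c(1, 2, 3, 4)], centers = 3)

set.seed(1)
by_name <- kmeans(iris[, cols], centers = 3)

# Same data and same seed give the same partition
identical(by_index$cluster, by_name$cluster)
```

Because cols is an ordinary character vector, it can be built dynamically from whatever names the application supplies.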

2008 Aug 21

[dist] how to analyse a large matrix?

Hi all, I have a matrix of about 100.000 x 4 that I need to classify using the euclidean metric. For that I am using the dist or daisy functions, but I am afraid that the message: Error in vector("double", length) : vector size specified is too large, means too many lines. Can anyone suggest how I should analyse this matrix? Thanks in advance, Diogo André Alagador MNCN,CSIC, Madrid, Spain
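
A full distance matrix for 100,000 rows would need about 5e9 values, which is why dist and daisy fail here. As a hedged sketch of one way around it, cluster::clara works on subsamples and never builds the full matrix; kmeans, which takes the data matrix directly, is another option. The simulated data and the choice of 3 clusters are assumptions.

```r
library(cluster)  # provides clara()

set.seed(1)
# Illustrative stand-in for the 100,000 x 4 matrix in the post
x <- matrix(rnorm(100000 * 4), ncol = 4)

# clara() runs a medoid-based clustering on several subsamples and
# then assigns every row to its nearest medoid
fit <- clara(x, k = 3, metric = "euclidean", samples = 5)

table(fit$clustering)  # cluster sizes over all rows
```

The result depends on the subsamples drawn, so increasing samples or the subsample size is a reasonable thing to try if the partition looks unstable.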

2003 Jun 05

kmeans (again)

Regarding a previous question concerning the kmeans function I've tried the same example and I also get a strange result (at least according to what is said in the help of the function kmeans). Apparently, the function is disregarding the initial cluster centers one gives it. According to the help of the function: centers: Either the number of clusters or a set of initial cluster
