similar to: kmeans Clustering

Displaying 20 results from an estimated 1000 matches similar to: "kmeans Clustering"

2006 Mar 25
2
pairwise combinations of variables
Dear WizaRds, although this might be a trivial question to the community, I was unable to find anything solving my problem in the help files on CRAN. Please help. Suppose I have 4 variables and want to use all possible combinations: 1,2 1,3 1,4 2,3 2,4 3,4 for a further kmeans partitioning. I tried permutations() of package e1071, but this is not what I need. Thank you for your help and
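One way to enumerate the pairs listed above is combn() from the utils package, which returns every unordered pair of column indices. The sketch below is only an illustration: the data matrix x, the variable names V1..V4 and the choice of 3 clusters are assumptions, not part of the original post.

```r
# Hypothetical data: 50 observations on 4 numeric variables (assumed for illustration).
set.seed(1)
x <- matrix(rnorm(200), ncol = 4, dimnames = list(NULL, paste0("V", 1:4)))

# combn() returns one column per unordered pair: 1,2 1,3 1,4 2,3 2,4 3,4.
pairs <- combn(ncol(x), 2)

# Run kmeans on each pair of columns; centers = 3 is an arbitrary choice.
fits <- lapply(seq_len(ncol(pairs)), function(j) {
  kmeans(x[, pairs[, j]], centers = 3, nstart = 10)
})

# Label each fit with the pair of variables it used, e.g. "1,2".
names(fits) <- apply(pairs, 2, paste, collapse = ",")
```

Each element of fits is an ordinary kmeans object, so fits[["1,2"]]$cluster gives the partition obtained from variables 1 and 2.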
2006 Jan 07
1
Clustering and Rand Index
Dear WizaRds, I am trying to compute the (adjusted) Rand Index in order to comprehend the variable selection heuristic (VS-KM) according to Brusco/ Cradit 2001 (Psychometrika 66 No.2 p.249-270, 2001). Unfortunately, I am unable to correctly use cl_ensemble and cl_agreement (package: clue). Here is what I am trying to do: library(clue) ## Let p1..p4 be four partitions of the kind
2006 Jan 08
1
Clustering and Rand Index - VS-KM
Dear WizaRds, I have been trying to compute the adjusted Rand index as by Hubert/ Arabie, and could not correctly approach how to define a partition object as in my last request yesterday. With package fpc I try to work around the problem, using my original data: mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=T )
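For reference, the adjusted Rand index of Hubert and Arabie can be computed directly from the contingency table of two label vectors, without a partition object. The function below is a minimal base-R sketch; the name adjusted_rand and the example labels are hypothetical and do not come from clue or fpc.

```r
# Adjusted Rand index from two label vectors of equal length (base R only).
# adjusted_rand is a hypothetical helper name, not a function from clue or fpc.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table n_ij
  sum_ij <- sum(choose(tab, 2))             # pairs placed together in both partitions
  sum_a  <- sum(choose(rowSums(tab), 2))    # pairs placed together in partition a
  sum_b  <- sum(choose(colSums(tab), 2))    # pairs placed together in partition b
  expected  <- sum_a * sum_b / choose(length(a), 2)
  max_index <- (sum_a + sum_b) / 2
  # Undefined (0/0) when both partitions put every object in a single cluster.
  (sum_ij - expected) / (max_index - expected)
}

# Two partitions that differ only in their labels agree perfectly.
adjusted_rand(c(1, 1, 2, 2), c(2, 2, 1, 1))
```

The index is 1 for identical partitions, has expectation 0 under random labelling, and can be negative when agreement is worse than chance.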
2008 Jul 03
1
Optimal initial centroid in kmeans
Hello there. I am using kmeans of base package to cluster my customers. As the results of kmeans depend on the initial centroids, may I know: 1) how can we specify the centroids in the R function? (I don't want a random starting point) 2) how to determine the optimal (or at least good) centroids to start with? (I am not after the fixed seed solution as it only ensures that the
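Both points can be illustrated with the centers and nstart arguments of kmeans(). The sketch below uses made-up two-dimensional data and made-up starting centroids; it is one possible approach under those assumptions, not a recommendation for the customer data in the post.

```r
# Hypothetical data: two groups of 50 points in 2 dimensions.
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# 1) Explicit starting centroids: pass a matrix with one row per cluster
#    and one column per variable instead of a number of clusters.
init <- rbind(c(0, 0), c(5, 5))   # assumed starting values
fit_fixed <- kmeans(x, centers = init)

# 2) Without a good guess, try many random starts; kmeans keeps the
#    run with the lowest total within-cluster sum of squares.
fit_multi <- kmeans(x, centers = 2, nstart = 25)

fit_fixed$tot.withinss
fit_multi$tot.withinss
```

Comparing tot.withinss across runs is a simple way to judge whether a given set of starting centroids led to a poor local optimum.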
2016 Aug 19
2
KMeans - Evaluation Results
On 18 Aug 2016, at 23:59, Richhiey Thomas <richhiey.thomas at gmail.com> wrote: > I've currently added a few classes which don't really belong to the public API (currently) into private headers and used PIMPL with the Cluster class. I'm having difficulty reading your changes, because you aren't keeping to one complete change per commit. So for instance you've added a
2016 Aug 15
2
KMeans - Evaluation Results
Hello, I've recently finished with an implementation of KMeans with two initialization techniques, random initialization and KMeans++. I would like to share my findings after evaluating the same. I have tested this implementation of KMeans with a BBC news article dataset. I am currently working on evaluating the same with FIRE datasets. Currently, clustering more than 500 documents
2005 Mar 31
2
Using kmeans given cluster centroids and data with NAs
Hello, I have used the functions agnes and cutree to cluster my data (4977 objects x 22 variables) into 8 clusters. I would like to refine the solution using a k-means or similar algorithm, setting the initial cluster centres as the group means from agnes. However my data matrix has NA's in it and the function kmeans does not appear to accept this? > dim(centres) [1] 8 22 > dim(data)
2016 Aug 17
2
KMeans - Evaluation Results
On Wed, Aug 17, 2016 at 7:23 PM, James Aylett <james-xapian at tartarus.org> wrote: > >> How long does 200-300 documents take to cluster? How does it grow as > more documents are included in the MSet? We'd expect an MSet of 1000 > documents to take longer to cluster than one with 100, but the important > thing is _how_ the time increases as the number of documents
2006 Jul 09
2
distance in kmeans algorithm?
Hello. Is it possible to choose the distance in the kmeans algorithm? I have m vectors of n components and I want to cluster them using kmeans algorithm but I want to use the Mahalanobis distance or another distance. How can I do it in R? If I use kmeans, I have no option to choose the distance. Thanks in advance, Arnau.
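kmeans() in R works with Euclidean distance only. One common workaround for a Mahalanobis-type distance is to whiten the data first, since Euclidean distance on the whitened data equals Mahalanobis distance on the original data. The sketch below assumes a single covariance matrix estimated from the whole data set (not one per cluster) and uses made-up data; for arbitrary distances, a function that accepts a dissimilarity matrix, such as pam() in the cluster package, is the usual alternative.

```r
set.seed(7)
# Hypothetical data: m = 200 vectors with n = 3 correlated components.
mix <- matrix(c(2, 1, 0, 0, 1, 1, 0, 0, 0.5), nrow = 3)
x <- matrix(rnorm(600), ncol = 3) %*% mix

S <- cov(x)            # one covariance matrix for the whole data set (assumption)
R <- chol(S)           # upper triangular factor with S = t(R) %*% R
z <- x %*% solve(R)    # whitened data: cov(z) is the identity matrix

# Euclidean kmeans on z corresponds to Mahalanobis distance (w.r.t. S) on x.
fit <- kmeans(z, centers = 3, nstart = 20)

# Check on one pair of points: squared Euclidean distance in z
# versus squared Mahalanobis distance in x.
sum((z[1, ] - z[2, ])^2)
mahalanobis(x[1, ], x[2, ], S)
```

The cluster centres in fit$centers live in the whitened space; multiplying them by R maps them back to the original coordinates.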
2004 May 28
6
distance in the function kmeans
Hi, I want to know which distance is used in the function kmeans and if we can change this distance. Indeed, in the function pam, we can pass a distance matrix as a parameter (by the line "pam<-pam(dist(matrixdata),k=7)" ) but we can't do it in the function kmeans, we have to pass the matrix of data directly ... Thanks in advance, Nicolas BOUGET
2006 Aug 07
5
kmeans and incomplete distance matrix concern
Hi there I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created i.e: [ mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2), dimnames = list(levels(DF$V1), levels(DF$V2))) mat[cbind(DF$V1, DF$V2)] <- DF$V3 This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering. My query
2016 Aug 18
3
KMeans - Evaluation Results
> > > > Actually, you're doing something slightly unusual there: making the > internal member public. Protected would be better, and private is I think > most usual; library clients aren't going to have access to the Internal > class declaration, so they can't call things on it. This means it's > actually difficult right now to subclass Feature. > > I
2016 Jul 26
3
K MEANS clustering
Hello, I've been working on the KMeans clustering algorithm recently and since the past week, I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as Tf-idf vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be
2006 Jul 06
3
Comparing two matrices [Broadcast]
It might be a bit faster to do matrix indexing: R> tbm <- as.matrix(tb) # turn it into a character matrix R> tmat[cbind(match(tbm[,2], rownames(tmat)), match(tbm[,1], colnames(tmat)))] <- 1 > tmat Apple Orange Mango Grape Star A 1 1 1 0 0 O 1 1 0 0 0 M 0 0 1 0 0 G 0 0 0 0 0 S 1 1 1 0
2010 Dec 02
1
Arrange elements on a matrix according to rowSums + short 'apply' Q
Greetings, My goal is to create a Markov transition matrix (probability of moving from one state to another) with the 'highest traffic' portion of the matrix occupying the top-left section. Consider the following sample: inputData <- c( c(5, 3, 1, 6, 7), c(9, 7, 3, 10, 11), c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10), c(9, 5, 2, 1, 1) ) MAT <- matrix(inputData,
2002 Mar 18
2
persp(): add second plane (second, long question)
Thank you for your replies so far. Sorry for bothering you again, but I'm still not able to get what I need as I don't understand all parts of the replies (just using R for easy things....). Is there a code for plane3d() like some of you sent me for points3d()? I was not able to get that out of the scatterplot3d package... What I can do is to get the x,y and z-range for the xlim,ylim and
2006 Jul 06
3
Comparing two matrices
hi: I have matrix with dimensions(200 X 20,000). I have another file, a tab-delim file where first column variables are row names and second column variables are column names. For instance: > tmat Apple Orange Mango Grape Star A 0 0 0 0 0 O 0 0 0 0 0 M 0 0 0 0 0 G 0 0 0 0 0 S 0 0 0 0 0
2011 Sep 17
0
Warning in 'probtrans'-function ('mstate'-package)
Dear all, in order to estimate transition-specific probabilities in a multi-state model I applied the 'probtrans()' function from the 'mstate'-package. Now, I am at a loss with the following message (see attached example): Warning message: In probtrans(msf.0, predt = 0) : Negative diagonal elements of (I+dA); the estimate may not be meaningful. I am not very familiar with matrix
2010 Jun 07
1
classification algorithms with distance matrix
Dear all, I have a problem when using some classification functions (Kmeans, PAM, FANNY...) with a distance matrix, and I would like to understand how it proceeds for the positioning of centroids after one execution step. In fact, in the classical formulation of the algorithm, after each step, to re-position the center, it calculates the distance between any elements of the old cluster and its
2016 Aug 17
2
KMeans - Evaluation Results
> How long does 200-300 documents take to cluster? How does it grow as more > documents are included in the MSet? We'd expect an MSet of 1000 documents > to take longer to cluster than one with 100, but the important thing is > _how_ the time increases as the number of documents grows. > > Currently, the number of seconds taken for clustering a set of documents for varying