thr3ads.net - similar to: "kmeans Clustering"

Displaying 20 results from an estimated 1000 matches similar to: "kmeans Clustering"

2006 Mar 25

pairwise combinations of variables

Dear WizaRds, although this might be a trivial question to the community, I was unable to find anything solving my problem in the help files on CRAN. Please help. Suppose I have 4 variables and want to use all possible combinations: 1,2 1,3 1,4 2,3 2,4 3,4 for a further kmeans partitioning. I tried permutations() of package e1071, but this is not what I need. Thank you for your help and
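One way to enumerate the pairs asked about above is combn() from the utils package. The following is a minimal sketch, not the poster's own solution: the toy data matrix `x`, the choice of 2 clusters, and `nstart = 5` are all assumptions made for illustration.

```r
# Hypothetical toy data: 20 observations on 4 variables (not from the post).
set.seed(1)
x <- matrix(rnorm(80), ncol = 4)

# All unordered pairs of column indices: 1,2 1,3 1,4 2,3 2,4 3,4
pairs <- combn(ncol(x), 2, simplify = FALSE)

# Run kmeans on each pair of variables; centers = 2 is an arbitrary choice.
fits <- lapply(pairs, function(idx) kmeans(x[, idx], centers = 2, nstart = 5))

# Label each fit by the variable pair it used, e.g. "1,2".
names(fits) <- vapply(pairs, paste, character(1), collapse = ",")
```

Each element of `fits` is an ordinary kmeans result, so `fits[["1,3"]]$cluster` gives the partition obtained from variables 1 and 3.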

2006 Jan 07

Clustering and Rand Index

Dear WizaRds, I am trying to compute the (adjusted) Rand Index in order to comprehend the variable selection heuristic (VS-KM) according to Brusco/ Cradit 2001 (Psychometrika 66 No.2 p.249-270, 2001). Unfortunately, I am unable to correctly use cl_ensemble and cl_agreement (package: clue). Here is what I am trying to do: library(clue) ## Let p1..p4 be four partitions of the kind

2006 Jan 08

Clustering and Rand Index - VS-KM

Dear WizaRds, I have been trying to compute the adjusted Rand index as by Hubert/ Arabie, and could not correctly approach how to define a partition object as in my last request yesterday. With package fpc I try to work around the problem, using my original data: mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=T )
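For reference, the adjusted Rand index of Hubert and Arabie can also be computed directly from the contingency table of two partitions, without a partition object. This is a hedged sketch in base R; the function name `adj_rand` and the example label vectors are hypothetical and are not taken from clue or fpc.

```r
# Adjusted Rand index from two label vectors of equal length.
# adj_rand is a hypothetical helper name, not a function from any package.
adj_rand <- function(p1, p2) {
  tab <- table(p1, p2)                 # contingency table of the two partitions
  n <- sum(tab)
  s_ij <- sum(choose(tab, 2))          # pairs placed together in both partitions
  s_a <- sum(choose(rowSums(tab), 2))  # pairs placed together in p1
  s_b <- sum(choose(colSums(tab), 2))  # pairs placed together in p2
  expected <- s_a * s_b / choose(n, 2) # expected agreement under chance
  (s_ij - expected) / ((s_a + s_b) / 2 - expected)
}

# Identical partitions up to relabelling should give an index of 1.
ari_same <- adj_rand(c(1, 1, 2, 2), c(2, 2, 1, 1))
```

The index is undefined (0/0) when both partitions consist of a single cluster, which this sketch does not guard against.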

2008 Jul 03

Optimal initial centroid in kmeans

Hello there. I am using kmeans of base package to cluster my customers. As the result of kmeans depends on the initial centroids, may I know: 1) how can we specify the centroids in the R function? (I don't want a random starting point) 2) how to determine the optimal (if not, a good) centroid to start with? (I am not after the fixed seed solution as it only ensures that the
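On the first point, the `centers` argument of kmeans() accepts a matrix of starting centroids (one row per cluster) instead of a number. The sketch below shows that, plus one common heuristic for a starting point: the group means of a hierarchical clustering. The toy data, the two-cluster choice, and the particular rows picked are assumptions, and the heuristic is one option among several, not a guaranteed optimum.

```r
# Hypothetical toy data: two groups of 20 points in 2 dimensions.
set.seed(42)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

# 1) Explicit starting centroids: a matrix with one row per cluster.
init <- x[c(1, 21), ]          # arbitrary pick: one row from each group
fit <- kmeans(x, centers = init)

# 2) Heuristic start: cut a hierarchical clustering into k groups and
#    use the group means as initial centroids.
grp <- cutree(hclust(dist(x)), k = 2)
init_hc <- apply(x, 2, function(col) tapply(col, grp, mean))
fit_hc <- kmeans(x, centers = init_hc)
```

With fixed starting centroids the result no longer depends on the random seed. Comparing `fit$tot.withinss` across several candidate starts is a simple way to choose among them.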

2016 Aug 19

KMeans - Evaluation Results

On 18 Aug 2016, at 23:59, Richhiey Thomas <richhiey.thomas at gmail.com> wrote: > I've currently added a few classes which don't really belong to the public API (currently) into private headers and used PIMPL with the Cluster class. I'm having difficulty reading your changes, because you aren't keeping to one complete change per commit. So for instance you've added a

2016 Aug 15

KMeans - Evaluation Results

Hello, I've recently finished with an implementation of KMeans with two initialization techniques, random initialization and KMeans++. I would like to share my findings after evaluating the same. I have tested this implementation of KMeans with a BBC news article dataset. I am currently working on evaluating the same with FIRE datasets. Currently, clustering more than 500 documents

2005 Mar 31

Using kmeans given cluster centroids and data with NAs

Hello, I have used the functions agnes and cutree to cluster my data (4977 objects x 22 variables) into 8 clusters. I would like to refine the solution using a k-means or similar algorithm, setting the initial cluster centres as the group means from agnes. However my data matrix has NA's in it and the function kmeans does not appear to accept this? > dim(centres) [1] 8 22 > dim(data)

2016 Aug 17

KMeans - Evaluation Results

On Wed, Aug 17, 2016 at 7:23 PM, James Aylett <james-xapian at tartarus.org> wrote: > >> How long does 200-300 documents take to cluster? How does it grow as > more documents are included in the MSet? We'd expect an MSet of 1000 > documents to take longer to cluster than one with 100, but the important > thing is _how_ the time increases as the number of documents

2006 Jul 09

distance in kmeans algorithm?

Hello. Is it possible to choose the distance in the kmeans algorithm? I have m vectors of n components and I want to cluster them using kmeans algorithm but I want to use the Mahalanobis distance or another distance. How can I do it in R? If I use kmeans, I have no option to choose the distance. Thanks in advance, Arnau.
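kmeans() in R works with Euclidean distance and has no argument for choosing another metric. A possible workaround for the Mahalanobis case is to transform the data so that Euclidean distance in the transformed space equals Mahalanobis distance in the original space. The sketch below assumes the covariance is the overall sample covariance of the data (other choices, such as a pooled within-cluster covariance, are possible), and the toy data and the two-cluster choice are illustrative only.

```r
# Hypothetical toy data: 100 correlated observations on 2 variables.
set.seed(7)
x <- matrix(rnorm(200), ncol = 2) %*% matrix(c(2, 1, 0, 1), 2)

# Overall sample covariance (an assumption about which covariance to use).
S <- cov(x)

# chol(S) returns upper-triangular R with t(R) %*% R == S.
# After z = x %*% solve(R), squared Euclidean distance between rows of z
# equals squared Mahalanobis distance between the same rows of x.
z <- x %*% solve(chol(S))

# Ordinary kmeans on the transformed data; centers = 2 is arbitrary.
fit <- kmeans(z, centers = 2, nstart = 10)

# Check the equivalence on one pair of rows.
d_euclid <- sum((z[1, ] - z[2, ])^2)
d_mahal <- as.numeric(mahalanobis(x[1, ], x[2, ], S))
```

For an arbitrary dissimilarity that cannot be reduced to Euclidean distance this way, a medoid-based method that accepts a distance matrix, such as pam() in the cluster package, is the usual alternative.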

2004 May 28

distance in the function kmeans

Hi, I want to know which distance is used in the function kmeans and whether we can change it. Indeed, in the function pam, we can pass a distance matrix as a parameter (with the line "pam<-pam(dist(matrixdata),k=7)" ) but we can't do that in the function kmeans, where we have to pass the data matrix directly ... Thanks in advance, Nicolas BOUGET

2006 Aug 07

kmeans and incomplete distance matrix concern

Hi there I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created i.e: [ mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2), dimnames = list(levels(DF$V1), levels(DF$V2))) mat[cbind(DF$V1, DF$V2)] <- DF$V3 This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering. My query

2016 Aug 18

KMeans - Evaluation Results

> > > > Actually, you're doing something slightly unusual there: making the > internal member public. Protected would be better, and private is I think > most usual; library clients aren't going to have access to the Internal > class declaration, so they can't call things on it. This means it's > actually difficult right now to subclass Feature. > > I

2016 Jul 26

K MEANS clustering

Hello, I've been working on the KMeans clustering algorithm recently and for the past week I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as Tf-idf vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be

2006 Jul 06

Comparing two matrices [Broadcast]

It might be a bit faster to do matrix indexing:
R> tbm <- as.matrix(tb) # turn it into a character matrix
R> tmat[cbind(match(tbm[,2], rownames(tmat)), match(tbm[,1], colnames(tmat)))] <- 1
> tmat
  Apple Orange Mango Grape Star
A     1      1     1     0    0
O     1      1     0     0    0
M     0      0     1     0    0
G     0      0     0     0    0
S     1      1     1     0

2010 Dec 02

Arrange elements on a matrix according to rowSums + short 'apply' Q

Greetings, My goal is to create a Markov transition matrix (probability of moving from one state to another) with the 'highest traffic' portion of the matrix occupying the top-left section. Consider the following sample: inputData <- c( c(5, 3, 1, 6, 7), c(9, 7, 3, 10, 11), c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10), c(9, 5, 2, 1, 1) ) MAT <- matrix(inputData,

2002 Mar 18

persp(): add second plane (second, long question)

Thank you for your replies so far. Sorry for bothering you again, but I'm still not able to get what I need as I don't understand all parts of the replies (just using R for easy things....). Is there a code for plane3d() like some of you sent me for points3d()? I was not able to get that out of the scatterplot3d package... What I can do is to get the x,y and z-range for the xlim,ylim and

2006 Jul 06

Comparing two matrices

hi: I have a matrix with dimensions (200 x 20,000). I have another file, a tab-delim file where first column variables are row names and second column variables are column names. For instance:
> tmat
  Apple Orange Mango Grape Star
A     0      0     0     0    0
O     0      0     0     0    0
M     0      0     0     0    0
G     0      0     0     0    0
S     0      0     0     0    0

2011 Sep 17

Warning in 'probtrans'-function ('mstate'-package)

Dear all, in order to estimate transition-specific probabilities in a multi-state model I applied the 'probtrans()' function from the 'mstate'-package. Now, I am at a loss with the following message (see attached example): Warning message: In probtrans(msf.0, predt = 0) : Negative diagonal elements of (I+dA); the estimate may not be meaningful. I am not very familiar with matrix

2010 Jun 07

classification algorithms with distance matrix

Dear all, I have a problem when using some classification functions (Kmeans, PAM, FANNY...) with a distance matrix, and I would like to understand how the centroids are repositioned after one execution step. In fact, in the classical formulation of the algorithm, after each step, to re-position the center, it calculates the distance between any elements of the old cluster and its

2016 Aug 17

KMeans - Evaluation Results

> How long does 200-300 documents take to cluster? How does it grow as more > documents are included in the MSet? We'd expect an MSet of 1000 documents > to take longer to cluster than one with 100, but the important thing is > _how_ the time increases as the number of documents grows. > > Currently, the number of seconds taken for clustering a set of documents for varying
