thr3ads.net - similar to: "pairwise combinatons of variables"

Displaying 20 results from an estimated 1000 matches similar to: "pairwise combinatons of variables"

2006 Mar 23

kmeans Clustering

Dear WizaRds, My goal is to program the VS-KM algorithm by Brusco and Cradit (2001), and I have come to a complete stop in my efforts. Perhaps someone is willing to follow my thoughts and offer some help. As a first step, I want to use a single variable for the partitioning process. As the center matrix I use the objects that belong to the cluster I found with the hierarchical Ward algorithm. Then,

2006 Jan 07

Clustering and Rand Index

Dear WizaRds, I am trying to compute the (adjusted) Rand Index in order to comprehend the variable selection heuristic (VS-KM) according to Brusco/ Cradit 2001 (Psychometrika 66 No.2 p.249-270, 2001). Unfortunately, I am unable to correctly use cl_ensemble and cl_agreement (package: clue). Here is what I am trying to do: library(clue) ## Let p1..p4 be four partitions of the kind

2006 Jan 08

Clustering and Rand Index - VS-KM

Dear WizaRds, I have been trying to compute the adjusted Rand index of Hubert/Arabie, but could not work out how to define a partition object, as described in my request yesterday. With package fpc I am trying to work around the problem, using my original data: mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=T )

2006 Jul 06

Comparing two matrices [Broadcast]

It might be a bit faster to do matrix indexing:
R> tbm <- as.matrix(tb) # turn it into a character matrix
R> tmat[cbind(match(tbm[,2], rownames(tmat)), match(tbm[,1], colnames(tmat)))] <- 1
> tmat
  Apple Orange Mango Grape Star
A     1      1     1     0    0
O     1      1     0     0    0
M     0      0     1     0    0
G     0      0     0     0    0
S     1      1     1     0

2006 Jul 06

Comparing two matrices

Hi: I have a matrix with dimensions (200 x 20,000). I also have a tab-delimited file in which the first column holds row names and the second column holds column names. For instance:
> tmat
  Apple Orange Mango Grape Star
A     0      0     0     0    0
O     0      0     0     0    0
M     0      0     0     0    0
G     0      0     0     0    0
S     0      0     0     0    0

2010 Dec 02

Arrange elements on a matrix according to rowSums + short 'apply' Q

Greetings, My goal is to create a Markov transition matrix (probability of moving from one state to another) with the 'highest traffic' portion of the matrix occupying the top-left section. Consider the following sample: inputData <- c( c(5, 3, 1, 6, 7), c(9, 7, 3, 10, 11), c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10), c(9, 5, 2, 1, 1) ) MAT <- matrix(inputData,

2002 Mar 18

persp(): add second plane (second, long question)

Thank you for your replies so far. Sorry for bothering you again, but I'm still not able to get what I need, as I don't understand all parts of the replies (I just use R for easy things...). Is there code for plane3d() like the code some of you sent me for points3d()? I was not able to get that out of the scatterplot3d package... What I can do is to get the x, y and z-range for the xlim, ylim and

2016 Jul 26

K MEANS clustering

Hello, I've been working on the KMeans clustering algorithm recently, and for the past week I have been stuck on a problem that I cannot find a solution to. Since we represent documents as tf-idf vectors, they are very sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be

2008 Jul 03

Optimal initial centroid in kmeans

Hello there. I am using kmeans from the base package to cluster my customers. As the result of kmeans depends on the initial centroids, may I know: 1) how can we specify the centroids in the R function? (I don't want a random starting point) 2) how to determine the optimal (or, failing that, a good) centroid to start with? (I am not after the fixed seed solution as it only ensures that the

2010 Oct 07

model.frame deficiency

The model.frame function has trouble with a certain type of really long formula. Here is a test: tname <- paste('var', 1:50, sep='') tmat <- matrix(rnorm(500), ncol=50, dimnames=list(NULL, tname)) tdata <- data.frame(tmat) temp1 <- paste( paste(tname, tname, sep='='), collapse=', ') temp2 <- paste("~1 + cbind(", temp1, ")")

2016 Aug 19

KMeans - Evaluation Results

On 18 Aug 2016, at 23:59, Richhiey Thomas <richhiey.thomas at gmail.com> wrote: > I've currently added a few classes which don't really belong to the public API (currently) into private headers and used PIMPL with the Cluster class. I'm having difficulty reading your changes, because you aren't keeping to one complete change per commit. So for instance you've added a

2006 May 08

finding centroids of clusters created with hclust

Hello, Can someone point me to documentation or ideas on how to calculate the centroids of clusters identified with hclust? I would like to be able to choose the number of clusters (in the style of cutree) and then get the centroids of these clusters. This seems like a quite obvious task to me, but I haven't been able to find a relevant command. Thank you, Moritz

2010 Jun 07

classification algorithms with distance matrix

Dear all, I have a problem when using some classification functions (Kmeans, PAM, FANNY...) with a distance matrix, and I would like to understand how they position the centroids after each execution step. In the classical formulation of the algorithm, after each step, to re-position the center, it calculates the distance between any elements of the old cluster and its

2016 Aug 15

KMeans - Evaluation Results

Hello, I've recently finished with an implementation of KMeans with two initialization techniques, random initialization and KMeans++. I would like to share my findings after evaluating the same. I have tested this implementation of KMeans with a BBC news article dataset. I am currently working on evaluating the same with FIRE datasets. Currently, clustering more than 500 documents

2005 Mar 31

Using kmeans given cluster centroids and data with NAs

Hello, I have used the functions agnes and cutree to cluster my data (4977 objects x 22 variables) into 8 clusters. I would like to refine the solution using a k-means or similar algorithm, setting the initial cluster centres as the group means from agnes. However, my data matrix has NAs in it, and the function kmeans does not appear to accept this.
> dim(centres)
[1] 8 22
> dim(data)

2016 Mar 06

GSOC-2016 Project : Clustering of search results

On Sun, Mar 6, 2016 at 7:17 AM, James Aylett <james-xapian at tartarus.org> wrote: > On Sat, Mar 05, 2016 at 10:58:43PM +0530, Richhiey Thomas wrote: > > K-Means or something related certainly seems like a viable approach, > so what you'll need to do is to come up with a proposal of how you'd > implement this in Xapian (either with reference to the previous work, >

2001 Nov 19

dist

Hi list! I'm computing multivariate distances from a set of centroids to a (large) set of individuals. At the moment I just use rbind to create a matrix (x) with the centroid and the individuals, then run as.matrix(dist(x)) and finally select the appropriate columns, as I'm not interested in the distances among individuals. This procedure therefore wastes computing time. Is there

2013 Nov 28

Multivariate dispersion & distances

Dear All, I'm using betadisper {vegan} and I'm interested not only in the dispersion within each group but also in the distances between the groups. With betadisper I get distances to group centroids, but is it possible to get distances to other groups' centroids? It might be possible to do it by hand with the formula given in the description of betadisper (below), but I'm a bit confused

2017 Jun 04

New var

Since the number of choices is small (6), how about this? Starting with Jeff's initial DFM: DFM <- structure(list(obs = 1:6, start = structure(c(16467, 14710, 13152, 13787, 15126, 12696), class = "Date"), end = structure(c(17167, 14975, 13636, 13879, 15340, 12753), class = "Date"), D = c(700, 265, 484, 92, 214, 57), bin = structure(c(6L, 3L, 5L, 1L, 3L, 1L), .Label

2001 Jul 17

cmdscale in package mva (PR#1027)

Full_Name: Laurent Gautier Version: 1.3.0-patched OS: IRIX 6.5 Submission from: (NULL) (130.225.67.199) Hello, The function La.eigen, called by cmdscale in the package mva, behaves in a way I cannot explain. The following lines show what happened. I tried the very same on Linux, and it worked fine. >a <- matrix(c(1,2,3,2),3,3) >a [,1] [,2] [,3] [1,] 1 2 3 [2,]
