thr3ads.net - similar to: "How to access to Data within a cluster"

How to access to sum of dissimilarities in CLARA

2005 May 30

Dear All, Since dissimilarity is one of the quality measures in clustering, I'm trying to access the sum of dissimilarities as an overall measure. But after running my data through CLARA I obtain:
1128 dissimilarities, summarized :
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.033155 0.934630 2.257000 2.941600 4.876600 8.943700
But I cannot find the sum of dissimilarities. How can i

"partitioning cluster function"

2006 Apr 05

Hi All, For the function "bclust" (e1071), the argument "base.method" is explained as "must be the name of a partitioning cluster function returning a list with the same components as the return value of 'kmeans'". In my understanding, there are three partitioning cluster functions in R, which are "clara, pam, fanny". Then I check each of them to

Setting a minimum number of observations within an individual cluster

2007 Jun 13

Hi, I'm trying to cluster a continuous dataset with a varying number of clusters and with a restriction that each cluster must have more than 'x' observations. I have tried the clara function, using silhouette to give me the neighbouring cluster medoid of each observation, then merging an observation from a cluster with fewer than 'x' obs. into its neighbour,

CLARA

2003 Nov 17

I need information about the clara routine. The online doc says that the argument stand is a logical, indicating if the measurements in x are standardized before calculating the dissimilarities. Measurements are standardized for each variable (column), by subtracting the variable's mean value and dividing by the variable's mean absolute deviation. If we set STAND = TRUE, I suppose that

Exporting data to a text file

2008 Aug 01

Hi R users, With the clara function I get a data frame (maybe this is not the exact word, I'm new to R) with the following variables:
> names(myclara)
 [1] "sample"     "medoids"    "i.med"      "clustering" "objective"
 [6] "clusinfo"   "diss"       "call"       "silinfo"    "data"
I want to

question about similarities cluster using hierclust

2004 Jun 10

My major is bioinformatics, and I'm trying to cluster (agglomerate the closest pair of observations) in R. I already have my own similarity metric, but do not know how to cluster based on similarities instead of dissimilarities. The help document of hierclust mentions the parameter "sim", which seems good to me, but it doesn't appear in the code of hierclust()

CLARA and determining the right number of clusters

2008 Sep 30

Hi everyone I have a question about clustering. I've managed using CLARA to get a clustering analysis of a large data set. But now I want to find which is the right number of clusters. The clara.object gives some information like the ratio between maximal and minimal dissimilarity that says (maybe if lower than 1??) if a cluster is well-separated from the other. I've also read something

Document clustering for R

2005 Sep 12

I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it only supports two distance metrics, Euclidean and Manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or if there was an API for

cluster-gruop-match with other attributes after na.omit

2004 Feb 28

Hi, I want to run a cluster analysis with clara, but I get an error because there are NAs in cldat.
Error in clara(cldat[, 1:3], 4) : Each of the random samples contains objects between which no distance can be computed.
cldatx <- subset(cldat,select=c(A,B,C))
cldaty <- na.omit(cldatx)
Now clara works, but cldat has ~193.000 obs and cldatx without NA's ~75.000 obs. How could i match
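
One possible way to carry cluster labels back to the full data set, sketched here under assumptions: the toy data frame, the column `D`, and the new column name `cluster` are hypothetical and not taken from the thread. The idea is to keep a logical index of the complete rows, cluster only those, and write the labels back through the same index so rows with NAs get NA.

```r
library(cluster)  # provides clara()

# Hypothetical stand-in for cldat: three clustering columns plus one other attribute
set.seed(1)
cldat <- data.frame(A = rnorm(200), B = rnorm(200), C = rnorm(200), D = runif(200))
cldat$A[sample(200, 30)] <- NA

# Logical index of rows with no NA in the clustering columns
ok <- complete.cases(cldat[, c("A", "B", "C")])

# Cluster only the complete rows
fit <- clara(cldat[ok, c("A", "B", "C")], 4)

# Write the labels back; rows that were dropped keep NA
cldat$cluster <- NA_integer_
cldat$cluster[ok] <- fit$clustering
```

Because `ok` has one entry per row of the original data frame, all other attributes stay aligned with the cluster labels without any separate matching step.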

Filters in waveslim

2006 Jan 08

Dear R Users, For running wavelet functions using dwt( ), modwt( ), and mra( ), a wavelet filter algorithm is applied. For all these functions, the default is "la8" and another possibility is "haar". In the related documents, other possibilities such as symlet and coiflet ... are not cited. Besides "la8" and "haar", which wavelet filters can be used? Thank

Creating a new vector ( another problem)

2006 Nov 20

Dear R Users, Suppose we are interested in generating a new vector (x) from a current vector (y) of length 1000 so that x contains the sum of every 5 values in y, taken in order from the first to the last element of y. x should have the same length as y, so the other 4 positions (indices) in x are filled with NA. I have no idea how to generate such a vector. I tried in some
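
A minimal base-R sketch of the block sums described above. It assumes that the length of y is a multiple of 5 and that each sum is stored at the last index of its block (positions 5, 10, 15, ...); the question does not say which of the five positions should hold the sum, so that placement is a guess.

```r
y <- 1:1000                        # example input; any numeric vector whose length is a multiple of 5

# Reshape into a 5-row matrix: each column holds one block of 5 consecutive values
block_sums <- colSums(matrix(y, nrow = 5))

# Same length as y, NA everywhere except the last position of each block
x <- rep(NA_real_, length(y))
x[seq(5, length(y), by = 5)] <- block_sums
```

If only the 200 block sums are wanted, without NA padding, `block_sums` is already that shorter vector.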

creating new vector

2006 Mar 10

Hi R Users, I don't know how difficult my problem is, or even whether it is possible to solve in R. Given a vector with 2000 observations, I want to create a new vector from it so that the new vector is the sum of every 5 consecutive observations. That is, each new observation is the sum of 5 consecutive observations, from the first observation of the original vector to the end. So

passing known medoids to clara() in the cluster package

2006 Apr 10

Greetings, I have had good success using the clara() function to perform a simple cluster analysis on a large dataset (1 million+ records with 9 variables). Since the clara function is a wrapper to pam(), which will accept known medoid data - I am wondering if this too is possible with clara() ... The documentation does not suggest that this is possible. Essentially I am trying to

Clustering

2007 Nov 28

Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is, if I choose for example method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity

Clustering algorithms don't find obvious clusters

2010 Jun 11

I have a directed graph which is represented as a matrix of the form
0 4 0 1
6 0 0 0
0 1 0 5
0 0 4 0
Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first row says that author A has cited author B four times and author D one time. Thus the matrix represents two groups of authors: (A,B) and (C,D) who cites

Cluster analysis, defining center seeds or number of clusters

2009 Jun 11

I use kmeans to classify spectral events in high and low 1/3 octave bands:
#Do cluster analysis
CyclA<-data.frame(LlowA,LhghA)
CntrA<-matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA<-kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")
This works well when the actual data shows 1,2 or 3 groups that are not "too close" in a cross plot.

Clustering large data matrix

2008 Mar 06

Hello, I have a large data matrix (68x13112), each row corresponding to one observation (patients) and each column corresponding to the variables (points within an NMR spectrum). I would like to carry out some kind of clustering on these data to see how many clusters are there. I have tried the function clara() from the package cluster. If I use the matrix as is, I can perform the clara

Cluster analysis, factor variables, large data set

2011 Mar 31

Dear R helpers, I have a large data set with 36 variables and about 50.000 cases. The variables represent labour market status during 36 months; there are 8 different variable values (e.g. Full-time Employment, Student,...). Only cases with at least one change in labour market status are included in the data set. To analyse subsets of the data, I have used daisy in the cluster-package to create

How to name variables in a single plot

2005 Jun 01

Dear R Friends, I want to label my variables (more than 2 variables in a single plot) within a plot to distinguish them from each other, but I can't. How is this possible? I don't mean the x and y axes using xlab or ylab. Below are some lines, only as an example that you could try, if possible. Many thanks for your attention. Amir library(graphics) y<-

Slack variable in OR

2007 Aug 14

Hi dear R users, Is it basically correct that a problem is (linearly or nonlinearly) modeled so that the slack variable is bounded by an upper bound? If so, how can it be handled and coded in practice? For example:
x1 + x2 =< b
so ----> x1 + x2 + s = b
s = b - x1 - x2
b - x1 - x2 =< upper value
But algorithms can not calculate b- x1 - x2 , because
