thr3ads.net - similar to: "Is it possible to obtain an agglomeration schedule with R cluster analyis"

Displaying 20 results from an estimated 6000 matches similar to: "Is it possible to obtain an agglomeration schedule with R cluster analyis"

x/y coordinates of dendrogram branches

2005 Nov 02

Dear R-users, I need some help concerning the plotting of dendrograms for hierarchical agglomerative clustering. The agglomeration level of each step should be displayed at the branches of the dendrogram. For this I need the x/y coordinates of the branch agglomerations of the dendrogram. The y-values are known (the heights of the agglomeration), but how can I get the x-values? > mydata
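One possible way to recover the x-values, sketched under the assumption that the tree comes from `hclust()` and is drawn the way `plot()` draws an `hclust` object: leaves sit at positions 1..n in the order given by `hc$order`, and each merge is centred between its two children. The toy data and the names `leaf_x`, `node_x` and `branch_xy` are illustrative and not part of the original post.

```r
# Toy data (an assumption; this is not the poster's 'mydata')
hc <- hclust(dist(c(1, 2, 4, 8)))
n  <- length(hc$order)

# x position of each observation: its rank in the plotting order
leaf_x <- as.numeric(order(hc$order))

# x position of each merge: midpoint of its two children.
# In hc$merge, negative entries are single observations and
# positive entries refer to earlier merge steps.
node_x <- numeric(n - 1)
for (k in seq_len(n - 1)) {
  child_x <- vapply(hc$merge[k, ], function(j) {
    if (j < 0) leaf_x[-j] else node_x[j]
  }, numeric(1))
  node_x[k] <- mean(child_x)
}

# One (x, y) pair per agglomeration step
branch_xy <- data.frame(x = node_x, y = hc$height)
```

With these coordinates, something like `text(branch_xy$x, branch_xy$y, round(branch_xy$y, 2), pos = 3)` after `plot(hc)` should place each height at its branch, assuming the default plot layout.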

creating dendrogram from cluster hierarchy

2006 Feb 28

Dear R users, I have created data for hierarchical agglomerative cluster analysis which consist of the merging pairs and the agglomeration heights, e.g. something like my.merge <- matrix(c(-1,-2,-3,1), ncol=2, byrow=TRUE) my.height <- c(0.5, 1) I'd like to plot a corresponding dendrogram but I don't know how to convert my data to achieve this. Is it possible to create a
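A minimal sketch of one way this might be done: wrap the merge matrix and heights in a list with class `"hclust"` and convert it with `as.dendrogram()`. The `order` and `labels` values below are assumptions added for illustration; the post only supplies `my.merge` and `my.height`.

```r
my.merge  <- matrix(c(-1, -2, -3, 1), ncol = 2, byrow = TRUE)
my.height <- c(0.5, 1)
storage.mode(my.merge) <- "integer"   # merge indices stored as integers

# Hand-built "hclust" object with the components as.dendrogram() needs
hc <- structure(
  list(merge  = my.merge,
       height = my.height,
       order  = c(3L, 1L, 2L),        # assumed left-to-right leaf order
       labels = c("a", "b", "c")),    # hypothetical labels
  class = "hclust")

dend <- as.dendrogram(hc)
# plot(dend)   # uncomment to draw the tree
```

The `order` component has to be a permutation of the observations that avoids crossing branches; for larger trees it would need to be derived from the merge matrix rather than typed by hand.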

amap : hclust agglomeration

2003 Dec 03

Hi, I'm trying to understand the complete linkage method in hclust. Can anyone provide a breakdown of the formula (p9 of the pdf documentation) or tell me what the "sup" operator does/means? Thanks in advance, Tom

question about centroid-linkage (cluster analysis)

2009 Dec 10

Dear R community, I would be grateful if somebody could shed light on the following. I have created a set of 6 points to check how centroid agglomeration works in cluster analysis: > Y <- data.frame(x=c(-1,1,1,-1,10,12),y=c(1,1,-1,-1,0,0)) It is quite intuitive to understand that the last clusters to be joined will be {1,2,3,4} with {5,6}. Now, the centroid for the first cluster has

HCLUST subroutine question -- FORTRAN DO loops

2006 Mar 09

Shown below is most of the FORTRAN subroutine named HCLUST. My question concerns the DO loop labeled as '10'. What happened to its CONTINUE statement? I will assume that after FLAG(I)=.TRUE. is executed that control returns to DO 10 I=1,N. Am I correct? Dave ---------------------------- C Initializations C DO 10 I=1,N C We do not initialize MEMBR in order to be able to

question about similarities cluster using hierclust

2004 Jun 10

My major is bioinformatics, and I'm trying to cluster (agglomerate the closest pair of observations) in R. I already have my own similarity metric, but do not know how to cluster based on similarities instead of dissimilarities. The help document of hierclust mentions the parameter "sim", which seems good to me, but it doesn't appear in the code of hierclust()

Cluster procedure using geographical neighborhood

2010 May 07

Dear Dario Sacco, On Thu, 06 May 2010 17:45:30 +0200, Dario Sacco <dario.sacco at unito.it> wrote: > Dear Dr. Maechler, I am an agronomist and a researcher at the University of Turin. I am also teaching "Applied statistics", so I have some knowledge in Statistics, but not

method default for hclust function

2013 Dec 12

I could not figure out what the default is when I run hclust() without specifying the method. For example, my code is just: hclust(dist(data)) Any input would be appreciated :)
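A quick way to check this from within R, sketched on toy data (the matrix `x` is illustrative only). In the `stats` package the signature is `hclust(d, method = "complete", members = NULL)`, so complete linkage is used when `method` is omitted; the fitted object also records the method it used.

```r
# Default value of 'method' as declared in the function signature
default_method <- formals(stats::hclust)$method

# The fitted object stores the method that was actually applied
x  <- matrix(c(1, 2, 4, 8), ncol = 1)   # toy data, an assumption
hc <- hclust(dist(x))
used_method <- hc$method

default_method   # "complete"
used_method      # "complete"
```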

saving rounded numbers as a new variable in a dataframe

2006 Jun 19

A basic question, but one that eludes me. I have created a new variable $numurder, which I have rounded off. I want to save the rounded off version of this variable to an existing datafile called 'ngri.csv' . numurder <-c((murder*no.of.cases)/100) [[1]] [1] 48.952 112.073 182.160 974.610 122.140 663.432 150.856 18.988 137.925 198.045 68.930 203.148 30.056 100.955

non-uniqueness in cluster analysis

2003 Dec 03

Hi, I'm clustering objects defined by categorical variables with a hierarchical algorithm - average linkage. My distance matrix (general dissimilarity coefficient) includes several distances with exactly the same values. As far as I can see, a standard agglomerative procedure ignores this problem, simply selecting, among equal distances, the one that comes first. For this reason the analysis in output

improving a bar graph

2007 Dec 16

Hello, Below is the code for a basic bar graph. I was seeking advice regarding the following: (a) For each time period there are values from 16 people. How can I change the colour value so that each person has a different colour, which recurs across each of the three graphs/time periods? (b) I have seen much more sophisticated examples using lattice (e.g. each person has a separate

cophenetic matrix

2001 Jun 12

Hello, I analyse some free-sorting data so I use hierarchical clustering. I want to compare my proximity matrix with the tree representation to evaluate the fit (stress, cophenetic correlation (Pearson's correlation)...). "The cophenetic similarity of two objects a and b is defined as the similarity level at which objects a and b become members of the same cluster during the course of
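As a rough illustration of the cophenetic correlation mentioned here, the sketch below uses `cophenetic()` from the `stats` package, which returns the height at which each pair of objects is first merged. Note that R works with dissimilarities rather than similarities; the toy data and the choice of average linkage are assumptions, not taken from the post.

```r
# Toy dissimilarities (an assumption; not the poster's free-sorting data)
d  <- dist(c(1, 2, 4, 8))
hc <- hclust(d, method = "average")

# Cophenetic dissimilarities: merge height at which each pair is joined
coph <- cophenetic(hc)

# Pearson correlation between original and cophenetic dissimilarities
coph_cor <- cor(d, coph)
```

A value close to 1 would suggest the tree preserves the original pairwise dissimilarities well.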

Question: how to obtain the clusters of genes (basically the ones in the row dendrograms) from an object obtained by heatmap.2 function

2010 Sep 17

Hello R-Helpers, I have a question about extracting the clusters of genes after we make the heatmap (say ht4) using the heatmap.2 function. Basically, I want to get the clusters which are shown as the row dendrogram in the heatmap. I understand that ht4$rowDendrogram is an object of class dendrogram and it contains details of all the nodes and branches, but let's say I want to know the number of clusters

Cluster analysis: hclust manipulation possible?

2009 Nov 16

I am doing cluster analysis [hclust(Dist, method="average")] on data that potentially contains redundant objects. As expected, the inclusion of redundant objects affects the clustering result, i.e., the data a1 = a2 = a3, b, c, d, e1 = e2 is likely to cluster differently from the same data without the redundancy, i.e., a1, b, c, d, e1. This is apparent when the outcome is visualized

cluster analysis: mean values for each variable and cluster

2009 Feb 20

Hi all! I'm new to R and don't know much about it. Because it is free, I managed to learn it a little bit. Here is my problem: I did a cluster analysis on 30 observations and 16 variables (monde, figaro, liberation, etc.). Here is the .txt data file:
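For the per-cluster means asked about in the thread title, one possible approach is to cut the tree with `cutree()` and summarise with `aggregate()`. The data frame below is a made-up stand-in for the poster's 30 x 16 data file, which is not shown, and the choice of two clusters is likewise an assumption.

```r
# Made-up data standing in for the poster's file
dat <- data.frame(v1 = c(1, 2, 10, 11),
                  v2 = c(0, 1, 5, 6))

hc  <- hclust(dist(dat))
grp <- cutree(hc, k = 2)          # cluster membership, k assumed to be 2

# Mean of every variable within each cluster
cluster_means <- aggregate(dat, by = list(cluster = grp), FUN = mean)
```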

How to build a "Amalgamation Schedule"? help!

2012 Jan 23

Dear all, I need to process large amounts of data (two or three variables for 6,000 cases) with cluster analysis. In the end I need to assign the source data to the obtained clusters, so I need to trace the sequence of data fusion. That way, I can fill a cluster (at any level of linkage distance) with data. This procedure is implemented in the package Statistica, but this package cannot work with
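A small sketch of how an agglomeration (amalgamation) schedule might be read off an `hclust` object: the `merge` component lists which two items are joined at each step, and `height` gives the linkage distance of that step. The toy data, the complete-linkage choice and the column names are assumptions for illustration.

```r
# Toy data (an assumption; the poster's 6,000 cases are not shown)
x  <- c(1, 2, 4, 8)
hc <- hclust(dist(x), method = "complete")

# One row per fusion step. Negative entries are single observations,
# positive entries refer to the cluster formed at that earlier step.
schedule <- data.frame(step   = seq_along(hc$height),
                       item1  = hc$merge[, 1],
                       item2  = hc$merge[, 2],
                       height = hc$height)

# Cluster membership of every case at a chosen linkage distance
members <- cutree(hc, h = 2)
```

Joining `members` back onto the source data (for example with `cbind()`) would then assign each case to its cluster at that cut level.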

Customizing Cluster Analysis plots created with hclust

2004 Jul 01

I am trying to cluster stock prices through time using hclust. To help with the interpretation of the output I would like to change the colour of the lines and the labels based on which sector a stock is in. Is it possible to customize a plot of the output of hclust in this way? Any help much appreciated. Regards, Tom Joy

Cluster analysis: dissimilar results between R and SPSS

2010 Apr 26

Hello everyone! My data is composed of 277 individuals measured on 8 binary variables (1=yes, 2=no). I did two similar cluster analyses, one on SPSS 18.0 and one on R 2.9.2. The objective is to have the means for each variable per retained cluster. 1) the R analysis ran as follows: > call data > dist=dist(data,method="euclidean") >

cluster analyses

2002 Apr 29

I'm clustering rather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the

cluster analysis

2004 Oct 15

Hello. I wonder if anyone can help me with this. I'm performing cluster analysis by using hclust in the stats package. My data are contained in a data frame with 10 columns, named "drops". First I create a distance matrix using dist: distanze <- dist(drops) Then I perform cluster analysis via hclust: clusters <- hclust(distanze) At this point I want to view the tree
