Displaying 20 results from an estimated 5000 matches similar to: "Hierarchical clustering with centroid method"
2013 Mar 28
2
hierarchical clustering with Pearson's coefficient
Hello,
I want to use Pearson's correlation as the distance between observations and
then use a centroid-based linkage (e.g., Ward's method).
When linkage distances are expressed in the Lance-Williams recursive
formulation, they only require the initial distances between observations.
See here: http://en.wikipedia.org/wiki/Ward%27s_method
It is said that you have to use Euclidean
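A minimal sketch of this setup, assuming the observations live in the rows of a matrix (method = "ward.D2" needs R >= 3.1.0; note that Ward's criterion is derived for Euclidean distances, so applying it to a correlation-based dissimilarity is a heuristic):
# Pearson correlation as a dissimilarity, then Ward linkage
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)    # 8 observations in rows, 5 variables
d <- as.dist(1 - cor(t(m)))         # 1 - Pearson correlation between rows
hc <- hclust(d, method = "ward.D2")
plot(hc)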
2010 May 25
1
Hierarchical clustering using own distance matrices
Hey Everyone!
I wanted to carry out hierarchical clustering using distance matrices I have
calculated (instead of Euclidean distance etc.).
I understand as.dist is the function for this, but the distances in the
dendrogram I got by using the following script (1) were not the distances
defined in my distance matrices.
script:
var<-read.table("the distance matrix i calculated",
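A self-contained sketch of this workflow, assuming the file holds a full square, symmetric distance matrix (the file name is a placeholder). Note that the heights in the dendrogram are the linkage (merge) criteria, not the raw pairwise distances, which is the usual reason they look different from the input:
m  <- as.matrix(read.table("my_distances.txt"))  # hypothetical file
d  <- as.dist(m)                     # uses the lower triangle of m
hc <- hclust(d, method = "average")
plot(hc)                             # heights = merge criteria, not raw distances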
2011 Jul 27
0
Inversions in hierarchical clustering were they shouldn't be
Hi,
I'm using heatmap.2 to cluster my data, using the centroid method for clustering and the maximum method for calculating the distance matrix:
library("gplots")
library("RColorBrewer")
test <- matrix(c(0.96, 0.07, 0.97, 0.98, 0.50, 0.28, 0.29, 0.77,
0.08, 0.96, 0.51, 0.51, 0.14, 0.19, 0.41, 0.51),
ncol=4, byrow=TRUE)
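A sketch of how this setup is typically wired together, continuing the snippet above (the distfun/hclustfun wrapping is the assumption here). Inversions are expected with centroid linkage, since its merge heights are not monotone; they are not necessarily a bug:
heatmap.2(test,
          distfun   = function(x) dist(x, method = "maximum"),
          hclustfun = function(d) hclust(d, method = "centroid"))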
2001 Jun 12
1
cophenetic matrix
Hello,
I analyse some free-sorting data so I use hierarchical
clustering.
I want to compare my proximity matrix with the tree
representation to evaluate the fit (stress, cophenetic correlation
(Pearson's correlation), ...).
"The cophenetic similarity of two objects a and b is defined as the
similarity level at which objects a and b become members of the same
cluster during the course of
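A minimal sketch of the usual computation, on an illustrative data set: build the tree, extract the cophenetic distances it implies, and correlate them with the input proximities:
d  <- dist(USArrests)               # example proximity matrix
hc <- hclust(d, method = "average")
cp <- cophenetic(hc)                # cophenetic distances implied by the tree
cor(d, cp)                          # Pearson correlation; nearer 1 = better fit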
2007 Nov 28
2
Clustering
Hello all!
I am performing some clustering analysis on microarray data using
agnes{cluster} and I have created my own dissimilarity matrix according to a
distance measure different from "euclidean" or "manhattan" etc. My question
is, if I choose for example method="complete", how are the distances
between the elements calculated? Are they taken from the dissimilarity
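A sketch, assuming the dissimilarity is passed with diss = TRUE: agnes then works on the supplied values directly, and complete linkage takes the largest supplied dissimilarity over all pairs spanning the two clusters; nothing Euclidean is recomputed:
library(cluster)
m  <- matrix(rnorm(30), nrow = 6)
d  <- as.dist(1 - cor(t(m)))         # any custom dissimilarity
ag <- agnes(d, diss = TRUE, method = "complete")
plot(ag)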
2004 Oct 11
2
hclust title and paste - messed up
I use the following code to scan a (limited) parameter space of clustering
strategies ...
data <- read.table(...
dataTranspose <- t(data)
distMeth <- c("euclidean",
"maximum",
"manhattan",
"canberra",
"binary"
)
clustMeth <- c("ward",
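A sketch completing the scan loop (the data source is a stand-in, and "ward.D" assumes R >= 3.0.3; older versions spell it "ward"); paste() builds one title per distance/linkage combination:
dataTranspose <- t(matrix(rnorm(200), nrow = 10))   # stand-in for the real data
distMeth  <- c("euclidean", "maximum", "manhattan", "canberra", "binary")
clustMeth <- c("ward.D", "single", "complete", "average", "centroid")
for (dm in distMeth) {
  for (cm in clustMeth) {
    hc <- hclust(dist(dataTranspose, method = dm), method = cm)
    plot(hc, main = paste("dist =", dm, "/ hclust =", cm))
  }
}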
2008 Mar 08
1
Elbow criterion plots for determining k in hierarchical clustering
Hi There,
I'm working on some cluster analyses on a large data set using hclust with
Ward's method and Manhattan (city-block) distance measures. I've created
dendrograms to illustrate the clustering criteria, but would like to create
a plot showing the classic elbow criterion to help determine the
best number of clusters. Ideally I'd like to plot percent variance
explained
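A sketch of one common way to build such a plot, on an illustrative data set ("ward.D" assumes R >= 3.0.3): cut the tree at each candidate k and plot the within-cluster sum of squares against k, looking for the bend:
x  <- scale(USArrests)
hc <- hclust(dist(x, method = "manhattan"), method = "ward.D")
wss <- sapply(1:10, function(k) {
  cl <- cutree(hc, k)
  sum(sapply(split(as.data.frame(x), cl),
             function(g) sum(scale(g, scale = FALSE)^2)))  # within-cluster SS
})
plot(1:10, wss, type = "b",
     xlab = "number of clusters", ylab = "within-cluster sum of squares")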
2004 Jul 13
0
Calculating sum of squares deviation between 2 similar matrices
Hi all,
I've got clusters and would like to match individual records to each
cluster based on a sum of squares deviation. For each cluster and
individual, I've got 50 variables to use (measured in the same way).
Matrix 1 is individuals and is 25000x50. Matrix 2 is the cluster
centroids and is 100x50. The same variables are found in each matrix
in the same order. I'd like to
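A vectorized sketch with shrunken stand-in sizes, using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b to get every individual-to-centroid sum of squares in one matrix product:
ind <- matrix(rnorm(200 * 50), nrow = 200)   # stand-in for the 25000 x 50 matrix
cen <- matrix(rnorm(10  * 50), nrow = 10)    # stand-in for the 100 x 50 matrix
ss  <- outer(rowSums(ind^2), rowSums(cen^2), "+") - 2 * ind %*% t(cen)
best <- max.col(-ss)                         # closest centroid per individual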
2003 Sep 14
1
title for plot contain 4 subplots
Hi,
I'm plotting 4 graphs on one page (a 2x2 matrix) but I can't seem to get
the title for the whole page right.
I'm doing:
op <- par(mfrow = c(2,2), pty="s")
hist(var$V2, breaks="FD",main="Euclidean Metric", xlab="Sum of 3NN ...
hist(var$V2, breaks="FD",main="Manhattan Metric", xlab="Sum of 3NN ...
hist(var$V2,
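A sketch of the usual fix (the histogram calls are illustrative): reserve outer-margin space with oma, then write the page title there with mtext(outer = TRUE):
op <- par(mfrow = c(2, 2), pty = "s", oma = c(0, 0, 3, 0))
for (i in 1:4) hist(rnorm(100), main = paste("panel", i))
mtext("Overall page title", outer = TRUE, cex = 1.3)  # title for the whole page
par(op)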
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users,
When I run the following code, R crashes:
require(cclust)
x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE)
cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans")
While this works:
cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans")
I'm posting this here because I am not sure if it is a bug.
I've been searching
2012 Jul 04
1
Error in hclust?
Dear R users,
I have noted a difference in the merge distances given by hclust using the
centroid method.
For the following data:
x<-c(1009.9,1012.5,1011.1,1011.8,1009.3,1010.6)
and using Euclidean distance, hclust using centroid method gives the
following results:
> x.dist<-dist(x)
> x.aah<-hclust(x.dist,method="centroid")
> x.aah$merge
[,1] [,2]
[1,] -3 -6
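A sketch of the usual explanation: hclust's "centroid" (and "median") method expects squared Euclidean distances, so feeding it plain distances gives odd merge heights. With squared input, the heights come back on the squared scale and can be square-rooted for reporting:
x <- c(1009.9, 1012.5, 1011.1, 1011.8, 1009.3, 1010.6)
x.dist <- dist(x)^2                  # squared Euclidean distances
x.aah  <- hclust(x.dist, method = "centroid")
sqrt(x.aah$height)                   # heights back on the original scale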
2011 Jul 24
0
setting distance matrix and clustering methods in heatmap.2
heatmap.2 defaults to dist for calculating the distance matrix and hclust for
clustering.
Does anyone know how I can set dist to use the euclidean method and hclust to
use the centroid method?
I provide runnable sample code below.
I tried: distfun = dist(method = "euclidean"),
but that doesn't work. Any ideas?
library("gplots")
library("RColorBrewer")
test
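A sketch of the usual fix (the test matrix is a placeholder for the truncated data): distfun and hclustfun must be passed as functions, not as function calls, so wrap the desired arguments in anonymous functions:
test <- matrix(runif(16), ncol = 4)   # placeholder for the original data
heatmap.2(test,
          distfun   = function(x) dist(x, method = "euclidean"),
          hclustfun = function(d) hclust(d, method = "centroid"))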
2009 Dec 10
1
question about centroid-linkage (cluster analysis)
Dear R community,
I would be grateful if somebody could shed light on the following.
I have created a set of 6 points to check how centroid
agglomeration works in cluster analysis:
> Y <- data.frame(x=c(-1,1,1,-1,10,12),y=c(1,1,-1,-1,0,0))
It is quite intuitive to understand that the last clusters to be joined will be
{1,2,3,4} with {5,6}. Now, the centroid for the first cluster has
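A sketch checking the arithmetic by hand: the centroid of {1,2,3,4} is (0, 0) and the centroid of {5,6} is (11, 0), so the final centroid-linkage merge should occur at distance 11 (or 121 if the clustering runs on squared distances):
Y  <- data.frame(x = c(-1, 1, 1, -1, 10, 12), y = c(1, 1, -1, -1, 0, 0))
c1 <- colMeans(Y[1:4, ])             # (0, 0)
c2 <- colMeans(Y[5:6, ])             # (11, 0)
sqrt(sum((c1 - c2)^2))               # 11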
2009 Mar 29
1
[cluster package question] What is the "sum of the dissimilarities" in the pam command ?
Hello Martin Maechler and All,
A simple question (I hope):
How can I compute the "sum of the dissimilarities" that appears in the pam
command (from the cluster package) ?
Is it the "manhattan" distance (such as the one implemented by "dist") ?
I am asking since I am running clustering on a dataset. I found 7 medoids
with the pam command, and from it I have the
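A sketch recomputing that quantity by hand, on an illustrative data set: it is the total dissimilarity of each object to its assigned medoid, in whatever metric the fit used (Euclidean by default, not necessarily manhattan):
library(cluster)
x  <- scale(USArrests)
pf <- pam(x, k = 3)
d  <- as.matrix(dist(x))             # same metric pam used by default
sum(d[cbind(seq_len(nrow(x)), pf$id.med[pf$clustering])])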
2008 Jul 03
1
Optimal initial centroid in kmeans
Hello there. I am using kmeans from the base package to cluster my customers. As
the result of kmeans depends on the initial centroids, may I know:
1) how can we specify the centroids in the R function? (I don't want a random
starting point)
2) how to determine the optimal (or at least a good) centroid to start with? (I
am not after the fixed-seed solution, as it only ensures that the
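A sketch answering (1), with a hand-picked start assumed for illustration; for (2), a common heuristic is running many random starts via nstart and keeping the best fit:
x     <- scale(USArrests)
start <- x[c(1, 10, 25), ]              # hand-picked starting centroids
km1   <- kmeans(x, centers = start)     # deterministic, user-specified start
km2   <- kmeans(x, centers = 3, nstart = 25)  # best of 25 random starts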
1999 Jan 20
0
dist(*, "euclidean") [was "dist function suggestion"]
> BDR> You will need to call it something else: dist is a clone of an S
> BDR> function, and dist(X, "manhattan") is well-established usage.
>
> one could still imagine an extra Y argument such that
> dist(X, Y=myY, method="euclidean")
> and dist(X, "euclidean", Y=myY)
> would work
> one could even make it such that
> both
2012 Nov 18
1
centroid of hclust
Dear useRs, I want to find the centroids of clusters that I generated with hclust. Is there a way of doing that? I took the mean of the elements in each cluster to get the centroid, but I am not sure if that is right.
Thanks in advance, Eliza
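A sketch confirming the approach: for Euclidean data the centroid of each cluster is exactly the per-cluster mean, e.g. via cutree plus aggregate (data set and k are illustrative):
x  <- scale(USArrests)
hc <- hclust(dist(x), method = "average")
cl <- cutree(hc, k = 4)
centroids <- aggregate(as.data.frame(x), by = list(cluster = cl), FUN = mean)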
2013 Jan 30
0
betadisper plot
Hello,
I tried to make a betadisper plot; however, it is quite messy at the moment with lines and symbols.
I made two plots, one focusing on sites and the other on treatments.
This is the code that I used:
plot(betadisper(vegdist(y.nth,method="euclidean"),site))
plot(betadisper(vegdist(y.nth,method="euclidean"),treatment))
I have a few questions pertaining to how I could
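One way to declutter such a plot is sketched below, on a vegan example data set with a made-up grouping; the hull/ellipse/label/segments arguments assume a recent vegan release (see ?plot.betadisper):
library(vegan)
data(varespec)
grp <- factor(rep(c("grazed", "ungrazed"), each = 12))  # illustrative grouping
mod <- betadisper(vegdist(varespec, method = "euclidean"), grp)
plot(mod, hull = FALSE, ellipse = TRUE,   # ellipses instead of hulls
     label = FALSE, segments = FALSE)     # drop labels and spokes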
2012 Oct 21
1
Linear discriminant function analysis based on the median as group centroid, and nonparametric scale estimators?
Dear All,
I am using a specific approach for my master's thesis. In essence, a
supervised reclassification is used as an intermediate step to find chemical
parameters which are able to reclassify defined groups. These variables will
be used in a subsequent step where location and scale estimators of the groups
are important. Traditionally, linear discriminant analysis is used for
reclassification, which
2013 Nov 10
0
Mark each group centroid in a linear discriminant analysis plot
Hi,
How can I calculate and mark each group centroid in a linear discriminant
analysis plot (using ggplot2)?
Script:
## originates from
http://r.789695.n4.nabble.com/LDA-and-confidence-ellipse-td4671308.html
require(MASS)
require(ggplot2)
iris.lda<-lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length +
Petal.Width, data = iris)
LD1<-predict(iris.lda)$x[,1]
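A sketch completing the script: pull both discriminant scores, compute each group's centroid in LD space with aggregate, and overlay them as large markers (everything beyond the snippet is assumed):
scores <- data.frame(predict(iris.lda)$x, Species = iris$Species)
cents  <- aggregate(cbind(LD1, LD2) ~ Species, data = scores, FUN = mean)
ggplot(scores, aes(LD1, LD2, colour = Species)) +
  geom_point() +
  geom_point(data = cents, size = 5, shape = 3)   # group centroids as crosses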