Displaying 20 results from an estimated 9000 matches similar to: "Simple clustering help"
2011 Jul 27
0
Inversions in hierarchical clustering where they shouldn't be
Hi,
I'm using heatmap.2 to cluster my data, using the centroid method for clustering and the maximum method for calculating the distance matrix:
library("gplots")
library("RColorBrewer")
test <- matrix(c(0.96, 0.07, 0.97, 0.98, 0.50, 0.28, 0.29, 0.77,
0.08, 0.96, 0.51, 0.51, 0.14, 0.19, 0.41, 0.51),
ncol=4, byrow=TRUE)
2013 Mar 28
2
hierarchical clustering with pearson's coefficient
Hello,
I want to use pearson's correlation as distance between observations and
then use any centroid-based linkage distance (e.g. Ward's distance).
When linkage distances are formed as the Lance-Williams recursive
formulation, they just require the initial distance between observations.
See here: http://en.wikipedia.org/wiki/Ward%27s_method
It is said that you have to use euclidean
2007 Nov 28
2
Clustering
Hello all!
I am performing some clustering analysis on microarray data using
agnes{cluster} and I have created my own dissimilarity matrix according to a
distance measure different from "euclidean" or "manhattan" etc. My question
is, if I choose for example method="complete", how are the distances
between the elements calculated? Are they taken from the dissimilarity
2014 Jul 25
0
clustering with hclust
Hi everybody, I have a problem with a cluster analysis.
I am trying to use hclust, method=ward.
The Ward method works with SQUARED Euclidean distances.
Hclust demands "a dissimilarity structure as produced by dist".
Yet, dist does not seem to produce a table of squared euclidean distances,
starting from cosines.
In fact, computing manually the squared euclidean distances from cosines
2001 Apr 27
0
weighted clustering (was: Re: problems with a large data set)
kmeans and clara work great. Thank you for the tip.
I have another question:
Is it possible to weight the observations in a cluster analysis ? I haven't
found any mention of this in the kmeans or clara help texts.
Moritz Lennert
Chargé de recherche
IGEAT - ULB
tél: 32-2-650.65.16
fax: 32-2-650.50.92
email: mlennert at ulb.ac.be
> On Wed, 25 Apr 2001, Moritz Lennert wrote:
>
2011 May 16
1
pam() clustering for large data sets
Hello everyone,
I need to do k-medoids clustering for data which consists of 50,000
observations. I have computed distances between the observations
separately and tried to use those with pam().
I got the "cannot allocate vector of length" error and I realize this
job is too memory intensive. I am at a bit of a loss on what to do at
this point.
I can't use clara(), because I
2011 Jul 24
0
setting distance matrix and clustering methods in heatmap.2
heatmap.2 defaults to dist for calculating the distance matrix and hclust for
clustering.
Does anyone know how I can set dist to use the euclidean method and hclust to
use the centroid method?
I provide compilable sample code below.
I tried: distfun = dist(method = "euclidean"),
but that doesn't work. Any ideas?
library("gplots")
library("RColorBrewer")
test
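A possible reading of why `distfun = dist(method = "euclidean")` fails in the post above: that expression calls `dist` immediately, with no data, instead of handing heatmap.2 a function. A minimal sketch of the wrapper-function approach, using base R only; the names `dist_euclid` and `hclust_centroid` are made up for illustration, and the matrix is arbitrary sample data:

```r
# Arbitrary 4 x 4 sample matrix (illustrative data only).
test <- matrix(c(0.96, 0.07, 0.97, 0.98, 0.50, 0.28, 0.29, 0.77,
                 0.08, 0.96, 0.51, 0.51, 0.14, 0.19, 0.41, 0.51),
               ncol = 4, byrow = TRUE)

# A distance wrapper: takes a matrix, returns a "dist" object.
# (hypothetical helper name)
dist_euclid <- function(x) dist(x, method = "euclidean")

# A clustering wrapper: takes a "dist" object, returns an "hclust" tree.
# (hypothetical helper name)
hclust_centroid <- function(d) hclust(d, method = "centroid")

# Chain the two wrappers by hand to confirm they fit together.
tree <- hclust_centroid(dist_euclid(test))
print(tree$merge)
```

Assuming gplots is installed, the same two functions could then be passed along as `heatmap.2(test, distfun = dist_euclid, hclustfun = hclust_centroid)`.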
2003 May 07
1
-means, hybrid clustering or similar implementations on R
Hi,
I would like to know if someone knows an extended implementation of k-means in R to find appropriate number of clusters for a given k-dimensional data.
Also, I am working on clustering for forecasting, if someone is interested or has knowledge on implementational details please mail me, I would appreciate it.
Regards
Skanda Kallur
"Cogito, ergo sum" (I think, therefore I
2013 Dec 07
1
How to perform clustering without removing rows where NA is present in R
I have data that contain some NA values.
What I want to do is to **perform clustering without removing rows**
where an NA is present.
I understand that the `gower` distance measure in `daisy` allows for this.
But why doesn't my code below work?
__BEGIN__
# plot heat map with dendrogram together.
library("gplots")
library("cluster")
2010 May 25
1
Hierarchical clustering using own distance matrices
Hey Everyone!
I wanted to carry out hierarchical clustering using distance matrices I have
calculated (instead of Euclidean distance etc.).
I understand as.dist is the function for this, but the distances in the
dendrogram I got by using the following script (1) were not the distances
defined in my distance matrices.
script:
var<-read.table("the distance matrix i calculated",
2006 Jun 14
1
simple-rss caching
The index page of my rails app grabs an rss feed from a neighboring news
site. Unfortunately, the process of grabbing that feed seems to be
slowing down the initial load time of my site to the point where it
takes about 10-12 seconds to respond and render.
I'd like to speed that up somehow (for 8-10 seconds it looks like my
server is not responding at all..) Any suggestions?
I
2017 Aug 17
0
PAM Clustering
Sorry, I never use pam. In the help, you can see that pam requires a
data frame OR a dissimilarity matrix. If diss=FALSE then "euclidean" is used. So,
I interpret that a dissimilarity matrix is generated automatically.
The problem may be in your data. Indeed
pam(ruspini, 4)$diss
writes a dissimilarity matrix
while
pam(MYdata,10)$diss
writes NULL
2017-08-17 16:03 GMT+02:00 Sema Atasever
2016 Jul 27
2
K MEANS clustering
Hey Parth,
Thanks for the reply.
I am considering implementing a cosine distance metric too, along with
Euclidean distance because of the dimensionality issue that comes in with
K-Means and the Euclidean distance metric.
That does help when we deal with sparse vectors for documents. The
particular problem I'm having is representing centroids in an efficient way.
For example, when we find the mean
2017 Aug 17
2
PAM Clustering
Dear Germano,
Thank you for your fast reply,
In the above code, MYData is the actual data set.
Don't we need to convert MYData to a dissimilarity matrix using the
pam(as.dist(MYData), k = 10, diss = TRUE) code line?
Regards.
On Thu, Aug 17, 2017 at 2:58 PM, Germano Rossi <germano.rossi at gmail.com>
wrote:
> try this
>
> MYdata <-
2010 Aug 18
1
Plotting K-means clustering results on an MDS
Hello All,
I'm having some trouble figuring out the clearest way to plot my
k-means clustering result on my existing MDS.
First I performed MDS on my distance matrix (note: I performed k-means on
the MDS coordinates because applying a euclidean distance measure to my raw
data would have been inappropriate)
canto.MDS<-cmdscale(canto)
I then figured out what would be my optimum
2007 Jun 13
2
Formatted Data File Question for Clustering -Quickie Project
I am trying to learn how to format Ascii data files for scan or read
into R.
Precisely for a quickie project, I found some code (at end of this
email) to do exactly what I need:
To cluster and graph a dendrogram from package (stats).
I am stuck on how to format a text file to run the script.
I looked at the dataset USArrests (which would be replaced by my data
and labels) using UltraEdit. That
2004 Dec 13
1
Simple Samba connection question to new Active Directory
Hello all!
I currently have a small Windows NT 4 domain (named OLD_NETWORK).
All files are stored on a UNIX server (running Solaris) running
Samba 2.2. Runs perfect. No problems. Samba's only job in my network is JUST
TO STORE AND SERVE OUT FILES to PCs. Samba does not run as a PDC. Merely
validates valid users to get their files off UNIX server.
I believe this is the simplest possible
2002 Jul 18
0
Plotting Clustering Groups Separately
Hi
As a beginner with R I have been trying to plot dendrograms for individual
groups after using cutree.
The example in the help files appears to work fine for Euclidean distances
using the "average" clustering method. However, when I use the "Ward" method
the reprocessed subgroup does not appear to have the same structure as
it did when the whole dataset was processed.
Is
2005 Jul 26
0
Hierarchical clustering with centroid method
Dear everybody!
In the function hclust, at each stage distances between clusters are recomputed by the Lance-Williams dissimilarity update formula according to the
particular clustering method being used.
With the "centroid" method, the Lance-Williams recurrence formula works properly only for Euclidean distance.
How can the centroid method be used properly with Manhattan distance?
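As background to the question above (and not an answer to the Manhattan case), the hclust help page says the centroid method is meant to be used with squared Euclidean distances. A minimal sketch on made-up random data, checking one centroid update by hand; all variable names are illustrative:

```r
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)        # 20 made-up points in 2-D

# Squared Euclidean distances, then centroid linkage.
d2 <- dist(x, method = "euclidean")^2
hc <- hclust(d2, method = "centroid")

# The first merge always joins two single observations, which hclust
# records as negative row indices in hc$merge.
pair <- -hc$merge[1, ]
i <- pair[1]
j <- pair[2]
k <- setdiff(seq_len(nrow(x)), c(i, j))[1]   # any other observation

# Lance-Williams centroid update for two singletons i and j versus k.
m <- as.matrix(d2)
lw <- 0.5 * m[i, k] + 0.5 * m[j, k] - 0.25 * m[i, j]

# Direct squared Euclidean distance from k to the centroid of {i, j}.
direct <- sum((x[k, ] - (x[i, ] + x[j, ]) / 2)^2)

print(isTRUE(all.equal(lw, direct)))
```

The update formula reproduces the true distance to the centroid here because the inputs are squared Euclidean distances; the identity is not guaranteed for other metrics, which is the difficulty the post raises.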
2003 Sep 08
2
Re: clustering polypeptide sequences
Hi Peter,
You didn't give a very specific example, but it seems to me that what
you wish to do is not really complicated. I suppose you have created a
table of sequences vs. say hydrophobicity, charge, etc..., something like...
seq     hydroph     arom
b0001    0.104762   0.000000
b0002    0.035122   0.065854
b0003    0.024193   0.070968
b0004   -0.096729   0.084112
b0005   -0.973469   0.091837
b0006