Displaying 20 results from an estimated 3000 matches similar to: "Correct use of the cluster::daisy function"
2013 Dec 08
3
Why daisy() in cluster library failed to exclude NA when computing dissimilarity
Hi,
According to the documentation for the daisy function in the cluster package,
it can compute dissimilarities when NA (missing) values are present.
http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/daisy.html
But why when I tried this code
library(cluster)
x <- c(1.115,NA,NA,0.971,NA)
y <- c(NA,1.006,NA,NA,0.645)
df <- as.data.frame(rbind(x,y))
daisy(df,metric="gower")
It gave this
2011 Jun 16
1
Specify ID variable in daisy{cluster}
Hi All - I am using the daisy function from the cluster library to create a
dissimilarity matrix. I'm going to use that matrix to run a cluster
analysis. My participants are identified with the variable, hhid. However,
when I try to keep hhid in the dataset that I use to create the
dissimilarity matrix, daisy uses it to create the matrix rather than
ignoring it as an ID variable. I need to
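
One common way to keep an ID out of the computation is sketched below. It assumes the ID can be moved into the row names and the column dropped before calling daisy(); the data frame and every column name except hhid are made up for illustration.

```r
library(cluster)

# Hypothetical data: hhid identifies the household, the rest are analysis variables.
households <- data.frame(
  hhid   = c(101, 102, 103, 104),
  income = c(35, 52, 41, 78),
  size   = c(2, 4, 3, 5),
  tenure = factor(c("own", "rent", "rent", "own"))
)

# Keep the ID as row names, then drop the ID column so daisy() never sees it.
rownames(households) <- as.character(households$hhid)
vars <- households[, setdiff(names(households), "hhid")]

d <- daisy(vars, metric = "gower")

# The row names travel with the dissimilarity object as its labels.
attr(d, "Labels")
```

Because the labels stay attached to the result, cluster assignments from a later pam() or agnes() call can still be matched back to hhid.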
2006 Jan 05
0
more on the daisy function
Dear R-helpers,
First of all, a happy new year to everyone!
I successfully used the daisy function (from package cluster) to find which two
rows from a dataframe differ by only one value, and I now want to come up with
a simpler way to find _which_ value makes the difference between any such
pair of two rows.
Consider a very small example (the actual data counts thousands of rows):
input
2004 Jun 29
1
PAM clustering: using my own dissimilarity matrix
Hello,
I would like to use my own dissimilarity matrix in a PAM clustering with
method "pam" (cluster package) instead of a dissimilarity matrix created
by daisy.
I read data from a file containing the dissimilarity values using
"read.csv". This creates a matrix (alternatively: an array or vector)
which is not accepted by "pam": A call
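
A minimal sketch of the usual route, assuming the values read from the file form a full, symmetric square matrix: convert it with as.dist() and tell pam() that the input is already a dissimilarity. The matrix below is invented for illustration.

```r
library(cluster)

# Hypothetical symmetric dissimilarity matrix, standing in for values read via read.csv().
m <- matrix(c(0, 1, 5, 6,
              1, 0, 4, 5,
              5, 4, 0, 1,
              6, 5, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

# as.dist() keeps the lower triangle and returns an object of class "dist".
d <- as.dist(m)

# diss = TRUE tells pam() to treat d as dissimilarities rather than raw data.
fit <- pam(d, k = 2, diss = TRUE)
fit$clustering
```

If the file is read with read.csv(), the resulting data frame would first need as.matrix() before as.dist() can be applied.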
2010 Nov 06
0
variable type assignment in daisy
Dear Rhelp,
I did a daisy on 5 lifestyle variables, 3 of which were nominal and 2 were ordinal and assigned types “nominal” and “ordinal” for the variables, respectively. I got an output indicating their types as “I” for interval(?). Doing it on the Rdata example “flower” gave the same types in the output as the types they were assigned to. Why is this so? Below are the codes and outputs.
2006 Mar 20
1
type in daisy
Hi,
I'm a PhD student and I want to use the function 'daisy' from the
package 'cluster' to compute dissimilarities.
My variables are of mixed types so I use the argument 'stand' in daisy
to define the type of my variables.
I have the following error message :
Warning message:
binary variable(s) 13, 16, 17, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35,
2001 Jan 09
2
PAM clustering (using triangular matrix)
Hi,
I'm trying to use a similarity matrix (triangular) as input for pam() or
fanny() clustering algorithms.
The problem is that these algorithms can only accept a dissimilarity
matrix, normally generated by daisy().
However, daisy only accepts a 'data matrix or dataframe. Dissimilarities
will be computed between the rows of x'.
Is there any way to say that your data are already a
2010 Aug 26
1
daisy(): space allocation issue
Hi,
I'm trying to apply the function daisy() to a 10000x10 data.frame but I do
not have enough space (error message: cannot allocate vector of length
1476173280).
I did not expect to be unable to work with a matrix of just 10000
observations... I have set --max-mem-size=2G in Rgui (I'm not able to set
more space).
How can I solve this issue? Separating observations depending on
2007 Feb 22
0
daisy function in cluster- coerced NAs
I am currently using the function daisy in package cluster to create a
dissimilarity matrix because my multivariate dataset contains missing
data and variables of various types including factors, symmetric and
asymmetric binary and quantitative. This is a step prior to using pco
within ecodist.
There is a warning which comes twice
">NAs introduced by coercion"
I've used
2006 Sep 26
0
calculating dissimilarities in R
Dear All,
I've got a statistical question on calculating
dissimilarities in R.
I want to calculate the different types of dissimilarities
on the 'flower' dataset found in the package
'cluster'. Flower is a data frame with 18 observations
on 8 variables. Variable 1 and 2 are binary, variable 3 is
asymmetric binary, variable 4 is nominal, variable 5 and 6
are ordered and variable 7 and 8 are
2009 Nov 10
2
All possible combinations of functions within a function
Dear All,
I wrote a function for cluster analysis to compute cophenetic correlations
between dissimilarity matrices (using the VEGAN library) and cluster
analyses of every possible clustering algorithm (SEE ATTACHED)
http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now,
it is extremely long, and for the future I was hoping to find a more
efficient way of doing this sort of
2010 May 07
0
Cluster procedure using geographical neighborhood
Dear Dario Sacco,
>>>>> "DS" == Dario Sacco <dario.sacco at unito.it>
>>>>> on Thu, 06 May 2010 17:45:30 +0200 writes:
DS> Dear Dr. Maechler,
DS> I am an agronomist and a researcher at the University of Turin. I am
DS> also teaching "Applied statistics", then I have some knowledge in
DS> Statistics, but not
2006 Apr 07
1
fuzzy classification and dissimilarity matrix
Hello,
I want to make a fuzzy classification from a dissimilarity matrix
(calculated with daisy from package 'cluster'). I have tried to use
fanny (package cluster) but I have the same problems as those described in a
previous message
(http://tolstoy.newcastle.edu.au/R/help/05/05/4546.html) i.e. it always
gives me two clusters in the results (even if k is different from 2)
with the same
2004 Feb 06
2
Converting a Dissimilarity Matrix
Hi all,
I'm trying to perform a hierarchical clustering on some
dissimilarity data that I have but the data matrix I have already
contains the dissimilarity values. These values are calculated using
a separate program. The dissimilarity matrix is complete with no
missing values but the hclust and agnes routines require it in the
form produced by daisy or dist. Is there any way of converting
2008 Oct 13
1
Gower distance between a individual and a population
Hi the list,
I need to compute the Gower distance between a specific individual and all
the other individuals.
The function daisy from package cluster computes all the pairwise
dissimilarities of a population. If the population is N individuals,
that is around N^2 distances to compute.
I need to compute the distance between a specific individual and all
the other individuals, that is only N
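
For purely numeric data without missing values, the one-versus-all Gower distance can be sketched by hand: take absolute differences to the chosen row, divide each column by its range, and average across columns. The helper gower_one_vs_all() below is hypothetical, not a function from the cluster package, and it does not handle factors, NAs, or constant columns.

```r
library(cluster)

# Hypothetical helper, not part of the cluster package.
# Assumes every column is numeric, non-constant, and free of NAs.
gower_one_vs_all <- function(x, i) {
  x <- as.matrix(x)
  col_range <- apply(x, 2, function(v) diff(range(v)))  # range of each variable
  abs_diff <- abs(sweep(x, 2, x[i, ]))                  # |x_jf - x_if| for all rows j
  rowMeans(sweep(abs_diff, 2, col_range, "/"))          # average of range-scaled differences
}

# Made-up population of five individuals.
pop <- data.frame(height = c(150, 160, 170, 180, 190),
                  weight = c(50, 65, 60, 80, 90))

d_one <- gower_one_vs_all(pop, 1)

# Cross-check against the first row of the full daisy() result.
d_all <- as.matrix(daisy(pop, metric = "gower"))[1, ]
all.equal(unname(d_one), unname(d_all))
```

This computes N values instead of the N(N-1)/2 pairs that daisy() stores, which is the saving the question is after.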
2002 Apr 29
2
cluster analyses
I'm clustering rather large data sets and would like to cut the dendrograms
to get a better view of specific components. I calculate the dissimilarity
matrix using daisy() because I have a mixture of variable types: factors,
ordered factors and numerical variables. If I want one dendrogram, I use
agnes() for the agglomerative nesting and pltree() to draw the dendrogram.
That way, I get the
2003 May 21
1
cluster- binary data.
Hi!
I am trying to calculate a dissimilarity matrix using daisy.
The matrix vectver is binary, as I test with:
> levels(as.factor(vectver))
[1] "0" "1"
But the call to daisy gives me the following error message.:
> dfl1 <- daisy(vectver, type = list(asymm = c(1:length(vectver[,1]))))
Error in daisy(vectver, type = list(asymm = c(1:length(vectver[, 1])))) :
at least
2002 May 20
1
R bug in cluster package (PR#1580)
I have apparently found an error in the "pam" function of the "cluster"
library package. Please pardon me if this error has been pointed out or
if this e-mail should be directed to someone else.
The problem only started occurring with R version 1.5.0, which I started
using about a week ago. The problem occurs when you try to use "pam"
with the input being a
2005 Sep 14
0
correlation as distance/dissimilarity
I've been asked (privately)
>>>>> "CarlosJ" == jaramilloc <jaramilloc at si.edu>
>>>>> on Wed, 14 Sep 2005 09:40:22 -0400 writes:
..........
CarlosJ> In Kaufman & Rousseeuw 2000 book on Cluster Analysis, it says that
CarlosJ> Daisy can compute Pearson correlation between variables and then
CarlosJ> transform
2004 Aug 12
2
error using daisy() in library(cluster). Bug?
Hi,
I'm using the cluster library to examine multivariate data.
The data come from a connection to a postgres database, and I did a short R
script to do the analysis. With the cluster version included in R 1.8.0, daisy
worked well for my data, but now, when I call daisy, I obtain the following
messages:
---------
Error in if (any(sx == 0)) { : missing value where TRUE/FALSE needed
In