Displaying 20 results from an estimated 2000 matches similar to: "daisy(): space allocation issue"
2010 May 26
3
cluster analysis and supervised classification: an alternative to knn1?
Hi,
I have 1,000 observations with 10 attributes (of different types: numeric,
dichotomous, categorical, etc.) and a measure M.
I need to cluster these observations in order to assign a new observation
(with the same 10 attributes but not the measure) to a cluster.
I want to calculate for the new observation a measure as the average of the
measures M of the observations in the cluster
2013 Dec 08
3
Why daisy() in cluster library failed to exclude NA when computing dissimilarity
Hi,
According to the documentation for the daisy function in the cluster package,
it can compute dissimilarities when NA (missing) values are present.
http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/daisy.html
But why when I tried this code
library(cluster)
x <- c(1.115,NA,NA,0.971,NA)
y <- c(NA,1.006,NA,NA,0.645)
df <- as.data.frame(rbind(x,y))
daisy(df,metric="gower")
It gave this
2006 Mar 20
1
type in daisy
Hi,
I'm a PhD student and I want to use the function 'daisy' from the
package 'cluster' to compute dissimilarities.
My variables are of mixed types so I use the argument 'stand' in daisy
to define the type of my variables.
I have the following error message :
Warning message:
binary variable(s) 13, 16, 17, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35,
2010 Jul 02
3
Knowledge discovery
Hi,
I have 100000 units with 10 attributes (attr1, attr2, attr3, etc...)
For instance:
unit  attr1  attr2  attr3  ...
1     a      ww     12
2     a      re     11
3     b      ww     09
4     c      yt     02
5     a      qw     02
...
I'd like to answer the following questions:
a) what are the most frequent combinations of attributes?
b) How could I describe the relations
2011 Jun 16
1
Specify ID variable in daisy{cluster}
Hi All - I am using the daisy function from the cluster library to create a
dissimilarity matrix. I'm going to use that matrix to run a cluster
analysis. My participants are identified with the variable, hhid. However,
when I try to keep hhid in the dataset that I use to create the
dissimilarity matrix, daisy uses it to create the matrix rather than
ignoring it as an ID variable. I need to
2006 Jan 05
0
more on the daisy function
Dear R-helpers,
First of all, a happy new year to everyone!
I successfully used the daisy function (from package cluster) to find which two
rows from a dataframe differ by only one value, and I now want to come up with
a simpler way to find _which_ value makes the difference between any such
pair of two rows.
Consider a very small example (the actual data contains thousands of rows):
input
2013 Jan 08
0
Correct use of the cluster::daisy function
Hi,
I have two groups, and I want to find the dissimilarity between the members
of the two groups. Since I have mixed level variables on the members, I opt
for the daisy function in the cluster package.
Let's pretend that the following represent my groups:
x <- data.frame(sex=factor(c(1,0,0,1,0,1),
levels=0:1,
labels=c('Male','Female'),
ordered=FALSE),
2010 Nov 06
0
variable type assignment in daisy
Dear Rhelp,
I ran daisy on 5 lifestyle variables, 3 of which were nominal and 2 ordinal, and assigned them the types “nominal” and “ordinal”, respectively. The output indicated their types as “I” for interval(?). Running it on the Rdata example “flower” gave the same types in the output as the types that were assigned. Why is this so? The code and output are below.
2001 Jan 09
2
PAM clustering (using triangular matrix)
Hi,
I'm trying to use a similarity matrix (triangular) as input for pam() or
fanny() clustering algorithms.
The problem is that these algorithms only accept a dissimilarity
matrix, normally generated by daisy().
However, daisy only accepts a 'data matrix or dataframe. Dissimilarities
will be computed between the rows of x'.
Is there any way to say to that your data are already a
2004 Jun 29
1
PAM clustering: using my own dissimilarity matrix
Hello,
I would like to use my own dissimilarity matrix in a PAM clustering with
method "pam" (cluster package) instead of a dissimilarity matrix created
by daisy.
I read data from a file containing the dissimilarity values using
"read.csv". This creates a matrix (alternatively: an array or vector)
which is not accepted by "pam": A call
2007 Feb 22
0
daisy function in cluster- coerced NAs
I am currently using the function daisy in package cluster to create a
dissimilarity matrix because my multivariate dataset contains missing
data and variables of various types including factors, symmetric and
asymmetric binary and quantitative. This is a step prior to using pco
within ecodist.
There is a warning which comes twice
">NAs introduced by coercion"
I've used
2004 Aug 12
2
error using daisy() in library(cluster). Bug?
Hi,
I'm using the cluster library to examine multivariate data.
The data come from a connection to a postgres database, and I wrote a short R
script to do the analysis. With the cluster version included in R 1.8.0, daisy
worked well for my data, but now, when I call daisy, I obtain the following
messages:
---------
Error in if (any(sx == 0)) { : missing value where TRUE/FALSE needed
In
2012 Oct 15
1
weighting variables using Gower with DAISY
Hello,
I am running DAISY in R and using the GOWER metric since I am working with
mixed variables. I am wondering if there is a way to weight the different
variables. I see that there is a weight value for Gower but do not know if
this is how to weight the different variables with different weighting
values. Please advise if there is a way to weight the different variables.
Thank you.
2012 Oct 18
0
want to count 2 NULLS as disimilar with DIANA/DAISY/GOWER
I am using DIANA/DAISY/GOWER. Some of my categorical data include NULLS.
When assessing dissimilarity, these NULLS are considered similar. I do not
want these NULLS to contribute towards similarity. Instead, is there a way
for these NULLS to each be considered different so as to contribute to
dissimilarity and not similarity? Also, I do not want to change these NULLS
in the data as I need them for
2008 Oct 13
1
Gower distance between a individual and a population
Hi the list,
I need to compute the Gower distance between a specific individual and all
the other individuals.
The function DAISY from package cluster computes all the pairwise
dissimilarities of a population. If the population has N individuals,
that is around N^2 distances to compute.
I need to compute the distance between a specific individual and all
the other individuals, that is only N
2001 Sep 14
1
converting numeric to ordered
Hello all,
would someone please tell me why the following code doesn't "work" (i.e. do
what I expected!).
I wanted to convert numeric variables in a matrix to ordered (for input as
nominal variables into the 'daisy' program).
Why does the following code seem to work, but the "is.ordered" command
reports that the variables are not ordered (factors)??
> xxx <-
2006 Apr 07
1
fuzzy classification and dissimilarity matrix
Hello,
I want to make a fuzzy classification from a dissimilarity matrix
(calculated with daisy from package 'cluster'). I have tried to use
fanny (package cluster) but I have the same problems as described in a
previous message
(http://tolstoy.newcastle.edu.au/R/help/05/05/4546.html) i.e. it always
gives me two clusters in the results (even if k is different from 2)
with the same
2003 May 21
1
cluster- binary data.
Hi!
I am trying to calculate a dissimilarity matrix using daisy.
The matrix vectver is binary, as I tested with:
> levels(as.factor(vectver))
[1] "0" "1"
But the call to daisy gives me the following error message:
> dfl1 <- daisy(vectver, type = list(asymm = c(1:length(vectver[,1]))))
Error in daisy(vectver, type = list(asymm = c(1:length(vectver[, 1])))) :
at least
2002 Apr 29
2
cluster analyses
I'm clustering rather large data sets and would like to cut the dendrograms
to get a better view of specific components. I calculate the dissimilarity
matrix using daisy() because I have a mixture of variable types: factors,
ordered factors and numerical variables. If I want one dendrogram, I use
agnes() for the agglomerative nesting and pltree() to draw the dendrogram.
That way, I get the
2009 Jul 14
2
Cluster analysis with missing data
Hi folks,
I tried hclust for the first time. Unfortunately, with missing data in my
data file, it doesn't seem to work. I found no information about how to
handle missing data.
Omitting all cases with missing values is not really an option, as I would
lose too many cases.
Thanks in advance
Holger