Displaying 20 results from an estimated 2000 matches similar to: "distance method in kmeans"
2011 May 17
1
simprof test using jaccard distance
Dear All,
I would like to use the simprof function (clustsig package), but the available distances do not include Jaccard distance, which is the most appropriate for presence/absence community data. Here is the core of the function:
> simprof
function (data, num.expected = 1000, num.simulated = 999, method.cluster = "average",
method.distance = "euclidean", method.transform =
2013 Jul 18
1
binary distance measure of the "dist" function in the "stats" package
Dear all:
I want to ask a question about the "binary" distance measure. As far as I
know, there are many binary distance measures, e.g., binary Jaccard distance,
binary Euclidean distance, and binary Bray-Curtis distance, etc. It is even
more confusing because many have more than one name. So, I want to know
the definite name of the binary distance measure of the "dist"
function
2006 Jul 09
2
distance in kmeans algorithm?
Hello.
Is it possible to choose the distance in the kmeans algorithm?
I have m vectors of n components and I want to cluster them using kmeans
algorithm but I want to use the Mahalanobis distance or another distance.
How can I do it in R?
If I use kmeans, I have no option to choose the distance.
Thanks in advance,
Arnau.
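One possible approach, sketched here under stated assumptions rather than as an answer taken from the thread: kmeans() in the stats package minimizes within-cluster sums of squares, which corresponds to Euclidean distance, and it has no distance argument. For a Mahalanobis metric based on a single covariance matrix for all the data, the vectors can be whitened first, so that Euclidean distance on the transformed data equals Mahalanobis distance on the original. The iris data, the choice of 3 clusters, and all variable names are illustrative assumptions.

```r
# Sketch: k-means under a (global) Mahalanobis metric via whitening.
set.seed(1)
x <- as.matrix(iris[, 1:4])   # example data: m vectors of n components (assumption)

S <- cov(x)                   # sample covariance of the whole data set
R <- chol(S)                  # upper-triangular factor with S = t(R) %*% R
z <- x %*% solve(R)           # whitened data: cov(z) is the identity matrix

# Euclidean k-means on z is k-means with Mahalanobis distance on x.
fit <- kmeans(z, centers = 3, nstart = 25)

# Check on one pair of rows: both values are squared distances.
d_whitened    <- sum((z[1, ] - z[2, ])^2)
d_mahalanobis <- mahalanobis(x[1, ], x[2, ], S)
```

Note that this uses one covariance matrix for every cluster; a per-cluster covariance would need a different method, such as a Gaussian mixture model.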
2010 Dec 28
3
Jaccard dissimilarity matrix for PCA
Hi
I have a large dataset, containing a wide range of binary variables.
I would like first of all to compute a Jaccard matrix, then do a PCA on this
matrix, so that I finally can do a hierarchical clustering on the principal
components.
My problem is that I don't know how to compute the Jaccard dissimilarity
matrix in R, which package to use, and so on...
Can anybody help me?
Alternatively
2006 Aug 07
5
kmeans and incomplete distance matrix concern
Hi there
I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created
i.e:
[
mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2),
dimnames = list(levels(DF$V1), levels(DF$V2)))
mat[cbind(DF$V1, DF$V2)] <- DF$V3
This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering.
My query
2004 May 28
6
distance in the function kmeans
Hi,
I want to know which distance is used by the function kmeans
and whether we can change this distance.
Indeed, with the function pam, we can pass a distance matrix as
an argument (by the line "pam<-pam(dist(matrixdata),k=7)"), but
we can't do that with the function kmeans; we have to pass the
data matrix directly ...
Thanks in advance,
Nicolas BOUGET
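As a hedged illustration of the contrast described in this question: kmeans() works on the data matrix and implicitly uses squared Euclidean distance, while pam() from the cluster package accepts a dist object, so any metric supported by dist() can be used. The iris data, the Manhattan metric, and k = 3 are assumptions chosen only for the example.

```r
library(cluster)  # recommended package distributed with R; provides pam()

x <- as.matrix(iris[, 1:4])   # example data (assumption)

# kmeans() takes the data matrix itself; the metric is fixed (Euclidean).
set.seed(1)
km <- kmeans(x, centers = 3, nstart = 10)

# pam() accepts a precomputed dissimilarity, so the metric is your choice.
d  <- dist(x, method = "manhattan")
pm <- pam(d, k = 3)

# Cross-tabulate the two partitions to see how far they agree.
table(kmeans = km$cluster, pam = pm$clustering)
```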
2012 Dec 06
1
clustering of binary data
Good morning,
I am analyzing a dataset composed of 364 subjects and 13 binary variables
(0,1 = absence,presence).
I am testing possible associations (co-presence) among my variables. To do
this, I tried cluster analysis.
My main interest is to check for the significance of the obtained clusters.
First, I tried with the pvclust() function, by using method.hclust="ward"
and
2007 Jun 25
2
manipulate a matrix
I have read everything I can find on how to manipulate a results matrix in R and I have to admit I'm stumped. I have set up a process to extract a dataset from ArcGIS to compute a similarity index (Jaccards) in Vegan. The dataset is fairly simple, but large, and consists of rows = sample area, and columns = elements. I've been able to view the results in R, but I want to get the results
2005 Nov 10
2
error in rowSums:'x' must be numeric
Dear All,
It's Eszter again from Hungary. I could not solve my problem from
yesterday, so I still have to ask for your help.
I have a binary dataset of vegetation samples and species as a comma
separated file. I would like to calculate the Jaccard distance of the
dataset. I have the following error message:
Error in rowSums(x, prod(dn), p, na.rm) : 'x' must be numeric
In addition:
2015 Dec 23
2
Cannot allocate vector of size
First of all, I would like to thank you all for your help.
I have been trying everything you suggested, all at once, and nothing works; I still have the memory problem.
As for the size of the database, it is larger than what I stated. I made a mistake and gave the size of an earlier database I had been working with; the current one has 36866 rows x 6500 columns.
I have followed all
2013 Feb 08
1
vegdist Error in double(N * (N - 1)/2) : vector size specified is too large
---------- Forwarded message ----------
From: <r-help-owner@r-project.org>
Date: 2013/2/8
Subject: vegdist Error in double(N * (N - 1)/2) : vector size
specified is too large
To: caro.bello58@gmail.com
Message rejected by filter rule match
---------- Forwarded message ----------
From: caro bello <caro.bello58@gmail.com>
To: r-help@r-project.org
Cc:
Date: Fri, 8 Feb 2013
2004 May 11
1
AW: Probleme with Kmeans...
Sorry, to answer your question I tried:
data(faithful)
kmeans(faithful[c(1:20),1],10)
Error: empty cluster: try a better set of initial centers
But when I run this a second time, it works.
It seems that kmeans has trouble finding good starting points because the initial centers are chosen at random.
With kmeans(data, centers=c(...)) the problem can be solved.
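A minimal sketch of the two usual ways to reduce the dependence on random starting points, offered as an illustration and not as the poster's exact code: pass explicit initial centers, or let kmeans() try several random starts through its nstart argument and keep the best one. The choice of 4 clusters and of quantiles as starting centers is an assumption made only for this example.

```r
x <- faithful$eruptions[1:20]   # same 20 observations as in the post

# Option 1: explicit, well-spread starting centers (here: quantiles of x).
init <- quantile(x, probs = c(0.2, 0.4, 0.6, 0.8))
fit_fixed <- kmeans(x, centers = matrix(init, ncol = 1))

# Option 2: several random starts; the fit with the lowest total
# within-cluster sum of squares is returned.
set.seed(42)
fit_multi <- kmeans(x, centers = 4, nstart = 25)

fit_fixed$size
fit_multi$tot.withinss
```

With explicit centers the result is reproducible without set.seed(), because no random initialization takes place.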
2012 Feb 27
2
kmeans: how to retrieve clusters
Hello,
I'd like to classify data with kmeans algorithm. In my case, I should get 2
clusters in output. Here is my data
colCandInd colCandMed
1 82 2950.5
2 83 1831.5
3 1192 2899.0
4 1193 2103.5
The first cluster is the two first lines
the 2nd cluster is the two last lines
Here is the code:
x = colCandList$colCandInd
y = colCandList$colCandMed
m = matrix(c(x, y),
2003 Jun 06
1
Kmeans again
Dear helpers
I'm sorry to insist but I still think there is something wrong with the function kmeans. For instance, let's try the same small example:
> dados<-matrix(c(-1,0,2,2.5,7,9,0,3,0,6,1,4),6,2)
I will choose observations 3 and 4 for initial centers and just one iteration. The results are
> A<-kmeans(dados,dados[c(3,4),],1)
> A
$cluster
[1] 1 1 1 1 2 2
$centers
2005 Jun 14
1
KMEANS output...
Using R 2.1.0 on Windows
2 questions:
1. Is there a way to parse the output from kmeans within R?
2. If the answer to 1. is convoluted or impossible, how do you save the
output from kmeans in a plain text file for further processing outside R?
Example:
> ktx<-kmeans(x,12, nstart = 200)
I would like to parse ktx within R to extract cluster sizes, sum-of-squares
values, etc., OR save ktx in
2003 Apr 14
2
kmeans clustering
Hi,
I am using kmeans to cluster a dataset.
I test this example:
> data<-matrix(scan("data100.txt"),100,37,byrow=T)
(my dataset is 100 rows and 37 columns--clustering rows)
> c1<-kmeans(data,3,20)
> c1
$cluster
[1] 1 1 1 1 1 1 1 3 3 3 1 3 1 3 3 1 1 1 1 3 1 3 3 1 1 1 3 3 1 1 3 1 1 1 1 3 3
[38] 3 1 1 1 3 1 1 1 1 3 3 3 1 1 1 1 1 1 3 1 3 1 1 3 1 1 1 1 3 1 1 1 1 1 1 3
2011 Apr 06
2
Help in kmeans
Hi All,
I was using the following command for performing kmeans for Iris dataset.
Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)
This was giving proper results for me. But in my application we generate
the R commands dynamically, and there was a requirement that column names
be sent to the R commands instead of column indices. Hence, to
incorporate this, I tried using the R
2003 Jun 03
1
kmeans
Dear helpers
I was working with kmeans from package mva and found some strange situations. When I run the kmeans algorithm several times with the same dataset, I get the same partition. I simulated a little example with 6 observations and ran kmeans, giving the centers and making just one iteration. I expected that the algorithm would just allocate the observations to the nearest center but think this
2003 Jun 05
1
kmeans (again)
Regarding a previous question concerning the kmeans function I've tried the
same example and I also get a strange result (at least according to what is
said in the help of the function kmeans). Apparently, the function is
disregarding the initial cluster centers one gives it. According to the help
of the function:
centers: Either the number of clusters or a set of initial cluster
2009 Jul 20
2
kmeans.big.matrix
Hi,
I'm playing around with the 'bigmemory' package, and I have finally
managed to create some really big matrices. However, only now I
realize that there may not be functions made for what I want to do
with the matrices...
I would like to perform a cluster analysis based on a big.matrix.
Googling around I have found indications that a certain
kmeans.big.matrix() function should