similar to: buglet in dist() ?

Displaying 20 results from an estimated 3000 matches similar to: "buglet in dist() ?"

2006 Apr 03
2
about arguments in "bclust"
Hi All, Just want to make sure: in the function "bclust", do the following arguments each have only one option? Argument "dist.method" has one option, "Euclidian"; argument "hclust.method" has one option, "average"; argument "base.method" has one option, "kmeans". Thank you!
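A quick way to check what those arguments accept (and their defaults) is to inspect the function itself; a minimal sketch, assuming the e1071 package that provides bclust is installed:

    library(e1071)
    args(bclust)                  # full argument list with the shipped defaults
    formals(bclust)$dist.method   # default value of dist.method in the installed version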
1999 Jan 20
2
dist function suggestion
On my R installation (0.62.4) there is no dist() function, so I attach one possibility. It provides
2004 Sep 12
2
mahalanobis distance
Is there a function that calculates the Mahalanobis distance in R? The dist function calculates "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Thanks, Murli
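R's stats package does provide mahalanobis() for squared Mahalanobis distances to a single centre; pairwise Mahalanobis distances can be obtained by whitening the data first and then calling dist(). A minimal sketch (random data, just for illustration):

    x  <- matrix(rnorm(300), ncol = 3)                         # 100 observations, 3 variables
    d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))   # squared distances to the centroid
    z  <- x %*% solve(chol(cov(x)))                            # whiten: S = R'R, use x R^{-1}
    dz <- dist(z)                                              # pairwise Mahalanobis distances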
2012 Oct 01
6
nlme: spatial autocorrelation on a sphere
I have spatial data on a sphere (the Earth) for which I would like to run a gls model assuming that the errors are autocorrelated, i.e. including a corSpatial correlation in the model specification. In this case the distance metric should be calculated on the sphere, therefore metric = "euclidean" in (for example) corSpher would be incorrect. I would be grateful for help on how to
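Great-circle distances can be computed directly from the latitude/longitude pairs; a minimal haversine sketch (coordinates in degrees, Earth radius about 6371 km). Plugging such distances into gls()/corSpatial still needs a custom correlation structure, which is exactly the open question in the post:

    haversine <- function(lat1, lon1, lat2, lon2, R = 6371) {
      to_rad <- pi / 180
      dlat <- (lat2 - lat1) * to_rad
      dlon <- (lon2 - lon1) * to_rad
      a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
      2 * R * asin(pmin(1, sqrt(a)))   # distance in km
    }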
2011 Apr 18
3
how to extract options for a function call
Hi, I'm having some difficulty formulating this question, but what I want is to extract the options associated with a parameter of a function, e.g. method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN") in the optim function. So I would like to have a vector with c("Nelder-Mead", "BFGS", "CG",
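When the default of an argument is a literal vector of choices, as with optim's method, it can be recovered from the function's formals and evaluated; a minimal sketch:

    meths <- eval(formals(optim)$method)   # the unevaluated default c(...) evaluated to a character vector
    meths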
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it only supports two distance metrics, euclidean and manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or if there was an API for
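Because hclust(), agnes(), pam() and friends accept any precomputed "dist" object, a cosine dissimilarity can be built by hand and wrapped with as.dist() without touching the clustering code; a minimal sketch, assuming dtm is a document-term matrix with one document per row (the name is hypothetical):

    cosine_dissim <- function(m) {
      norms <- sqrt(rowSums(m^2))
      sim   <- (m %*% t(m)) / (norms %o% norms)   # cosine similarity between documents
      as.dist(1 - sim)                            # convert similarity to a dissimilarity
    }
    hc <- hclust(cosine_dissim(dtm), method = "average")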
2001 Dec 13
2
k-means with euclidian distance but no coordinates
Hi, I'm trying to build a thesaurus that will give sensible values for rare words. I suspect the best algorithm to use is k-means, although I'm not sure about that -- I would have preferred a k-dimensional space with a binary cluster in each dimension so a word can belong to 0..k clusters, but I digress... I can measure the strength of correlation between words fairly easily by counting
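If only pairwise (dis)similarities between words are available, one route is to embed them with classical multidimensional scaling and run kmeans() on the coordinates; another is to cluster the dissimilarity matrix directly with pam(). A minimal sketch, assuming d is a "dist" object built from the co-occurrence counts:

    coords <- cmdscale(d, k = 10)    # embed the dissimilarities in 10 dimensions
    km     <- kmeans(coords, centers = 25)

    library(cluster)
    pm <- pam(d, k = 25)             # medoid clustering straight from the dissimilarities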
2013 Dec 12
2
method default for hclust function
I could not figure out what the default was when I ran hclust() without specifying the method. For example, I just have code like: hclust(dist(data)) Any input would be appreciated :)
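For reference, hclust()'s method argument defaults to "complete" (complete linkage), which can be confirmed from the function itself:

    hclust(dist(data))                        # same as the next line
    hclust(dist(data), method = "complete")
    formals(hclust)$method                    # prints the default in the installed R version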
2002 Oct 21
1
dist() {"mva" package} bug: treats +/- Inf as NA
Vince Carey found this (thank you!). Since the fix to the problem is not entirely obvious, I post this to R-devel as an RFC. help(dist) says:
>> Missing values are allowed, and are excluded from all computations
>> involving the rows within which they occur. If some columns are
>> excluded in calculating a Euclidean, Manhattan or Canberra
>> distance, the sum is
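A small reproduction of the behaviour under discussion (whether an infinite coordinate should propagate to the distance or be dropped the way an NA is):

    dist(rbind(c(0, 0), c(0, Inf)))   # should this be Inf, or treated like the NA case below?
    dist(rbind(c(0, 0), c(0, NA)))    # the NA column is excluded and the sum rescaled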
2016 Jul 27
2
K MEANS clustering
Hey Parth, Thanks for the reply. I am considering implementing a cosine distance metric too, along with euclidean distance, because of the dimensionality issue that comes in with K-Means and the euclidean distance metric. That does help when we deal with sparse vectors for documents. The particular problem I'm having is representing centroids in an efficient way. For example, when we find the mean
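One common workaround, sketched here in R although the project itself is C++, is spherical-k-means style normalisation: if every tf-idf vector is scaled to unit length, minimising squared Euclidean distance to a centroid is monotonically related to maximising cosine similarity, so the ordinary k-means machinery (and its mean-based centroids) can be reused:

    # a minimal sketch, assuming tfidf is a dense document-term matrix (hypothetical name)
    unit <- tfidf / sqrt(rowSums(tfidf^2))              # L2-normalise each document vector
    km   <- kmeans(unit, centers = 10)
    cent <- km$centers / sqrt(rowSums(km$centers^2))    # re-normalise centroids for cosine comparisons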
2016 Jul 26
3
K MEANS clustering
Hello, I've been working on the KMeans clustering algorithm recently and for the past week I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as tf-idf vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users, When I run the following code, R crashes:
    require(cclust)
    x <- matrix(c(0, 0, 0, 1.5, 1, -1), ncol = 2, byrow = TRUE)
    cclust(x, centers = x[2:3, ], dist = "manhattan", method = "kmeans")
While this works:
    cclust(x, centers = x[2:3, ], dist = "euclidean", method = "kmeans")
I'm posting this here because I am not sure if it is a bug. I've been searching
2007 Apr 01
4
Abundance data ordination in R
An embedded text with no specified character set was scrubbed... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20070401/33921c2a/attachment.pl
1999 Jan 20
0
dist(*, "euclidean") [was "dist function suggestion"]
> BDR> You will need to call it something else: dist is a clone of an S
> BDR> function, and dist(X, "manhattan") is well-established usage.
>
> one could still imagine an extra Y argument such that
>     dist(X, Y=myY, method="euclidean")
> and dist(X, "euclidean", Y=myY)
> would work
> one could even make it such that
> both
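The Y argument being discussed would compute distances between the rows of X and the rows of Y rather than within a single matrix; a minimal sketch of such a cross-distance for the Euclidean case (this is not part of base dist()):

    cross_dist <- function(X, Y) {
      d2 <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * X %*% t(Y)
      sqrt(pmax(d2, 0))   # clamp tiny negative values caused by floating-point error
    }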
2003 Sep 14
1
title for plot contain 4 subplots
Hi, I'm plotting 4 graphs on one page (a 2x2 matrix) but I can't seem to get the title for the whole page right. I'm doing:
    op <- par(mfrow = c(2,2), pty="s")
    hist(var$V2, breaks="FD", main="Euclidean Metric", xlab="Sum of 3NN ...
    hist(var$V2, breaks="FD", main="Manhattan Metric", xlab="Sum of 3NN ...
    hist(var$V2,
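The usual fix is to reserve space in the outer margin and write the page title with mtext(..., outer = TRUE); a minimal sketch (the data and the last two panel titles are placeholders):

    op <- par(mfrow = c(2, 2), pty = "s", oma = c(0, 0, 3, 0))   # 3 outer margin lines at the top
    for (ttl in c("Euclidean Metric", "Manhattan Metric", "Panel 3", "Panel 4"))
      hist(rnorm(100), breaks = "FD", main = ttl, xlab = "Sum of 3NN distances")
    mtext("Overall page title", outer = TRUE, cex = 1.3)
    par(op)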
2010 Jul 20
1
p-values pvclust maximum distance measure
Hi, I am new to clustering and was wondering why pvclust with "maximum" as the distance measure nearly always results in p-values above 95%. I wrote an example programme which demonstrates this effect and uploaded a PDF showing the results. Here is the code which produces the PDF file: s <-
2007 Nov 28
2
Clustering
Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is: if I choose, for example, method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity
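agnes() will take a precomputed dissimilarity object directly, and with method = "complete" each merge uses the largest of the supplied pairwise dissimilarities between the two clusters; nothing is recomputed from coordinates. A minimal sketch, assuming D is the user's own dissimilarity matrix:

    library(cluster)
    ag <- agnes(as.dist(D), diss = TRUE, method = "complete")
    plot(ag)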
2009 Mar 29
1
[cluster package question] What is the "sum of the dissimilarities" in the pam command ?
Hello Martin Maechler and All, A simple question (I hope): how can I compute the "sum of the dissimilarities" that appears in the pam command (from the cluster package)? Is it the "manhattan" distance (such as the one implemented by "dist")? I am asking since I am running clustering on a dataset. I found 7 medoids with the pam command, and from it I have the
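The dissimilarity pam() uses is whatever is passed to it (or whatever its metric argument builds), not necessarily Manhattan, and the total distance to the medoids can be recomputed from the fit; a minimal sketch, assuming d is the "dist" object the clustering was run on:

    library(cluster)
    fit <- pam(d, k = 7)
    dm  <- as.matrix(d)
    # distance from each observation to the medoid of its own cluster, summed
    sum(dm[cbind(seq_len(nrow(dm)), fit$id.med[fit$clustering])])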
2016 May 05
2
GSoC 2016 - Introduction
Hello, Thanks James for the reply. That cleared a few things up. Apologies for replying late because of exams going on. I was going through the previous clustering API to understand how it worked, and it seems like the approach for construction of the termlists used for the distance metrics is TF-IDF weighting with cosine similarity, which is very similar to the approach I would need
2012 Nov 25
5
bbmle "Warning: optimization did not converge"
I am using Ben Bolker's R package "bbmle" to estimate the parameters of a binomial mixture distribution via the Maximum Likelihood Method. For some data sets, I get the following warning messages: Warning: optimization did not converge (code 1: ) There were 50 or more warnings (use warnings() to see the first 50) Also, warnings() gives the following: In 0:(n - x) : numerical
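Convergence code 1 from optim() means the iteration limit was reached, so a common first step is to raise maxit or try a different optimiser through mle2's method and control arguments; a sketch only, assuming nll is the negative log-likelihood function and start_vals the list of starting values (both names are hypothetical):

    library(bbmle)
    fit <- mle2(nll, start = start_vals,
                method  = "Nelder-Mead",
                control = list(maxit = 5000))   # raise the iteration limit above optim's default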