similar to: buglet in dist() ?

Displaying 20 results from an estimated 3000 matches similar to: "buglet in dist() ?"

2006 Apr 03
2
about arguments in "bclust"
Hi All, Just want to make sure: in the function "bclust", do the following arguments each have only one option? Argument "dist.method" has one option, "Euclidian"; argument "hclust.method" has one option, "average"; argument "base.method" has one option, "kmeans". Thank you!
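A quick way to check what those arguments accept (and their defaults) is to inspect the function itself; a minimal sketch, assuming the e1071 package that provides bclust is installed:

    library(e1071)
    args(bclust)                  # full argument list with the shipped defaults
    formals(bclust)$dist.method   # default value of dist.method in the installed version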
1999 Jan 20
2
dist function suggestion
On my R installation (0.62.4) there is no dist() function, so I attach one possibility. It provides
2004 Sep 12
2
mahalanobis distance
Is there a function that calculates the Mahalanobis distance in R? The dist function calculates "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Thanks, Murli
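R's stats package does provide mahalanobis() for squared Mahalanobis distances to a single centre; pairwise Mahalanobis distances can be obtained by whitening the data first and then calling dist(). A minimal sketch (random data, just for illustration):

    x  <- matrix(rnorm(300), ncol = 3)                         # 100 observations, 3 variables
    d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))   # squared distances to the centroid
    z  <- x %*% solve(chol(cov(x)))                            # whiten: S = R'R, use x R^{-1}
    dz <- dist(z)                                              # pairwise Mahalanobis distances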
2012 Oct 01
6
nlme: spatial autocorrelation on a sphere
I have spatial data on a sphere (the Earth) for which I would like to run a gls model assuming that the errors are autocorrelated, i.e. including a corSpatial correlation in the model specification. In this case the distance metric should be calculated on the sphere, therefore metric = "euclidean" in (for example) corSpher would be incorrect. I would be grateful for help on how to
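Great-circle distances can be computed directly from the latitude/longitude pairs; a minimal haversine sketch (coordinates in degrees, Earth radius about 6371 km). Plugging such distances into gls()/corSpatial still needs a custom correlation structure, which is exactly the open question in the post:

    haversine <- function(lat1, lon1, lat2, lon2, R = 6371) {
      to_rad <- pi / 180
      dlat <- (lat2 - lat1) * to_rad
      dlon <- (lon2 - lon1) * to_rad
      a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
      2 * R * asin(pmin(1, sqrt(a)))   # distance in km
    }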
2011 Apr 18
3
how to extract options for a function call
Hi, I'm having some difficulty formulating this question, but what I want is to extract the options associated with a parameter of a function, e.g. method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN") in the optim function. So I would like to have a vector with c("Nelder-Mead", "BFGS", "CG",
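When the default of an argument is a literal vector of choices, as with optim's method, it can be recovered from the function's formals and evaluated; a minimal sketch:

    meths <- eval(formals(optim)$method)   # the unevaluated default c(...) evaluated to a character vector
    meths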
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it only supports two distance metrics, euclidean and manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or if there was an API for
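Because hclust(), agnes(), pam() and friends accept any precomputed "dist" object, a cosine dissimilarity can be built by hand and wrapped with as.dist() without touching the clustering code; a minimal sketch, assuming dtm is a document-term matrix with one document per row (the name is hypothetical):

    cosine_dissim <- function(m) {
      norms <- sqrt(rowSums(m^2))
      sim   <- (m %*% t(m)) / (norms %o% norms)   # cosine similarity between documents
      as.dist(1 - sim)                            # convert similarity to a dissimilarity
    }
    hc <- hclust(cosine_dissim(dtm), method = "average")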
2001 Dec 13
2
k-means with euclidian distance but no coordinates
Hi, I'm trying to build a thesaurus that will give sensible values for rare words. I suspect the best algorithm to use is k-means, although I'm not sure about that -- I would have preferred a k-dimensional space with a binary cluster in each dimension so a word can belong to 0..k clusters, but I digress... I can measure the strength of correlation between words fairly easily by counting
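If only pairwise (dis)similarities between words are available, one route is to embed them with classical multidimensional scaling and run kmeans() on the coordinates; another is to cluster the dissimilarity matrix directly with pam(). A minimal sketch, assuming d is a "dist" object built from the co-occurrence counts:

    coords <- cmdscale(d, k = 10)    # embed the dissimilarities in 10 dimensions
    km     <- kmeans(coords, centers = 25)

    library(cluster)
    pm <- pam(d, k = 25)             # medoid clustering straight from the dissimilarities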
2013 Dec 12
2
method default for hclust function
I could not figure out what the default was when I ran hclust() without specifying the method. For example, I just have code like: hclust(dist(data)) Any input would be appreciated :)
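For reference, hclust()'s method argument defaults to "complete" (complete linkage), which can be confirmed from the function itself:

    hclust(dist(data))                        # same as the next line
    hclust(dist(data), method = "complete")
    formals(hclust)$method                    # prints the default in the installed R version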
2002 Oct 21
1
dist() {"mva" package} bug: treats +/- Inf as NA
Vince Carey found this (thank you!). Since the fix to the problem is not entirely obvious, I post this to R-devel as an RFC. help(dist) says:
>> Missing values are allowed, and are excluded from all computations
>> involving the rows within which they occur. If some columns are
>> excluded in calculating a Euclidean, Manhattan or Canberra
>> distance, the sum is
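A small reproduction of the behaviour under discussion (whether an infinite coordinate should propagate to the distance or be dropped the way an NA is):

    dist(rbind(c(0, 0), c(0, Inf)))   # should this be Inf, or treated like the NA case below?
    dist(rbind(c(0, 0), c(0, NA)))    # the NA column is excluded and the sum rescaled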
2016 Jul 27
2
K MEANS clustering
Hey Parth, Thanks for the reply. I am considering implementing a cosine distance metric too, along with euclidean distance, because of the dimensionality issue that comes in with K-Means and the euclidean distance metric. That does help when we deal with sparse vectors for documents. The particular problem I'm having is representing centroids in an efficient way. For example, when we find the mean
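One common workaround, sketched here in R although the project itself is C++, is spherical-k-means style normalisation: if every tf-idf vector is scaled to unit length, minimising squared Euclidean distance to a centroid is monotonically related to maximising cosine similarity, so the ordinary k-means machinery (and its mean-based centroids) can be reused:

    # a minimal sketch, assuming tfidf is a dense document-term matrix (hypothetical name)
    unit <- tfidf / sqrt(rowSums(tfidf^2))              # L2-normalise each document vector
    km   <- kmeans(unit, centers = 10)
    cent <- km$centers / sqrt(rowSums(km$centers^2))    # re-normalise centroids for cosine comparisons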
2016 Jul 26
3
K MEANS clustering
Hello, I've been working on the KMeans clustering algorithm recently and for the past week I have been stuck on a problem which I'm not able to find a solution to. Since we are representing documents as tf-idf vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users, When I run the following code, R crashes:
    require(cclust)
    x <- matrix(c(0, 0, 0, 1.5, 1, -1), ncol = 2, byrow = TRUE)
    cclust(x, centers = x[2:3, ], dist = "manhattan", method = "kmeans")
While this works:
    cclust(x, centers = x[2:3, ], dist = "euclidean", method = "kmeans")
I'm posting this here because I am not sure if it is a bug. I've been searching
2007 Apr 01
4
Abundance data ordination in R
An embedded text with no specified character set was scrubbed... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20070401/33921c2a/attachment.pl
1999 Jan 20
0
dist(*, "euclidean") [was "dist function suggestion"]
> BDR> You will need to call it something else: dist is a clone of an S
> BDR> function, and dist(X, "manhattan") is well-established usage.
>
> one could still imagine an extra Y argument such that
>     dist(X, Y=myY, method="euclidean")
> and dist(X, "euclidean", Y=myY)
> would work
> one could even make it such that
> both
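The Y argument being discussed would compute distances between the rows of X and the rows of Y rather than within a single matrix; a minimal sketch of such a cross-distance for the Euclidean case (this is not part of base dist()):

    cross_dist <- function(X, Y) {
      d2 <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * X %*% t(Y)
      sqrt(pmax(d2, 0))   # clamp tiny negative values caused by floating-point error
    }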
2003 Sep 14
1
title for plot contain 4 subplots
Hi, I'm plotting 4 graphs on one page (a 2x2 matrix) but I can't seem to get the title for the whole page right. I'm doing:
    op <- par(mfrow = c(2,2), pty="s")
    hist(var$V2, breaks="FD", main="Euclidean Metric", xlab="Sum of 3NN ...
    hist(var$V2, breaks="FD", main="Manhattan Metric", xlab="Sum of 3NN ...
    hist(var$V2,
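The usual fix is to reserve space in the outer margin and write the page title with mtext(..., outer = TRUE); a minimal sketch (the data and the last two panel titles are placeholders):

    op <- par(mfrow = c(2, 2), pty = "s", oma = c(0, 0, 3, 0))   # 3 outer margin lines at the top
    for (ttl in c("Euclidean Metric", "Manhattan Metric", "Panel 3", "Panel 4"))
      hist(rnorm(100), breaks = "FD", main = ttl, xlab = "Sum of 3NN distances")
    mtext("Overall page title", outer = TRUE, cex = 1.3)
    par(op)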
2010 Jul 20
1
p-values pvclust maximum distance measure
Hi, I am new to clustering and was wondering why pvclust with "maximum" as the distance measure nearly always results in p-values above 95%. I wrote an example programme which demonstrates this effect and uploaded a PDF showing the results. Here is the code which produces the PDF file: s <-
2007 Nov 28
2
Clustering
Hello all! I am performing some clustering analysis on microarray data using agnes{cluster} and I have created my own dissimilarity matrix according to a distance measure different from "euclidean" or "manhattan" etc. My question is: if I choose, for example, method="complete", how are the distances between the elements calculated? Are they taken from the dissimilarity
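agnes() will take a precomputed dissimilarity object directly, and with method = "complete" each merge uses the largest of the supplied pairwise dissimilarities between the two clusters; nothing is recomputed from coordinates. A minimal sketch, assuming D is the user's own dissimilarity matrix:

    library(cluster)
    ag <- agnes(as.dist(D), diss = TRUE, method = "complete")
    plot(ag)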
2009 Mar 29
1
[cluster package question] What is the "sum of the dissimilarities" in the pam command ?
Hello Martin Maechler and All, A simple question (I hope): how can I compute the "sum of the dissimilarities" that appears in the pam command (from the cluster package)? Is it the "manhattan" distance (such as the one implemented by "dist")? I am asking since I am running clustering on a dataset. I found 7 medoids with the pam command, and from it I have the
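The dissimilarity pam() uses is whatever is passed to it (or whatever its metric argument builds), not necessarily Manhattan, and the total distance to the medoids can be recomputed from the fit; a minimal sketch, assuming d is the "dist" object the clustering was run on:

    library(cluster)
    fit <- pam(d, k = 7)
    dm  <- as.matrix(d)
    # distance from each observation to the medoid of its own cluster, summed
    sum(dm[cbind(seq_len(nrow(dm)), fit$id.med[fit$clustering])])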
2016 May 05
2
GSoC 2016 - Introduction
Hello, Thanks James for the reply. That cleared a few things up. Apologies for replying late because of exams going on. I was going through the previous clustering API to understand how it worked, and it seems like the approach for construction of the termlists used for the distance metrics is TF-IDF weighting with cosine similarity, which is very similar to the approach I would need
2012 Nov 25
5
bbmle "Warning: optimization did not converge"
I am using Ben Bolker's R package "bbmle" to estimate the parameters of a binomial mixture distribution via the Maximum Likelihood Method. For some data sets, I get the following warning messages: Warning: optimization did not converge (code 1: ) There were 50 or more warnings (use warnings() to see the first 50) Also, warnings() gives the following: In 0:(n - x) : numerical
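Convergence code 1 from optim() means the iteration limit was reached, so a common first step is to raise maxit or try a different optimiser through mle2's method and control arguments; a sketch only, assuming nll is the negative log-likelihood function and start_vals the list of starting values (both names are hypothetical):

    library(bbmle)
    fit <- mle2(nll, start = start_vals,
                method  = "Nelder-Mead",
                control = list(maxit = 5000))   # raise the iteration limit above optim's default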