similar to: distance metrics

Displaying 20 results from an estimated 3000 matches similar to: "distance metrics"

2008 Jun 13
3
cluster.stats
Dear list, I just tried to use the function cluster.stats in the package fpc. I have a couple of questions about the syntax: cluster.stats(d, clustering, alt.clustering=NULL, silhouette=TRUE, G2=FALSE, G3=FALSE) 1) Is the distance object (d) an object obtained by applying the function dist() to my own original matrix? 2) Is clustering the cluster vector resulting from one of the many clustering methods?
2005 Sep 29
5
Regression slope confidence interval
Hi list, is there any direct way to obtain confidence intervals for the regression slope from lm, predict.lm or the like? (If not, is there any reason? This is also missing in some other statistics software, and I thought this would be quite a standard application.) I know that it's easy to implement, but it's for explanation to people who faint if they have to do their own programming...
2005 Aug 08
2
selecting outliers
Hi everybody, I'd like to know if there's an easy way to extract outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
2010 Apr 24
4
DICE Coefficient of similarity measure
Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that
2011 Jun 09
1
k-nn hierarchical clustering
Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour clustering)? Best regards,
2006 Aug 09
2
R CMD check error
Dear list, R CMD check on my updated package now generated the following error: "LaTeX errors when creating DVI version. This typically indicates Rd problems." But the Rd files (and everything else) were checked as "OK" (I removed the problem about which I asked the list some hours ago, but answers are still appreciated because I rather created a rough workaround than
2010 Sep 01
2
Rd-file error: non-ASCII input and no declared encoding
Dear list, I came across the following error for three of my newly written Rd-files: non-ASCII input and no declared encoding I can't make sense of this. Below I copied in one of the three files. Can anybody please tell me what's wrong with it? Thank you, Christian \name{tetragonula} \alias{tetragonula} \alias{tetragonula.coord} \docType{data} % \non_function{} \title{Microsatellite
2005 Aug 08
2
computationally singular
Hi, I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix > S<-cov(mymatrix) # this is fine > D<-sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S)) Error in solve.default(cov, ...) : system is computationally
2010 Sep 07
1
own distance
Is it possible to implement my own distance and mean for k-means clustering for any clustering package in R? Just looking for simple way, without creating a new package. karsar
2010 Oct 10
1
Package "prabclus" not available?
Hi there, I just tried to install the package prabclus on a computer running Ubuntu Linux 9.04 using install.packages from within R. This gave me a message: Warning message: In install.packages("prabclus") : package 'prabclus' is not available I tried to do this selecting two different CRAN mirrors (same result) and with other packages (installing them works fine). Looking up the
2006 Aug 07
5
kmeans and incomplete distance matrix concern
Hi there I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created i.e: [ mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2), dimnames = list(levels(DF$V1), levels(DF$V2))) mat[cbind(DF$V1, DF$V2)] <- DF$V3 This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering. My query
2006 Aug 18
2
R-update - what about packages and ESS?
Hi there, it seems that if I update R, it doesn't find previously installed packages anymore and is also not found by ESS. Actually the update has been done by our system administrator who assumed that there would be no problems with these things (I don't have root access to this system) and will perhaps not be too keen on installing everything else again. Is there any simple way how
2010 Feb 11
1
cluster/distance large matrix
Hi all, I've stumbled upon some memory limitations for the analysis that I want to run. I've a matrix of distances between 38000 objects. These distances were calculated outside of R. I want to cluster these objects. For smaller sets (e.g. n=100) this is how I proceed: A<-matrix(scan(file, n=100*100),100,100, byrow=TRUE) ad<-as.dist(A)
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of
2012 Aug 21
1
R CMD build error with data files
Dear list, I want to update my prabclus package which I haven't done for quite a while. In the previous version, I had .dat files in my data subdirectory, which I read using .R files. Now R CMD check gives me a warning that .dat files are no longer accepted there. So I changed my filenames to .txt, but actually some of these files are only there in order to be read by .R, not in order
2005 Jul 25
1
cluster
Dear listers: Here I have a question on clustering methods available in R. I am trying to down-sample the majority class in a classification problem on an imbalanced dataset. Since I don't want to lose information in the original dataset, I don't want to use naive down-sampling: I think using clustering on the majority class' side to select "representative" samples might
2009 Jul 06
2
Hartigan's Dip test
Hi, I just got a value for the dip test out of my data of 0.074 for a sample size of 33. I'm trying to work out what this actually means though? Could someone help me relate this to a p-value? Thanks James
2006 Apr 30
1
Number of Clusters
Dear R users, I am interested in clustering in R. In SAS we have some criteria for determining the number of clusters using the PROC CLUSTER procedure, which are "CCC" cubic clustering criterion (Sarl 1981), Pseudo F (PSF), and Pseudo T square (PST). My question is: do these criteria exist in R? I tried to search and got one hit (BIC) in Mclust, which I am aware of, any input is
2006 Nov 16
1
silhouette plot colors from trimkmeans solution
I was trying to create a multi-color silhouette plot (each cluster a different color) from clusters created by trimkmeans. This works straightforwardly on an object created from pam; however, my colors are interwoven when I try the same approach on clusters from trimkmeans. I also tried sorting the silhouette object using sortSilhouette, which did not solve the problem. If anyone has a suggestion,
2005 Aug 17
1
Fitting mixture model
I would like to fit a Gaussian mixture model to a vector with about 50,000 points. I have tried using Mclust to do so, but 50,000 points requires more memory than I have (and I am running with 4Gb). Any other suggestions for how to do so? Oh, I don't know the number of components, but the number will likely be less than 5 or 6. Thanks, Sean