thr3ads.net - similar to: "own distance"

Displaying 20 results from an estimated 2000 matches similar to: "own distance"

2010 Sep 06

replacing functions

Dear All, Is it possible to replace a function with my own? I want to apply PCA clustering, but with an unusual correlation function. I'm asking about replacing, say, mean() with new content for mean(), while still using other standard functions that might call mean() internally. karsar

R and DBSCAN

2011 Jun 03

Hello everyone, When looking for information about clustering of spatial data in R I was directed towards DBSCAN. I've read some docs about it and then new questions have arisen. DBSCAN requires some parameters, one of them is "distance". As my data are three dimensional (longitude, latitude and temperature), which "distance" should I use? which dimension is related to

cluster.stats

2008 Jun 13

Dear list, I just tried to use the function cluster.stats in the package fpc. I just have a couple of questions about the syntax: cluster.stats(d,clustering,alt.clustering=NULL, silhouette=TRUE,G2=FALSE,G3=FALSE) 1) Is the distance object (d) an object obtained by applying the function dist() to my own original matrix? 2) Is clustering the vector of cluster memberships returned by one of the many clustering methods?

Cluster analysis, defining center seeds or number of clusters

2009 Jun 11

I use kmeans to classify spectral events in high and low 1/3 octave bands:
#Do cluster analysis
CyclA<-data.frame(LlowA,LhghA)
CntrA<-matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA<-kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")
This works well when the actual data shows 1, 2 or 3 groups that are not "too close" in a cross plot.

Regression slope confidence interval

2005 Sep 29

Hi list, is there any direct way to obtain confidence intervals for the regression slope from lm, predict.lm or the like? (If not, is there any reason? This is also missing in some other statistics software, and I thought this would be quite a standard application.) I know that it's easy to implement, but it's for explaining to people who faint if they have to do their own programming...
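
For reference, base R's confint() returns confidence intervals for the coefficients of a fitted lm object. The sketch below uses simulated data; the names x, y and fit are illustrative assumptions and do not come from the original post.

```r
# Simulated data (illustrative only; not from the original post)
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20)

fit <- lm(y ~ x)

# 95% confidence interval for the slope (the coefficient named "x")
ci <- confint(fit, parm = "x", level = 0.95)
print(ci)

# The same interval computed by hand: estimate +/- t quantile * standard error
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
manual <- est + c(-1, 1) * qt(0.975, df = df.residual(fit)) * se
print(manual)
```

The manual calculation is included only to show what confint() reports for an lm fit; in practice the single confint() call is enough.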

Cluster Analysis - Number of Clusters

2006 Feb 05

Hello, I'm playing around with cluster analysis, and am looking for methods to select the number of clusters. I am aware of methods based on a 'pseudo F' or a 'pseudo T^2'. Are there packages in R that will generate these statistics, and/or other statistics to aid in cluster number selection? Thanks, John.
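
As a rough illustration, the pseudo F (Calinski-Harabasz) statistic can be computed directly from the sums of squares that base R's kmeans() returns. The helper name pseudo_f and the simulated data are assumptions made for this sketch, not an existing package function.

```r
# Simulated two-group data (illustrative only)
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# Hypothetical helper: pseudo F = (between SS / (k - 1)) / (within SS / (n - k))
pseudo_f <- function(x, k) {
  km <- kmeans(x, centers = k, nstart = 10)
  n <- nrow(x)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
}

# Evaluate a range of candidate cluster counts
ks <- 2:6
scores <- sapply(ks, function(k) pseudo_f(x, k))
names(scores) <- ks
print(scores)
```

Larger values suggest better separated clusters, so one common heuristic is to prefer the k with the highest score.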

k-nn hierarchical clustering

2011 Jun 09

Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour clustering)? Best regards,

Clustering Large Applications..sort of

2011 Aug 10

Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data; however, the clustering functions do not seem able to handle a practical problem of this size. The function 'hclust' is good (though it may be subpar for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of

distance metrics

2007 Mar 12

Hello: Does anyone know if there exists a package that provides a [ method for dist objects? I would like to access a dist object using matrix notation, e.g. dMat = dist(x); dMat[i,j]. Thanks in advance to anyone who can point me in the right direction.
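
Two base R approaches are sketched below: converting the dist object with as.matrix(), and indexing the underlying vector with the formula given in ?dist. The helper name dist_at and the toy data are assumptions for illustration.

```r
# Three points in the plane (illustrative data)
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
d <- dist(x)

# Option 1: expand to a full symmetric matrix (costs O(n^2) memory)
m <- as.matrix(d)
print(m[1, 2])

# Option 2: hypothetical helper that indexes the dist vector directly,
# using the position formula from ?dist (valid for i < j)
dist_at <- function(d, i, j) {
  if (i == j) return(0)
  if (i > j) { tmp <- i; i <- j; j <- tmp }
  n <- attr(d, "Size")
  d[n * (i - 1) - i * (i - 1) / 2 + j - i]
}
print(dist_at(d, 1, 3))
print(dist_at(d, 3, 2))
```

The second option avoids building the full matrix, which can matter when the number of objects is large.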

bug (?!) in "pam()" clustering from fpc package ?

2008 Dec 17

Hello all. I wish to run k-means with "manhattan" distance. Since this is not supported by the function "kmeans", I turned to the "pam" function in the "fpc" package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps starting the algorithm from the same starting points (medoids). For my

selecting outliers

2005 Aug 08

Hi everybody, I'd like to know if there's an easy way of extracting outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
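
One possible approach, assuming "outlier" means a value beyond 1.5 times the interquartile range from the hinges (the boxplot rule), is base R's boxplot.stats(). The data frame df and its columns are made up for this sketch; other definitions of an outlier would need a different test.

```r
# Made-up dataset with one extreme value
df <- data.frame(id = 1:10,
                 value = c(10, 12, 11, 13, 12, 11, 10, 12, 95, 11))

# Values flagged by the boxplot rule (default coef = 1.5)
out_vals <- boxplot.stats(df$value)$out

# Keep the full records whose value was flagged, for further analysis
outliers <- df[df$value %in% out_vals, ]
print(outliers)
```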

DICE Coefficient of similarity measure

2010 Apr 24

Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that

cluster/distance large matrix

2010 Feb 11

Hi all, I've stumbled upon some memory limitations for the analysis that I want to run. I have a matrix of distances between 38000 objects. These distances were calculated outside of R. I want to cluster these objects. For smaller sets (e.g. n=100) this is how I proceed:
A<-matrix(scan(file, n=100*100),100,100, byrow=TRUE)
ad<-as.dist(A)

kmeans and incomplete distance matrix concern

2006 Aug 07

Hi there, I have been using R to perform kmeans on a dataset. The data is fed in using read.table and then a matrix (x) is created, i.e.:
mat <- matrix(0, nlevels(DF$V1), nlevels(DF$V2), dimnames = list(levels(DF$V1), levels(DF$V2)))
mat[cbind(DF$V1, DF$V2)] <- DF$V3
This matrix is then taken and a distance matrix (y) created using dist() before performing the kmeans clustering. My query

Number of Clusters

2006 Apr 30

Dear R users, I am interested in clustering in R. In SAS we have some criteria for determining the number of clusters using the PROC CLUSTER procedure, which are the cubic clustering criterion "CCC" (Sarle 1983), Pseudo F (PSF), and Pseudo T square (PST). My question is whether these criteria exist in R. I tried to search and got one hit (BIC) in Mclust, which I am aware of, any input is

R CMD check error

2006 Aug 09

Dear list, R CMD check on my updated package now generated the following error: "LaTeX errors when creating DVI version. This typically indicates Rd problems." But the Rd files (and everything else) were checked as "OK" (I removed the problem about which I asked the list some hours ago, but answers are still appreciated because I rather created a rough workaround than

Latent class multinomial (or conditional) logit using R?

2011 Dec 23

Hi everyone, Does anybody know how I can estimate a latent class multinomial (or conditional) logit using R? I have tried flexmix and poLCA, and they do not seem to support this model. Thanks in advance, adan

computationally singular

2005 Aug 08

Hi, I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix
> S<-cov(mymatrix) # this is fine
> D<-sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
Error in solve.default(cov, ...) : system is computationally

Cluster analysis, factor variables, large data set

2011 Mar 31

Dear R helpers, I have a large data set with 36 variables and about 50,000 cases. The variables represent labour market status during 36 months; there are 8 different variable values (e.g. Full-time Employment, Student, ...). Only cases with at least one change in labour market status are included in the data set. To analyse subsets of the data, I have used daisy in the cluster package to create

cluster size

2009 Dec 11

hi r-help, i am doing kmeans clustering in stats. i tried for five clusters clustering using: kcl1 <- kmeans(as1[,c("contlife","somlife","agglife","sexlife", "rellife","hordlife","doutlife","symtlife","washlife",
