similar to: maxitems in cluster validity

Displaying 20 results from an estimated 3000 matches similar to: "maxitems in cluster validity"

2010 Oct 22
2
(no subject)
I am doing cluster analysis on 8768 respondents on 5 lifestyle variables and am having difficulty constructing a dissimilarity matrix which I will use for PAM.  I always get an error:  “cannot allocate  vector of size 293.3 Mb” even if I have already increased my memory to its limit of 4000.  I did it on 2GB , 32-bit OS .  I tried ff and filehash and I still get the same error.  Can you please
2008 Jun 13
3
cluster.stats
Dear list, I just tried to use the function cluster.stat in the package fpc. I just have a couple of questions about the syntax: cluster.stats(d,clustering,alt.clustering=NULL, silhouette=TRUE,G2=FALSE,G3=FALSE) 1) the distance object (d) is an object obtained by the function dist() on my own original matrix? 2) clustering is the clusters vector as result of one of the many clustering methods?
2010 Oct 29
1
transposing a column table
Dear R-user, I need help on how to transpose this column of clustering vector in R with 8768 entries derived from a PAM clustering output in a vertical  view to an excel file   Clustering vector:    [1] 1 1 2 2 1 2 1 2 1 1 2 2 1 2 2 2 2 1 1 1 1 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 2   [38] 2 1 1 2 2 2 2 2 1 2 1 2 2 2 2 1 2 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2   [75] 2 1 2 2 2 2 2 2 2 1 1 2 1 2 2 2 2 2
2005 Sep 29
5
Regression slope confidence interval
Hi list, is there any direct way to obtain confidence intervals for the regression slope from lm, predict.lm or the like? (If not, is there any reason? This is also missing in some other statistics softwares, and I thought this would be quite a standard application.) I know that it's easy to implement but it's for explanation to people who faint if they have to do their own programming...
2010 Apr 24
4
DICE Coefficient of similarity measure
Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that
2005 Aug 08
2
selecting outliers
Hi everybody, I'd like to know if there's an easy way for extracting outliers record from a dataset, in order to perform further analysis on them. Thanks Alessandro
2011 Jun 09
1
k-nn hierarchical clustering
Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour clustering)? Best regards,
2006 Aug 09
2
R CMD check error
Dear list, R CMD check on my updated package now generated the following error: "LaTeX errors when creating DVI version. This typically indicates Rd problems." But the Rd files (and everything else) were checked as "OK" (I removed the problem about which I asked the list some hours ago, but answers are still appreciated because I rather created a rough workaround than
2011 Feb 28
1
mixture models/latent class regression comparison
Dear list, I have been comparing the outputs of two packages for latent class regression, namely 'flexmix', and 'mmlcr'. What I have noticed is that the flexmix package appears to come up with a much better fit than the mmlcr package (based on logLik, AIC, BIC, and visual inspection). Has anyone else observed such behaviour? Has anyone else been successful in using the mmlcr
2005 Aug 08
2
computationally singular
Hi, I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix > S<-cov(mymatrix) # this is fine > D<-sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S)) Error in solve.default(cov, ...) : system is computationally
2010 Sep 01
2
Rd-file error: non-ASCII input and no declared encoding
Dear list, I came across the following error for three of my newly written Rd-files: non-ASCII input and no declared encoding I can't make sense of this. Below I copied in one of the three files. Can anybody please tell me what's wrong with it? Thank you, Christian \name{tetragonula} \alias{tetragonula} \alias{tetragonula.coord} \docType{data} % \non_function{} \title{Microsatellite
2006 Nov 01
1
cluster analysis using Dmax
Dear All, a long time ago I ran a cluster analysis where the dissimilarity matrix used consisted of Dmax (or Kolmogorov-Smirnov distance) values. In other words the maximum difference between two cumulative proportion curves. This all worked very well indeed. The matrix was calculated using Dbase III+ and took a day and a half and the clustering was done using MV-ARCH, with the resultant
2009 May 18
1
Save Cluster results to data frame
If I cluster my data into 3 sets, using pam for instance, is there a way to save the resultant cluster results, to the originating data frame. and related to that how do i say change the cluster names to something a bit more meaningful that 1..2...3 So it goes like this. Data ---> Cluster into 3 groups ----> given them meaningful names
2010 Oct 10
1
Package "prabclus" not available?
Hi there, I just tried to install the package prabclus on a computer running Ubuntu Linux 9.04 using install.packages from within R. This gave me a message: Warning message: In install.packages("prabclus") : package ?prabclus? is not available I tried to do this selecting two different CRAN mirrors (same result) and with other packages (installing them works fine). Looking up the
2010 Jul 02
2
K-means result - variance between cluster
Hi, I like to present the results from the clustering method k-means in terms of variances: within and between Cluster. The k-means object gives only the within cluster sum of squares by cluster, so the between variance part is missing,for calculation the following table, which I try to get. Number of | Variance within | Var between | Var total | F-value Cluster k | cluster | cluster
2011 Mar 31
1
Cluster analysis, factor variables, large data set
Dear R helpers, I have a large data set with 36 variables and about 50.000 cases. The variabels represent labour market status during 36 months, there are 8 different variable values (e.g. Full-time Employment, Student,...) Only cases with at least one change in labour market status is included in the data set. To analyse sub sets of the data, I have used daisy in the cluster-package to create
2005 Jul 25
1
cluster
Dear listers: Here I have a question on clustering methods available in R. I am trying to down-sampling the majority class in a classification problem on an imbalanced dataset. Since I don't want to lose information in the original dataset, I don't want to use naive down-sampling: I think using clustering on the majority class' side to select "representative" samples might
2009 Dec 11
1
cluster size
hi r-help, i am doing kmeans clustering in stats. i tried for five clusters clustering using: kcl1 <- kmeans(as1[,c("contlife","somlife","agglife","sexlife",                         "rellife","hordlife","doutlife","symtlife","washlife",                       
2006 Aug 18
2
R-update - what about packages and ESS?
Hi there, it seems that if I update R, it doesn't find previously installed packages anymore and is also not found by ESS. Actually the update has been done by our system administrator who assumed that there would be no problems with these things (I don't have root access to this system) and will perhaps not be too keen on installing everything else again. Is there any simple way how
2010 Feb 11
1
cluster/distance large matrix
Hi all, I've stumbled upon some memory limitations for the analysis that I want to run. I've a matrix of distances between 38000 objects. These distances were calculated outside of R. I want to cluster these objects. For smaller sets (egn=100) this is how I proceed: A<-matrix(scan(file, n=100*100),100,100, byrow=TRUE) ad<-as.dist(A)