Displaying 20 results from an estimated 3000 matches similar to: "maxitems in cluster validity"
2010 Oct 22
2
(no subject)
I am doing cluster analysis on 8768 respondents on 5 lifestyle variables and am having difficulty constructing a dissimilarity matrix which I will use for PAM. I always get an error: “cannot allocate vector of size 293.3 Mb” even though I have already increased my memory to its limit of 4000. I did it on a 2GB, 32-bit OS. I tried ff and filehash and I still get the same error. Can you please
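As a rough sanity check on the figure in that error message, the size of a dissimilarity object for 8768 cases can be estimated by hand. This is a minimal sketch that assumes the dissimilarities are stored as the lower triangle only, in 8-byte doubles (as a `dist` object does); the variable names are illustrative.

```r
n <- 8768                          # number of respondents, from the post
n_pairs <- n * (n - 1) / 2         # entries in the lower triangle
size_mib <- n_pairs * 8 / 1024^2   # 8 bytes per double, converted to MiB
size_mib                           # close to the size quoted in the error
```

So the allocation that fails is about the size of one full copy of the dissimilarity vector, and it has to fit in a single contiguous block.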
2008 Jun 13
3
cluster.stats
Dear list,
I just tried to use the function cluster.stats in the package fpc.
I just have a couple of questions about the syntax:
cluster.stats(d,clustering,alt.clustering=NULL,
silhouette=TRUE,G2=FALSE,G3=FALSE)
1) the distance object (d) is an object obtained by the function dist() on
my own original matrix?
2) clustering is the clusters vector as result of one of the many clustering
methods?
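A minimal sketch of how the two arguments asked about are usually built, assuming toy data and average-linkage clustering (neither is from the original post). The call to `fpc::cluster.stats` is guarded because it needs the fpc package to be installed.

```r
# Toy data; object names are illustrative, not from the original post.
set.seed(1)
x <- matrix(rnorm(60), ncol = 3)

d <- dist(x)                                  # "dist" object from the raw data matrix
clustering <- cutree(hclust(d, method = "average"), k = 3)  # integer labels 1..3

# Only this step needs fpc, so skip it when the package is missing.
if (requireNamespace("fpc", quietly = TRUE)) {
  stats <- fpc::cluster.stats(d, clustering)
  print(stats$avg.silwidth)                   # average silhouette width
}
```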
2010 Oct 29
1
transposing a column table
Dear R-user,
I need help on how to transpose this clustering vector in R, with 8768 entries derived from a PAM clustering output, into a vertical (one entry per row) view in an Excel file
Clustering vector:
[1] 1 1 2 2 1 2 1 2 1 1 2 2 1 2 2 2 2 1 1 1 1 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 2
[38] 2 1 1 2 2 2 2 2 1 2 1 2 2 2 2 1 2 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2
[75] 2 1 2 2 2 2 2 2 2 1 1 2 1 2 2 2 2 2
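One way to get a vector like this into a single spreadsheet column is to wrap it in a data frame and write a CSV file, which Excel opens directly. This is a sketch under assumptions: the short vector stands in for the real 8768-entry one (for a `pam` fit it would be the `clustering` component), and the file name is hypothetical.

```r
# Stand-in for the full clustering vector (e.g. fit$clustering from pam).
clustering <- c(1, 1, 2, 2, 1, 2, 1, 2)

# One row per respondent, so the labels run down a column.
out <- data.frame(id = seq_along(clustering), cluster = clustering)

path <- file.path(tempdir(), "clustering.csv")   # hypothetical file name
write.csv(out, path, row.names = FALSE)
```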
2005 Sep 29
5
Regression slope confidence interval
Hi list,
is there any direct way to obtain confidence intervals for the regression
slope from lm, predict.lm or the like?
(If not, is there any reason? This is also missing in some other statistics
software, and I thought this would be quite a standard application.)
I know that it's easy to implement but it's for
explanation to people who faint if they have to do their own
programming...
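In current versions of R, `confint()` applied to an `lm` fit returns confidence intervals for the coefficients, including the slope. A minimal sketch on simulated data (the data and names are made up for illustration):

```r
set.seed(42)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20)      # simulated response, true slope 0.5

fit <- lm(y ~ x)
ci <- confint(fit, "x", level = 0.95)   # interval for the slope only
ci
```

Leaving out the second argument gives intervals for all coefficients at once.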
2010 Apr 24
4
DICE Coefficient of similarity measure
Hi,
I wanted the DICE coefficient (similarity measure for binary variables)
to be calculated in R and found that the "igraph" package has the option
of "similarity.dice" to do this. But, for this command, the input object
should be an igraph object. But, I have a dataframe of columns
containing 1's and 0's. Can I convert this dataframe into an igraph
object, so that
2005 Aug 08
2
selecting outliers
Hi everybody,
I'd like to know if there's an easy way of extracting
outlier records from a dataset, in order to perform
further analysis on them.
Thanks
Alessandro
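A minimal sketch of one possible approach, assuming "outlier" means a value beyond the boxplot whiskers (1.5 times the hinge spread) in a single numeric column. That definition, the data, and the column names are assumptions for illustration, not part of the question.

```r
dat <- data.frame(id = 1:10,
                  value = c(10, 12, 11, 13, 12, 11, 95, 12, 10, -40))

# Values that boxplot() would draw as individual points.
out_vals <- boxplot.stats(dat$value)$out

# Keep the full records so they can be analysed further.
outliers <- dat[dat$value %in% out_vals, ]
outliers
```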
2011 Jun 09
1
k-nn hierarchical clustering
Hi there,
is there any R-function for k-nearest neighbour agglomerative hierarchical
clustering?
By this I mean standard agglomerative hierarchical clustering as in hclust
or agnes, but with the k-nearest neighbour distance between clusters used
on the higher levels where there are at least k>1 distances between two
clusters (single linkage is 1-nearest neighbour clustering)?
Best regards,
2006 Aug 09
2
R CMD check error
Dear list,
R CMD check on my updated package now generated the following error:
"LaTeX errors when creating DVI version.
This typically indicates Rd problems."
But the Rd files (and everything else) were checked as "OK" (I
removed the problem about which I asked the list some hours ago, but
answers are still appreciated because I rather created a rough
workaround than
2011 Feb 28
1
mixture models/latent class regression comparison
Dear list,
I have been comparing the outputs of two packages for latent class
regression, namely 'flexmix', and 'mmlcr'. What I have noticed is that
the flexmix package appears to come up with a much better fit than the
mmlcr package (based on logLik, AIC, BIC, and visual inspection). Has
anyone else observed such behaviour? Has anyone else been successful
in using the mmlcr
2005 Aug 08
2
computationally singular
Hi,
I have a dataset which has around 138 variables and 30,000 cases. I am
trying to calculate a mahalanobis distance matrix for them and my
procedure is like this:
Suppose my data is stored in mymatrix
> S<-cov(mymatrix) # this is fine
> D<-sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
Error in solve.default(cov, ...) : system is computationally
2010 Sep 01
2
Rd-file error: non-ASCII input and no declared encoding
Dear list,
I came across the following error for three of my newly written Rd-files:
non-ASCII input and no declared encoding
I can't make sense of this.
Below I copied in one of the three files.
Can anybody please tell me what's wrong with it?
Thank you,
Christian
\name{tetragonula}
\alias{tetragonula}
\alias{tetragonula.coord}
\docType{data}
% \non_function{}
\title{Microsatellite
2006 Nov 01
1
cluster analysis using Dmax
Dear All,
a long time ago I ran a cluster analysis where the dissimilarity matrix used
consisted of Dmax (or Kolmogorov-Smirnov distance) values. In other words
the maximum difference between two cumulative proportion curves. This all
worked very well indeed. The matrix was calculated using Dbase III+ and
took a day and a half and the clustering was done using MV-ARCH, with the
resultant
2009 May 18
1
Save Cluster results to data frame
If I cluster my data into 3 sets, using pam for instance, is there a way
to save the resulting cluster assignments to the originating data frame?
And, related to that, how do I change the cluster names to something a
bit more meaningful than 1..2...3?
So it goes like this.
Data ---> Cluster into 3 groups ----> given them meaningful names
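The workflow above can be sketched as follows. To stay within base R this uses `kmeans` rather than `pam`; with a `pam` fit the labels would come from its `clustering` component instead. The data and the group names are hypothetical, and since cluster numbering is arbitrary the names should be assigned after inspecting the clusters.

```r
set.seed(1)
df <- data.frame(x = c(rnorm(10, 0), rnorm(10, 5), rnorm(10, 10)),
                 y = c(rnorm(10, 0), rnorm(10, 5), rnorm(10, 10)))

fit <- kmeans(df, centers = 3, nstart = 5)   # cluster before adding new columns

# Store the labels in the original data frame, with readable names.
df$cluster <- factor(fit$cluster, levels = 1:3,
                     labels = c("groupA", "groupB", "groupC"))
table(df$cluster)
```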
2010 Oct 10
1
Package "prabclus" not available?
Hi there,
I just tried to install the package prabclus on a computer running Ubuntu
Linux 9.04 using install.packages from within R.
This gave me a message:
Warning message:
In install.packages("prabclus") : package ‘prabclus’ is not available
I tried to do this selecting two different CRAN mirrors (same result) and
with other packages (installing them works fine).
Looking up the
2010 Jul 02
2
K-means result - variance between cluster
Hi,
I would like to present the results from the clustering method k-means in
terms of variances: within and between clusters. The k-means object
gives only the within-cluster sum of squares by cluster, so the between
variance part is missing for calculating the following table, which I
am trying to get.
Number of | Variance within | Var between | Var total | F-value
Cluster k | cluster | cluster
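In current versions of R the object returned by `kmeans` also carries `totss`, `tot.withinss` and `betweenss`, so the quantities for such a table can be read off directly. The sketch below uses simulated data and assumes the F-value meant is the usual ratio of between to within mean squares; that interpretation is an assumption, not something stated in the post.

```r
set.seed(1)
x <- rbind(matrix(rnorm(40, 0), ncol = 2),
           matrix(rnorm(40, 4), ncol = 2))
k <- 2
n <- nrow(x)

fit <- kmeans(x, centers = k, nstart = 5)

within  <- fit$tot.withinss   # sum of within-cluster sums of squares
between <- fit$betweenss      # between-cluster sum of squares
total   <- fit$totss          # total sum of squares

# Assumed definition: between mean square over within mean square.
f_value <- (between / (k - 1)) / (within / (n - k))
c(within = within, between = between, total = total, F = f_value)
```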
2011 Mar 31
1
Cluster analysis, factor variables, large data set
Dear R helpers,
I have a large data set with 36 variables and about 50.000 cases. The
variables represent labour market status during 36 months, there are 8
different variable values (e.g. Full-time Employment, Student,...)
Only cases with at least one change in labour market status are
included in the data set.
To analyse sub sets of the data, I have used daisy in the
cluster-package to create
2005 Jul 25
1
cluster
Dear listers:
Here I have a question on clustering methods available in R. I am
trying to down-sample the majority class in a classification problem
on an imbalanced dataset. Since I don't want to lose information in
the original dataset, I don't want to use naive down-sampling: I think
using clustering on the majority class' side to select
"representative" samples might
2009 Dec 11
1
cluster size
hi r-help,
I am doing kmeans clustering in stats. I tried a five-cluster solution using:
kcl1 <- kmeans(as1[,c("contlife","somlife","agglife","sexlife",
"rellife","hordlife","doutlife","symtlife","washlife",
2006 Aug 18
2
R-update - what about packages and ESS?
Hi there,
it seems that if I update R, it doesn't find previously installed packages
anymore and is also not found by ESS.
Actually the update has been done by our system administrator who assumed
that there would be no problems with these things (I don't have root
access to this system) and will perhaps not be too keen on installing
everything else again.
Is there any simple way how
2010 Feb 11
1
cluster/distance large matrix
Hi all,
I've stumbled upon some memory limitations for the analysis that I want to
run.
I've a matrix of distances between 38000 objects. These distances were
calculated outside of R.
I want to cluster these objects.
For smaller sets (e.g. n=100) this is how I proceed:
A<-matrix(scan(file, n=100*100),100,100, byrow=TRUE)
ad<-as.dist(A)