thr3ads.net - similar to: "Cluster Analysis with minimum cluster size?"

Displaying 20 results from an estimated 10000 matches similar to: "Cluster Analysis with minimum cluster size?"

2006 Apr 05

"partitioning cluster function"

Hi All, For the function "bclust" (e1071), the argument "base.method" is explained as "must be the name of a partitioning cluster function returning a list with the same components as the return value of 'kmeans'". In my understanding, there are three partitioning cluster functions in R, which are "clara", "pam" and "fanny". Then I check each of them to

plot and validation in clustering

2006 Mar 20

Hi there, I use the functions "kmeans" and "clara" to cluster one flow cytometry dataset. By using the function "plot", the clusters obtained from "clara" can be graphed, while those from "kmeans" cannot. How can I get the plot of the clusters of "kmeans"? And, I hope to compare the two methods "kmeans" and "clara", or in other words, I
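As an illustration of the plotting question above, here is a minimal sketch, not taken from the thread. It assumes the built-in iris data as a stand-in for the flow cytometry dataset, and it colours points by the cluster vector that kmeans returns, once on two raw variables and once on the first two principal components.

```r
# Stand-in data: four numeric columns of the built-in iris dataset (assumption).
x <- as.matrix(iris[, 1:4])

set.seed(1)                                  # kmeans uses random starts
km <- kmeans(x, centers = 3, nstart = 10)

# Plot two of the original variables, coloured by cluster membership.
plot(x[, 1:2], col = km$cluster, pch = 19,
     main = "kmeans clusters (first two variables)")
points(km$centers[, 1:2], pch = 8, cex = 2)  # mark the cluster centres

# Alternative view: project all variables onto two principal components.
pc <- prcomp(x)
plot(pc$x[, 1:2], col = km$cluster, pch = 19,
     main = "kmeans clusters (first two principal components)")
```

The same colouring works for a clara result through its clustering component, which makes a side-by-side visual comparison possible.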

Re: Re: Find Closest 5 Cases?

2004 Feb 13

Art (and group), I'm doing this as a form of missing value analysis. Approximately 30% of the cases are missing data for one variable. To impute values for those cases, I'd like to match those cases that are missing the variable to all other cases and then take an average of those to infill. I realize there are many methods for imputing data. I'm not well versed on any in
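One way to read the approach described above is a nearest-neighbour mean fill. The following is a minimal sketch under stated assumptions, not the poster's code: the helper name impute_knn_mean, the default k = 5 (taken from the thread title), Euclidean distance on standardized predictors, and the toy data are all hypothetical, and the predictor columns are assumed to contain no missing values.

```r
# Hypothetical helper: fill NAs in `target` with the mean of the k nearest
# complete cases, measured by Euclidean distance on the other columns.
impute_knn_mean <- function(dat, target, k = 5) {
  preds  <- setdiff(names(dat), target)
  z      <- scale(as.matrix(dat[preds]))       # standardize the predictors
  miss   <- which(is.na(dat[[target]]))        # cases to fill
  donors <- which(!is.na(dat[[target]]))       # cases with an observed value
  for (i in miss) {
    # distance from case i to every donor
    d <- sqrt(colSums((t(z[donors, , drop = FALSE]) - z[i, ])^2))
    nearest <- donors[order(d)[seq_len(min(k, length(donors)))]]
    dat[[target]][i] <- mean(dat[[target]][nearest])
  }
  dat
}

# Toy data (assumption): the third case is missing y.
toy <- data.frame(a = c(1, 2, 3, 10, 11),
                  y = c(10, 20, NA, 100, 110))
filled <- impute_knn_mean(toy, target = "y", k = 2)
filled$y[3]   # mean of y for the two nearest cases (a = 2 and a = 1)
```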

Cluster analysis, defining center seeds or number of clusters

2009 Jun 11

I use kmeans to classify spectral events in high and low 1/3 octave bands:

#Do cluster analysis
CyclA<-data.frame(LlowA,LhghA)
CntrA<-matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA<-kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")

This works well when the actual data shows 1, 2 or 3 groups that are not "too close" in a cross plot.

weighted clustering (was: Re: problems with a large data set)

2001 Apr 27

kmeans and clara work great. Thank you for the tip. I have another question: Is it possible to weight the observations in a cluster analysis? I haven't found any mention of this in the kmeans or clara help texts. Moritz Lennert Chargé de recherche IGEAT - ULB tél: 32-2-650.65.16 fax: 32-2-650.50.92 email: mlennert at ulb.ac.be > On Wed, 25 Apr 2001, Moritz Lennert wrote: >

Setting a minimum number of observations within an individual cluster

2007 Jun 13

Hi, I'm trying to cluster a continuous dataset with a varying number of clusters and with a restriction that each cluster must have more than 'x' observations. I have tried the clara function, using silhouette to give me the neighbouring cluster medoid of each observation, then merging an observation from a cluster with fewer than 'x' obs. into its neighbour,
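A simplified variant of the merging idea can be sketched as follows. This is not the poster's clara/silhouette procedure: it swaps in a plain nearest-centroid rule, and the helper name merge_small_clusters, the min_size argument, and the toy data are hypothetical. It repeatedly takes the smallest cluster below the size limit and folds it into the cluster whose centroid is closest.

```r
# Hypothetical helper: merge clusters smaller than min_size into the cluster
# with the nearest centroid, until every cluster is large enough.
merge_small_clusters <- function(x, cl, min_size) {
  x  <- as.matrix(x)
  cl <- as.character(cl)                    # work with labels as strings
  repeat {
    sizes <- table(cl)
    if (length(sizes) <= 1 || min(sizes) >= min_size) break
    small <- names(sizes)[which.min(sizes)] # smallest offending cluster
    # centroids of all current clusters (one row per cluster label)
    cents <- apply(x, 2, function(col) tapply(col, cl, mean))
    d <- as.matrix(dist(cents))[small, ]    # distances from the small cluster
    d[small] <- Inf                         # never merge a cluster into itself
    cl[cl == small] <- names(d)[which.min(d)]
  }
  cl
}

# Toy data (assumption): one-dimensional, with a singleton cluster "3".
x  <- matrix(c(0, 0.1, 0.2, 5, 5.1, 5.2, 5.3, 9), ncol = 1)
cl <- c(1, 1, 1, 2, 2, 2, 2, 3)
merged <- merge_small_clusters(x, cl, min_size = 2)
table(merged)
```

Each pass removes one cluster, so the loop always terminates. Note that merging changes the number of clusters, which fits the "varying number of clusters" setting described in the post.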

clusterMI: Cluster Analysis with Missing Values by Multiple Imputation

2024 Mar 13

Dear all, I am pleased to announce the release of a new package named 'clusterMI' on CRAN. clusterMI allows clustering of incomplete observations by addressing missing values using multiple imputation. To achieve this goal, the methodology consists of three steps: 1. missing data imputation using tailored imputation models: four multiple imputation methods are proposed, two are

About clustering techniques

2008 Jul 29

Hello R users, I have been playing with a dataset for some time to do some cluster analysis. The data set consists of 14 columns being geographical coordinates and monthly temperatures in annual files: latitude - longitude - temperature 1 -..... - temperature 12. I have some missing values in some cases, maybe there are 8 monthly valid values at some points with four non valid. I don't want to

Empty cluster / segfault using vanilla kmeans with version 2.15.2

2013 Mar 13

Hello, here is a working reproducible example which crashes R using kmeans or gives empty clusters using the nstart option with R 2.15.2.

library(cluster)
kmeans(ruspini,4)
kmeans(ruspini,4,nstart=2)
kmeans(ruspini,4,nstart=4)
kmeans(ruspini,4,nstart=10)
?kmeans

Either we always get empty clusters or, after some further commands, a segfault. Regards, Detlef Groth ------------ [R] Empty

cluster size

2009 Dec 11

hi r-help, i am doing kmeans clustering in stats. i tried for five clusters clustering using: kcl1 <- kmeans(as1[,c("contlife","somlife","agglife","sexlife", "rellife","hordlife","doutlife","symtlife","washlife",

Removing leading and trailing spaces (string manipulation)

2004 Mar 31

Hi all, I'm running the following code to generate 40 different jpegs based on the resulting data. I'd like the file names to be 'Cluster1.jpeg', however the code writes filenames like 'Cluster 1 .jpeg'. How can I get rid of the unwanted spaces? I've looked at ?format and it doesn't seem to work - at least in this context. ################### ClusCount <- 40
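The poster's code is cut off above, so the cause is an assumption here: spaces like these typically come from paste(), whose default separator is a single space. A minimal sketch of the usual remedies, with a smaller ClusCount than the 40 in the post:

```r
ClusCount <- 3   # the original post uses 40; 3 keeps the example short

# paste() inserts sep = " " between its arguments by default.
spaced <- paste("Cluster", 1L, ".jpeg")

# Either set sep = "" explicitly or use paste0(), which has no separator.
compact    <- paste("Cluster", 1L, ".jpeg", sep = "")
file_names <- paste0("Cluster", seq_len(ClusCount), ".jpeg")

# For strings that already carry stray leading/trailing blanks, trimws()
# (available in base R since 3.2.0) removes them.
trimmed <- trimws("  Cluster1.jpeg  ")

print(spaced)
print(file_names)
```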

How to improve this code?

2004 Apr 04

Hi all, I've got some functioning code that I've literally taken hours to write. My 'R' coding is getting better...it used to take days :) I know I've done a poor job of optimizing the code. In addition, I'm missing an important step and don't know where to put it. So, three questions: 1) I'd like the resulting output to be sorted on distance (ascending) and

kmeans cluster stability

2001 Mar 13

I'm doing kmeans partitioning on a small (n=26) dataset that has 5 variables. I noticed that if I repeatedly run the same command, the cluster centers change and the cluster membership changes. Using RW1022 under Windows NT & Windows 2000 >kmeans(pottery[,1:5], 4, 20) [...snip] $size [1] 7 3 9 7 [...snip] $size [1] 7 10 4 5 [...snip] $size [1] 6 10 5 5 yields a different
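The run-to-run variation described above is expected when kmeans is given a number of clusters rather than explicit centres, because the starting centres are then drawn at random. A minimal sketch of making runs repeatable, assuming simulated data in place of the pottery dataset (which is not reproduced here): fix the random seed, and use several random starts so the best of them is kept.

```r
# Simulated stand-in data (assumption): two well-separated groups.
set.seed(123)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 4), ncol = 2))

# Same seed before each call gives the same random starts, hence the same fit.
set.seed(1)
a <- kmeans(x, centers = 2, nstart = 25)
set.seed(1)
b <- kmeans(x, centers = 2, nstart = 25)

identical(a$cluster, b$cluster)   # TRUE: the two runs agree exactly
a$size
```

In current versions of R the third positional argument of kmeans is iter.max, so a call of the form kmeans(x, 4, 20) raises the iteration limit rather than the number of random starts.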

Empty cluster / segfault using vanilla kmeans with version 2.15.2

2013 Feb 03

Dear experts, I am encountering a version-dependent issue. My laptop runs Ubuntu 12.04 LTS 64-bit, R 2.14.1; the issue explained below never occurred with this version of R. My desktop runs Ubuntu 11.10 64-bit, R 2.13.2; what follows applies to this setup. The data I'm clustering consists of the rows of a 320 x 6 matrix containing integers ranging from 1 to 7, no missing data. I applied

about clustering method

2006 Feb 27

Hi there, I'm doing some clustering analysis and trying to find all the algorithms related to clustering in R. Here is the list of the algorithms I found. But I'm not sure if it's the complete list. Could you please check it and see if there are other ones? Thank you very much! P.S.: List of the algorithms related to clustering: (1) Hierarchical methods hclust

Using kmeans given cluster centroids and data with NAs

2005 Mar 31

Hello, I have used the functions agnes and cutree to cluster my data (4977 objects x 22 variables) into 8 clusters. I would like to refine the solution using a k-means or similar algorithm, setting the initial cluster centres as the group means from agnes. However my data matrix has NA's in it and the function kmeans does not appear to accept this? > dim(centres) [1] 8 22 > dim(data)
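A minimal sketch of one workaround for the situation above, under stated assumptions: it simply drops incomplete rows before refining, uses stats::hclust as a stand-in for agnes, and uses the built-in iris data with two values blanked out instead of the poster's 4977 x 22 matrix. Whether discarding rows is acceptable depends on how much data is missing.

```r
# Stand-in data (assumption): iris measurements with two values set to NA.
x <- as.matrix(iris[, 1:4])
x[c(3, 50), 2] <- NA

# kmeans does not accept NAs, so keep only the complete rows.
ok <- complete.cases(x)
xc <- x[ok, ]

# Hierarchical step (hclust here as a stand-in for agnes), cut into k groups.
grp <- cutree(hclust(dist(xc)), k = 3)

# Group means become the initial centres: one row per group, one column per variable.
centres <- apply(xc, 2, function(col) tapply(col, grp, mean))

# Refine the hierarchical solution with k-means started from those centres.
km <- kmeans(xc, centers = centres)
table(hierarchical = grp, kmeans = km$cluster)
```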

outlier threshold

2005 Feb 25

For the analysis of financial data with a large variance, what is the best way to select an outlier threshold? Listed below, is there a best method to select an outlier threshold and how does R calculate it? In R, how do you find the outlier threshold through an interquartile range? In R, how do you find the outlier threshold using the hist command? In R, how do you find the outlier threshold
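For the interquartile-range question above, a common convention (an assumption here, since the thread's answers are not shown) is the boxplot rule: flag values more than 1.5 times the IQR below the first quartile or above the third. A minimal sketch with made-up numbers:

```r
# Made-up data (assumption): one value far from the rest.
x <- c(10, 12, 11, 13, 12, 11, 10, 95)

q   <- unname(quantile(x, c(0.25, 0.75)))  # first and third quartiles
iqr <- IQR(x)                              # same as q[2] - q[1]

lower <- q[1] - 1.5 * iqr                  # lower outlier threshold
upper <- q[2] + 1.5 * iqr                  # upper outlier threshold

outliers <- x[x < lower | x > upper]
c(lower = lower, upper = upper)
outliers
```

The multiplier 1.5 is a convention, not a property of the data. For heavy-tailed financial series a larger multiplier or a robust alternative may be more appropriate.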

kmeans: number of cluster centres must lie between 1 and nrow(x)

2011 Feb 01

Dear R, Can't I cluster a dataset into k clusters where k is exactly the number of observations? I have version 12.2 installed. See this example > a <- matrix(1:100, 20) > kmeans(a, 20) Error: number of cluster centres must lie between 1 and nrow(x) This is a bit ad-hoc but I know R from version 2.12 allows number of clusters to be one. So I guess allowing number of clusters to be

Calculate Distance and Aggregate Data?

2004 Feb 24

Hi all, I've been struggling to learn R and need to turn to the list again. I've got a dataset (comma-delimited file) with the following fields: recid, latitude, longitude, population, dwelling and age. For each observation, I'd like to calculate the total number of people and dwellings and average age within 2 km. Distance could be Euclidean, however, a proper distance
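The calculation described above can be sketched with a great-circle (haversine) distance. This is an illustration under stated assumptions, not the thread's answer: the helper name haversine_km, the Earth radius of 6371 km, the toy coordinates, and the decision to count each observation in its own 2 km neighbourhood are all choices made here. The column names follow the fields listed in the post.

```r
# Hypothetical helper: great-circle distance in km between points given in degrees.
haversine_km <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Toy data (assumption): three points within about 1.1 km, one about 111 km away.
dat <- data.frame(recid      = 1:4,
                  latitude   = c(50.000, 50.005, 50.010, 51.000),
                  longitude  = c(4, 4, 4, 4),
                  population = c(100, 200, 300, 400),
                  dwelling   = c(40, 80, 120, 160),
                  age        = c(30, 40, 50, 60))

# For each observation, aggregate over all observations within 2 km
# (the observation itself is included, since its distance is 0).
res <- t(sapply(seq_len(nrow(dat)), function(i) {
  d    <- haversine_km(dat$latitude[i], dat$longitude[i],
                       dat$latitude, dat$longitude)
  near <- d <= 2
  c(recid        = dat$recid[i],
    pop_2km      = sum(dat$population[near]),
    dwell_2km    = sum(dat$dwelling[near]),
    mean_age_2km = mean(dat$age[near]))
}))
res
```

This compares every pair of observations, which is fine for a few thousand rows but grows quadratically with the size of the file.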
