similar to: Cluster Analysis

Displaying 20 results from an estimated 200 matches similar to: "Cluster Analysis"

2010 Dec 02
1
kmeans() compared to PROC FASTCLUS
Hello all, I've been comparing results from kmeans() in R to PROC FASTCLUS in SAS and I'm getting drastically different results with a real-life data set. Even with a simulated data set with very well separated clusters, starting from the same seeds, the resulting cluster means are still different. I was hoping to look at the source code of kmeans(), but it's in C and FORTRAN and
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs. I would like to propose how I plan to go about improving and getting a system that can be integrated into Xapian in this GSoC for the clustering branch. I have identified three areas of work which were not touched last time. 1) Automated Performance Analysis I had roughly implemented 2 evaluation techniques previously (Distance b/w document and centroids within clusters and
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of
2012 Mar 23
1
how to cluster rows of words in a text file
Hi: I am trying to cluster the rows of a text file with kmeans: I load the data as follows file1 <- read.csv("somefile.csv") and the file can be viewed having the following line of words > file1 1 word1 word3 word4 word1 2 word1 word4 word3 word1 3 word4 word2 word4 word3 4 word4 word2 word1 word3 5 word2 word2 word4 word2 file_as_matrix <- as.matrix(file1); Now,
2010 Apr 24
4
DICE Coefficient of similarity measure
Hi, I wanted the DICE coefficient (similarity measure for binary variables) to be calculated in R and found that the "igraph" package has the option of "similarity.dice" to do this. But, for this command, the input object should be an igraph object. But, I have a dataframe of columns containing 1's and 0's. Can I convert this dataframe into an igraph object, so that
2012 Apr 13
4
Help with stemDocument
Hi, All: I am new to R and the tm package. I'm trying to do the stemming using tm_map() and it doesn't seem to work. I used: > stemDocument(t_cmts[[100]]) where t_cmts is the corpus object. The result is: bottle loose box abt airpak sections top plastic bottle squashed nearly flush neck previous shipments bottle wrapped securely bubble wrap wno bottle damage packaging poor
2008 Apr 13
0
Calinski and Harabasz Index for Cluster Determination with Diana in R
Hello all, I have a set of data points for which I have pairwise distances. I managed to create a dendrogram for this data set using diana() in R, however this only gives me the tree and not the clusters themselves. I am trying to determine clusters using the Calinski and Harabasz Index (CH Index). I, however, cannot find how to accomplish this using R. Is there anyone who could help me with this? I
2002 Feb 20
2
Clustering and Calinski's index
I have to solve a clustering problem. My first step is to determine the number of clusters, which is why I'm using the Calinski index ( [tr(b)/(k-1)]/[tr(w)/(n-k)] ), which I try to maximize to find the best number of clusters. A function is already implemented in R to calculate this index: clustIndex(cl, x, index="calinski") where cl is the result of a clustering method ,
2010 Apr 05
0
Agnes in Cluster Package and index.G1 in the clusterSim package questions
Dear R Users: I am new to R and I am trying to do a cluster analysis on a single continuous variable using the Agnes [Agglomerative Nesting (Hierarchical Clustering) ] in the Package ‘cluster’. I was able to apply this clustering method to my data: ward1 <- Agnes(balances, diss= FALSE, metric = "euclidean", stand = TRUE, method = "ward", keep.diss =TRUE, keep.data =
2011 Feb 05
1
different results in MASS's mca and SAS's corresp
Dear list: I have tried MASS's mca function and SAS's PROC corresp on the farms data (included in MASS, also used as mca's example), the results are different: R: mca(farms)$rs: 1 2 1 0.059296637 0.0455871427 2 0.043077902 -0.0354728795 3 0.059834286 0.0730485572 4 0.059834286 0.0730485572 5 0.012900181 -0.0503121890 6
2007 Mar 06
4
R and SAS proc format
Dear all, Is there an R equivalent to SAS's proc format? Best regards J. Lamack
2010 Jul 13
1
Equivalent of SAS's FIRST. And LAST. Variable in R?
Hi all, I'm just wondering if there is an equivalent of SAS's FIRST. and LAST. variables in R? For example, suppose this is a snapshot of the data: ClientCode CaseCode open close Important 1 37 28 2003-07-08 2003-09-02 1 2 37 310 2003-11-01 2004-09-10 1 3 37 1562 2007-04-03 2007-07-27 1 4
2011 Sep 09
2
NMDS plot and Adonis (PerMANOVA) of community composition with presence absence and relative intensity
Hi! Thanks for providing great help in R-related statistics. Now, however, I'm stuck. I'm not a statistics person but I was recommended to use R to perform an NMDS plot and PerMANOVA of my dataset. Samples (treatments) are in the columns and species (OTUs) in the rows. I have 4 treatments (Ambient temperature, Ambient temperature+Low pH, High temperature, High temperature+Low pH), and I have 16
2004 Apr 05
3
Selecting Best Regression Equation
Dear all, Does R or S-plus or any of their packages provide any command to perform any of the following procedures to find the Best Regression Equation - 1. 'All Possible Regressions Procedures' (is there any automated command to perform 2^p regressions and order them according to criteria R2(adj), Mallows' Cp, s2, without setting up all the regression models manually), 2. 'Backward
2004 Mar 09
1
Package cclust error
Hello, here is my problem. After looking at the mail archives, I found a description of the error I get when I use this package. At first I even thought that they were showing how to solve it. But saying "the programmer forgot drop=FALSE" doesn't show me how to get rid of the problem. I have looked inside the package very quickly and I found three
2005 Apr 07
2
about mantelhaen.test (PR#7779)
Full_Name: Chien-yu Peng Version: 2.0.1 OS: Windows XP Professional Submission from: (NULL) (140.109.72.181) Dear all: Although I don't know you, I am thankful for your help. When I use the function mantelhaen.test for an R x C x K (R, C > 2) table, the output is not the same as SAS's. I don't know whether the result is consistent with any of SAS's. But it works correctly for 2
2008 Feb 08
2
R version of SAS Proc Varclus
I am interested in finding an R version of SAS "Proc Varclus". SAS's Proc Varclus implements an oblique cluster analysis based on principal components. How can I find out if R has a package that runs the same algorithm implemented in SAS "Proc Varclus"? Thank you, Mary Helen Black
2010 Nov 11
2
Consistency of Logistic Regression
Dear R developers, I have noticed a discrepancy between the coefficients returned by R's glm() for logistic regression and SAS's PROC LOGISTIC. I am using dist = binomial and link = logit for both R and SAS. I believe R uses IRLS whereas SAS uses Fisher's scoring, but the difference is something like 100 SE on the intercept. What accounts for such a huge difference? Thank you for
2007 Mar 09
2
piecing together statements (macro?)
Hi All, I am pretty new to R but saw Stata's and SAS's macro facilities and am looking for how such things work in R. I am trying to piece together a series of statements: n = 5 #want to have it dynamic with respect to n for (j in 1:n) { eval(paste("x", j, "=x[", j, "]", sep="")) } I want the created statements 'x1=x[1]' immediately executed
2004 Mar 15
2
R equiv to proc gremove in maps package
Is there an R equivalent to SAS's proc gremove? You would use this procedure to combine the units on an existing map, for example to build a map of Metropolitan Statistical Areas (MSAs) from the [US] counties dataset where the internal boundaries surround the MSAs (which are groups of counties) rather than the individual counties. I can imagine the mechanism would be to find and erase the