similar to: k-means clustering

Displaying 20 results from an estimated 3000 matches similar to: "k-means clustering"

2007 Oct 11
2
reference for logistic regression
Dear list, first accept my apologies for asking a non-R question. Can anyone point me to a good reference on logistic regression? Web or book references would be great. I am interested in the use and interpretation of dummy variables and prediction models. I checked the contributed section in the CRAN homepage but could not find anything (Julian Faraway's "Practical Regression and ANOVA
2007 Oct 01
3
mean of subset of rows
Dear list, this must be an easy one: I have a data.frame of two columns, "ID" with four different levels (A to D) and numerical "size", and each of the 4 different IDs is repeated a different number of times. I would like to get the mean size for each ID as another data.frame. I have tried the following: >ID= as.character(unique(data[,1])) # I use unique() because
2008 Mar 11
2
Design's validate() output
Dear list, is there anywhere I could find further information on how to interpret the output of validate() from the Design package for a logistic regression? I tried ?validate and Google but I cannot find information on what the rows and the columns represent. Thanks, David
2008 Feb 08
2
correlation
Dear list, I would like to compare two measurements of disease severity (M1 and M2); one of them is continuous (M1, ranging from 1 to 10) and the other is ordinal (M2 takes low, medium, high and very high). Do you think it is OK to use the cor() function to test whether the two agree, i.e. correlate? I am afraid that if I set M2 to 1, 2, 3 and 4, the function cor() will take them as continuous and
2008 Jan 21
2
summary of categorical variables
Dear list, I have a data.frame with nine categorical variables (0, 1, 2 and NAs), and I would like to get the number of events for each of them. I can extract this using summary() for one variable at a time with as.factor() (otherwise it gives me the mean value): >summary(as.factor(mydf[,3])) 0 1 2 NA's 194 67 4 2 Trying to use apply() to get this for
2008 Jan 18
1
histogram with NAs
Dear list, I have a categorical variable in a data.frame that I would like to plot using a histogram to show the number of events. Values are 0, 1 and some NAs. I can't get the hist() function to 1) include a column with the number of NAs or 2) make the x axis categorical; I always get 0, 0.2, 0.4, ... 1 divisions. Can anyone help me? This is my code. "database" is my data.frame and
2016 Jul 26
3
K MEANS clustering
Hello, I've been working on the KMeans clustering algorithm recently, and for the past week I have been stuck on a problem I cannot find a solution to. Since we represent documents as TF-IDF vectors, they are really sparse vectors (a usual corpus can have around 5000 terms). So it gets really difficult to represent these sparse vectors in a way that would be
2008 May 16
4
reading and analyzing a text file
Dear list, I have a text file from a scanner that includes 20 lines of text (scanner settings) before it actually starts showing the readings in a tabular format (headings are ID, intensity, background and a few others). I am a biologist with some experience using R, and my question is whether it is possible to read this file into an R workspace and store the actual readings in a data frame,
2016 Jul 27
2
K MEANS clustering
Hey Parth, Thanks for the reply. I am considering implementing a cosine distance metric too, along with Euclidean distance, because of the dimensionality issue that comes with K-Means and the Euclidean distance metric. That does help when we deal with sparse vectors for documents. The particular problem I'm having is representing centroids in an efficient way. For example, when we find the mean
2012 Aug 28
1
K-Means clustering Algorithm
I was wondering if there is an R equivalent to the two-phase approach that MATLAB uses in performing the k-means algorithm. If not, is there a way that I can determine whether kmeans in R and kmeans in MATLAB are giving me essentially the same clustering information, within a small amount of error?
2019 Apr 29
2
Handling CMY(K?) colors according to variable values.
Good evening; I have a question that I suppose someone here has already solved, because I am finding it hard to understand something I assume is easy. I want to obtain color ranges according to the values of 3 or 4 suitably scaled numeric variables. Suppose the numeric variables are X, Y, Z; each variable would correspond to a color; say X = C (cyan), Y = M (magenta) and Z = Y
2007 May 13
2
Some questions on repeated measures (M)ANOVA & mixed models with lme4
Dear R Masters, I'm an anesthesiology resident trying to make his way through basic statistics. Recently I have been confronted with longitudinal data in a treatment vs. control analysis. My dataframe is in the form of: subj | group | baseline | time | outcome (long) or subj | group | baseline | time1 |...| time6 | (wide) The measured variable is a continuous one. The null hypothesis in
2011 Nov 05
1
3-D ellipsoid equations
Hello, The parametric equations of an ellipsoid can be written in terms of spherical coordinates. The three spherical coordinates are converted to Cartesian coordinates by X = a cos(α) sin(θ), Y = b sin(α) sin(θ), Z = c cos(θ) for α and θ. The parameter α varies from 0 to 2π and θ varies from 0 to π. Here (X_o, Y_o, Z_o) is the center of the ellipsoid, and θ is the angle
2007 Jun 24
2
ANOVA non-sphericity test and corrections (eg, Greenhouse-Geisser)
I'm an experimental psychologist and when I run ANOVA analyses in SPSS, I normally ask for a test of non-sphericity (Box's M-test). I also ask for output of the corrections for non-sphericity, such as Greenhouse-Geisser and Huynh-Feldt. These tests and correction factors are commonly used in the journals for experimental and other psychology reports. I have been switching from SPSS to R
2011 Jun 17
4
Bartlett's Test of Sphericity
Hello dear R users, I want to conduct a principal components analysis and I need to run two tests to check whether I can do it or not. I found how to run the KMO test; however, I cannot find an R function for Bartlett's test of sphericity. Does somebody know if it exists? Thanks for your help! Thibault
2001 Dec 13
2
k-means with Euclidean distance but no coordinates
Hi, I'm trying to build a thesaurus that will give sensible values for rare words. I suspect the best algorithm to use is k-means although I'm not sure about that -- I would have preferred a k-dimensional space with a binary cluster in each dimension so a word can belong to 0..k clusters, but I digress... I can measure the strength of correlation between words fairly easily by counting
2009 Nov 09
1
Getting Sphericity Tests for Within Subject Repeated Measure Anova (using "car" package) (Adjusted Dataset)
[corrected dataset below] Hello everyone, I am trying to do a within-subjects repeated-measures ANOVA followed by the test of sphericity (sample dataset below). I am able to get either a mixed-model or linear-model ANOVA and TukeyHSD, but have no luck with Repeated-Measures Assuming Sphericity or Separate Sphericity Tests. I am trying to follow the example from the "car" package, but it seems
2002 Jan 28
4
Multivariate response trees
I would like to know if someone has done work on trees with multivariate response. I need something like rpart but for vector responses. If someone has code that he/she is willing to share, I would be grateful. If not, even guidelines for writing my own starting from rpart would be welcomed. ft. -- Fernando TUSELL e-mail: Departamento de
2009 Mar 03
1
repeated measures anova, sphericity, epsilon, etc
I have 3 questions (below). Background: I am teaching an introductory statistics course in which we are covering (among other things) repeated measures anova. This time around teaching it, we are using R for all of our computations. We are starting by covering the univariate approach to repeated measures anova. Doing a basic repeated measures anova (univariate approach) using aov() seems
2006 Mar 20
1
does lme repeated measures require sphericity?
I haven't been able to find a direct answer on this, only implied ones. In several places I have read that when people asked for sphericity tests they were guided toward lme or mlm models. But there is no direct indication that the lme method is not subject to the sphericity assumption. In fact, it seems like it should be. It's just a linear model that handles random and