I have been asked to forward this. Please reply directly or include the
people who have been CC-ed in this e-mail. Thank you.

> forwarded message from "Timothy Waters" <timothy.waters at plant-sciences.oxford.ac.uk> -----
>
> Consider the following problem. You have a dataset with approx 190
> datapoints. Each datapoint has between 7 and 16 dimensions known: most have
> 7, a few have 16, many have 14. The ones that have seven are divided into
> two categories, such that the vast bulk fall into a category with dimensions
> 1,2,3,4,5,6,7 known, and the others have dimensions 8,9,10,11,12,13,14
> known.
>
> So, a fairly difficult dataset, but anyway.
>
> Now, on to the analysis. You wish to look for the presence of any form of
> multivariate structuring to the data, specifically, discrete clusters
> identified by combinations of one or more variables. Clearly you have a
> couple of options. You can just get it to produce a dendrogram (strictly a
> phenogram in biological terms), or you can ask it to cluster the data into
> some number n of sets, where 1 =< n =< 10 (for present purposes). You can
> then look at each possible solution (i.e. each value of n) and examine what
> discriminant function analysis tells you about the ease of separation of the
> clusters you have identified.
>
> BUT, can you give a program a dataset (i.e., this dataset) and say "Find n,
> where n is the number of clusters that the data is structured into, such
> that the statistical differences between clusters are maximally
> significant."
>
> Thanks for any help,
>
> Tim.
>
> -----End of forwarded message from "Timothy Waters" <timothy.waters at plant-sciences.oxford.ac.uk> -----

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Fri, Aug 09, 2002 at 12:10:38PM +0100, Adaikalavan Ramasamy wrote:
> I have been asked to forward this. Please reply directly or include the
> people who have been CC-ed in this e-mail. Thank you.
>
> > forwarded message from "Timothy Waters" <timothy.waters at plant-sciences.oxford.ac.uk> -----
> >
> > Consider the following problem. You have a dataset with approx 190
> > datapoints. Each datapoint has between 7 and 16 dimensions known: most have
> > 7, a few have 16, many have 14. The ones that have seven are divided into
> > two categories, such that the vast bulk fall into a category with dimensions
> > 1,2,3,4,5,6,7 known, and the others have dimensions 8,9,10,11,12,13,14
> > known.
> >
> > So, a fairly difficult dataset, but anyway.
> >
> > Now, on to the analysis. You wish to look for the presence of any form of
> > multivariate structuring to the data, specifically, discrete clusters
> > identified by combinations of one or more variables. Clearly you have a
> > couple of options. You can just get it to produce a dendrogram (strictly a
> > phenogram in biological terms), or you can ask it to cluster the data into
> > some number n of sets, where 1 =< n =< 10 (for present purposes). You can
> > then look at each possible solution (i.e. each value of n) and examine what
> > discriminant function analysis tells you about the ease of separation of the
> > clusters you have identified.
> >
> > BUT, can you give a program a dataset (i.e., this dataset) and say "Find n,
> > where n is the number of clusters that the data is structured into, such
> > that the statistical differences between clusters are maximally
> > significant."

kmeans (package mva), pam, fanny and clara (package cluster) are likely to
accomplish such a job...

(but the fact that you have two groups of data defined in two different
subspaces could cause harm... if they are orthogonal, maybe it would make
more sense to split your data into two sets...)

Hopin' it helps,

L.
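
As a rough illustration of the suggestion above, here is a minimal sketch of choosing n with pam from package cluster. It is only one possible approach, not something the thread prescribes. It swaps in the average silhouette width as the criterion for "best n", which is not the same thing as a significance test on a discriminant analysis. The matrix x is made-up stand-in data (190 points, 7 complete dimensions), and the names ks, avg_width and best_k are hypothetical. Silhouette widths are undefined for a single cluster, so the search runs over n = 2..10 rather than 1..10.

```r
library(cluster)  # provides pam(), which also reports silhouette widths

set.seed(1)

# Hypothetical stand-in data: 190 points in 7 dimensions, built from three
# well-separated groups. Replace with the real (complete-case) data matrix.
x <- rbind(
  matrix(rnorm(60 * 7, mean = 0),  ncol = 7),
  matrix(rnorm(60 * 7, mean = 5),  ncol = 7),
  matrix(rnorm(70 * 7, mean = 10), ncol = 7)
)

# Candidate numbers of clusters (silhouettes need at least 2 clusters).
ks <- 2:10

# Partition around medoids for each k and record the average silhouette width.
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)

# Assumed criterion: the k with the largest average silhouette width.
best_k <- ks[which.max(avg_width)]

print(data.frame(k = ks, avg_width = round(avg_width, 3)))
cat("k with the largest average silhouette width:", best_k, "\n")
```

If the two subspaces really are best treated separately, the same loop could be run once on the rows with dimensions 1 to 7 known and once on the rows with dimensions 8 to 14 known.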