Dear R-help,

In my Master's thesis I measured 10 variables in 18 lakes. The measurements were taken 4 times a year at 3 depths, so I have 12 samples from each lake. I know that the 12 samples cannot be treated as replicates, since they do not correspond to the same environmental conditions and are not statistically independent, but I want to use them as an estimate of the annual range of the 10 variables in each of the 18 lakes.

I want to run a cluster analysis of the 18 lakes, and the options I know of are:
1- Average the 12 samples from each lake and cluster the averages (using Ward's method);
2- Use all 216 samples (18*12) in the clustering (which yields a mess).

But I thought I could start the clustering algorithm with 18 clusters (lakes), each already containing 12 individuals (samples), and then proceed with the calculations as usual (using Ward's method). That way I would obtain a clustering of the 18 lakes that uses all 12 samples.

I got the Fortran clustering algorithm and I am trying to translate it into R to see how it works and maybe implement this kind of cluster-of-clusters analysis.

Does anyone know whether there is an algorithm that does this? I actually did it by hand and got very good and meaningful results, but I want to implement it so I can try other merging criteria.

Thanks

Diego Pujoni
Zooplankton Ecology Laboratory
Biological Sciences Institute
Federal University of Minas Gerais
Brazil
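The idea of starting Ward's method from 18 pre-formed groups can be sketched with the `members` argument of `hclust()`, which lets the algorithm begin from clusters instead of singletons. This is only a sketch under stated assumptions: the data are simulated, the names `lake`, `centroids` and `sizes` are hypothetical, the method name `"ward.D"` assumes R >= 3.1.0, and using squared Euclidean distances between lake centroids as the starting dissimilarity is my assumption (with equal group sizes it differs from Ward's between-group criterion, n_i*n_j/(n_i+n_j) times the squared centroid distance, only by a constant factor).

```r
set.seed(42)

n_lakes <- 18   # lakes
n_samp  <- 12   # samples per lake (4 seasons x 3 depths)
n_var   <- 10   # measured variables

# Simulated stand-in for the real measurements (assumption: no real data here)
lake <- factor(rep(sprintf("Lake%02d", 1:n_lakes), each = n_samp))
x <- scale(matrix(rnorm(n_lakes * n_samp * n_var), ncol = n_var))

# One centroid per lake (rows follow the factor levels of 'lake')
centroids <- rowsum(x, lake) / n_samp

# Number of samples behind each centroid, in the same order
sizes <- as.vector(table(lake))

# Start the agglomeration from 18 clusters of 12 samples each:
# 'd' is a dissimilarity between clusters, 'members' their sizes
d2 <- dist(centroids)^2
hc_lakes <- hclust(d2, method = "ward.D", members = sizes)

# Example: cut the lake dendrogram into 4 groups
groups <- cutree(hc_lakes, k = 4)
table(groups)
```

Calling `plot(hc_lakes)` would draw the dendrogram of the 18 lakes. Other merging criteria can be tried by changing `method`, provided the starting dissimilarity between groups is computed in a way that matches the chosen linkage.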
Hello Diego,

This might not be relevant, but on reading your question the first idea that struck me was that ordination trajectories of your lakes over time might be more informative than clustering.

Michael

On 5 January 2011 01:31, Diego Pujoni <diegopujoni at gmail.com> wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
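One way to read the ordination suggestion is a PCA of all 216 samples, followed by tracing each lake's position through the four sampling dates. The sketch below assumes PCA via `prcomp()` (the reply does not name an ordination method), uses simulated data, and averages over the three depths; the column names `depth`, `season` and `lake` are hypothetical.

```r
set.seed(1)

# Hypothetical sampling design: 18 lakes x 4 seasons x 3 depths = 216 samples
design <- expand.grid(depth  = 1:3,
                      season = 1:4,
                      lake   = sprintf("Lake%02d", 1:18))

# Simulated stand-in for the 10 measured variables
x <- matrix(rnorm(nrow(design) * 10), ncol = 10)

# PCA on standardized variables
pca <- prcomp(x, center = TRUE, scale. = TRUE)
scores <- data.frame(design, pca$x[, 1:2])

# Mean position of each lake at each season (averaging over depths)
traj <- aggregate(cbind(PC1, PC2) ~ lake + season, data = scores, FUN = mean)
traj <- traj[order(traj$lake, traj$season), ]

# Draw one trajectory per lake through the four seasons
plot(traj$PC1, traj$PC2, type = "n", xlab = "PC1", ylab = "PC2")
for (l in unique(traj$lake)) {
  one <- traj[traj$lake == l, ]
  lines(one$PC1, one$PC2, type = "o")
}
```

Lakes whose trajectories stay close together through the year behave similarly, which is information a single dendrogram of lake means would not show.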
Hi Michael,

I agree with you and I will do this ordination. But I also want to check for spatial correlation in the variables, so I thought that comparing the dendrogram of the environmental variables with the dendrogram of the geographical distances between the lakes would indicate whether similar lakes are close to each other. However, I have only one geographical coordinate for each lake but 12 measurements of the environmental variables. How can I analyse this?

Thank you very much for your attention,

Diego PJ
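One possible way to handle the mismatch (one coordinate but 12 samples per lake) is to reduce the environmental data to one row per lake and then compare the two lake-by-lake distance matrices. The sketch below is an assumption on my part, not something proposed in the thread: it uses lake means, a simple permutation (Mantel-type) test written in base R, and a correlation of cophenetic distances to compare the two dendrograms. The data and the `coords` matrix are simulated, and plain Euclidean distance on coordinates is assumed (real latitude/longitude would need a proper geographic distance).

```r
set.seed(7)

n_lakes <- 18
lake <- factor(rep(sprintf("Lake%02d", 1:n_lakes), each = 12))

# Simulated environmental data and hypothetical lake coordinates
env <- scale(matrix(rnorm(n_lakes * 12 * 10), ncol = 10))
coords <- cbind(x = runif(n_lakes), y = runif(n_lakes))
rownames(coords) <- levels(lake)

# One environmental profile per lake (mean of its 12 samples)
env_mean <- rowsum(env, lake) / 12

d_env <- dist(env_mean)   # environmental distances between lakes
d_geo <- dist(coords)     # geographic distances between lakes

# Observed correlation between the two distance matrices
r_obs <- cor(as.vector(d_env), as.vector(d_geo))

# Permutation test: shuffle the lake labels of one matrix
n_perm <- 999
m_env <- as.matrix(d_env)
r_perm <- replicate(n_perm, {
  p <- sample(n_lakes)
  cor(as.vector(as.dist(m_env[p, p])), as.vector(d_geo))
})
p_value <- (sum(r_perm >= r_obs) + 1) / (n_perm + 1)

# Dendrogram-level comparison via cophenetic distances
hc_env <- hclust(d_env, method = "ward.D2")
hc_geo <- hclust(d_geo, method = "ward.D2")
coph_cor <- cor(as.vector(cophenetic(hc_env)), as.vector(cophenetic(hc_geo)))
```

A positive `r_obs` with a small `p_value` would suggest that nearby lakes tend to be environmentally similar. Averaging discards the within-lake range, so this only addresses the spatial question at the level of lake means.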
Hi Diego,

It sounds like what you want to do is to cluster 18 "observations" (each of which is itself a cluster), and then have each of the 18 tips contain the rest of its own cluster hierarchy.
I don't think this is possible through hclust, BUT it should be relatively easy to program using a dendrogram object.
Would it interest you to write something like this up?

Cheers,
Tal

Contact me: Tal.Galili@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Tue, Jan 4, 2011 at 4:31 PM, Diego Pujoni <diegopujoni@gmail.com> wrote:
> [...]
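The dendrogram-based idea could look roughly like the sketch below: build one dendrogram per lake from its own samples, cluster the lakes, and then replay the lake-level merges with `merge()` on dendrogram objects so that each lake "tip" carries its internal hierarchy. The choices here are my assumptions, not part of the suggestion above: simulated data with only 4 lakes for brevity, average linkage at both levels, lake centroids for the between-lake step, and an `offset` added to the lake-level heights so that every lake-level merge sits above all within-lake merges.

```r
set.seed(3)

n_lakes <- 4    # kept small for the sketch; the thesis has 18
n_samp  <- 12
n_var   <- 10

lake <- factor(rep(sprintf("Lake%02d", 1:n_lakes), each = n_samp))
x <- matrix(rnorm(n_lakes * n_samp * n_var), ncol = n_var)
rownames(x) <- paste(lake, rep(1:n_samp, times = n_lakes), sep = "_")

# Within-lake dendrograms, one per lake, named by lake
inner <- lapply(split(seq_len(nrow(x)), lake), function(i) {
  as.dendrogram(hclust(dist(x[i, ]), method = "average"))
})

# Between-lake clustering on the lake centroids
centroids <- rowsum(x, lake) / n_samp
top <- hclust(dist(centroids), method = "average")

# Shift lake-level heights above the tallest within-lake tree
offset <- max(sapply(inner, attr, "height"))

# Replay the lake-level merges; in top$merge, negative entries are
# single lakes and positive entries are earlier merge steps
nodes <- vector("list", nrow(top$merge))
pick <- function(k) if (k < 0) inner[[top$labels[-k]]] else nodes[[k]]
for (s in seq_len(nrow(top$merge))) {
  nodes[[s]] <- merge(pick(top$merge[s, 1]), pick(top$merge[s, 2]),
                      height = offset + top$height[s])
}

# The last merge step holds the complete nested dendrogram
full <- nodes[[length(nodes)]]
```

`plot(full)` would then show all samples as leaves, with each lake's samples forming an intact subtree below the lake-level merges.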