Jackson Rodrigues
2014-Jun-30 14:03 UTC
[R] How to combine/join/merge etc PCA and Cluster?
Hello everybody,

I would like some help plotting a Principal Components Analysis (PCA) together with clusters. I am handling environmental data from 25 locations spread across 5 different ecosystems. When the locations are grouped into 5 clusters, locations from different ecosystems end up in the same group.

So I want to plot the PCA and the clusters together, in such a way that locations belonging to the same ecosystem are displayed with the same COLOR, and locations grouped in the same cluster have the same SHAPE.

For example: I have 1 site from Savanna and 1 site from Serengeti that are in the same cluster. I would like to plot them with different colors (blue and green, respectively) because of their different ecosystem classification, and with the same shape (triangle) because they are in the same group. So I would have a cluster with 2 triangles, one green and the other blue.

How can I make this combination?

Below are the command lines I am using, as presented by Borcard in Numerical Ecology with R, chapter 4 or 5. With this code it is possible to join PCA and clusters, but without differentiating the colors of each ecosystem.

#PCA
mydata.pca <- rda(mydata)

#Cluster
mydata.w <- hclust(dist(Sqchord.mydata), "ward")
plot(mydata.w, hang=-1)
rect.hclust(mydata.w, 5)

# Cut the dendrogram to yield 5 groups
gr <- cutree(mydata.w, k=5)
grl <- levels(factor(gr))

# Get the site scores, scaling 3
sit.sc1 <- scores(mydata.pca, display="w", scaling=3)

# Plot the sites with cluster symbols and colours (scaling 3)
p <- plot(mydata.pca, display="wa", scaling=3, type="n",
          main="Mydata PCA and clusters")
abline(v=0, lty="dotted")
abline(h=0, lty="dotted")
for (i in 1:length(grl)) {
  points(sit.sc1[gr==i,], pch=(14+i), cex=2.5, col=i+1)
}
text(sit.sc1, row.names(Pre_euro_veg.1.all), cex=0.7, pos=3)

# Add legend interactively
legend(locator(1), paste("Cluster", c(1:length(grl))),
       pch=14+c(1:length(grl)), col=1+c(1:length(grl)), pt.cex=2)

Thank you for any help.
Jackson Rodrigues
This is a long way from being a reproducible example, since it comes with no data. You also don't mention that you are using package vegan. I'll use the dune and dune.env data included in that package:

library(vegan)
data(dune)
data(dune.env)

mydata.pca <- rda(dune)
# "ward" was renamed "ward.D" in R 3.1.0
mydata.w <- hclust(dist(dune), "ward.D")
plot(mydata.w, hang=-1)
rect.hclust(mydata.w, 5)
gr <- cutree(mydata.w, k=5)
# Don't need grl <- levels(factor(gr))
sit.sc1 <- scores(mydata.pca, display="wa", scaling=3)
p <- plot(mydata.pca, display="wa", scaling=3, type="n",
          main="Mydata PCA and clusters")
# Don't need either abline() call since plot() adds these.

# symbol = cluster (5 groups), color = Moisture (4 groups)
# Note, pch=16 and pch=19 look essentially the same, so you don't want pch=15:19
points(sit.sc1, pch=as.numeric(gr), col=as.numeric(dune.env$Moisture))
# or
points(sit.sc1, pch=as.numeric(gr)+20, bg=as.numeric(dune.env$Moisture))
# I'll skip the text labels

The legend will be complicated since there are groups*ecosystems combinations.

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
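On the legend question raised in the reply: one way to avoid listing every groups*ecosystems combination might be to draw two separate legends, one keyed to symbol (cluster) and one keyed to color (ecosystem). The sketch below is only an illustration, not something proposed in the thread. It assumes the same dune and dune.env data from vegan, with Moisture standing in for the ecosystem factor, and the legend positions ("topleft", "bottomleft") are arbitrary choices that may overlap points on other data.

```r
library(vegan)
data(dune)
data(dune.env)

# PCA and Ward clustering, as in the reply
mydata.pca <- rda(dune)
mydata.w <- hclust(dist(dune), "ward.D")
gr <- cutree(mydata.w, k = 5)

# Moisture stands in for the ecosystem factor (an assumption for this sketch)
eco <- dune.env$Moisture

# Site scores and an empty ordination frame
sit.sc1 <- scores(mydata.pca, display = "sites", scaling = 3)
plot(mydata.pca, display = "wa", scaling = 3, type = "n",
     main = "Mydata PCA and clusters")

# symbol = cluster, color = ecosystem
points(sit.sc1, pch = gr, col = as.numeric(eco), cex = 1.5)

# Legend 1: one entry per cluster, symbols only
cl <- sort(unique(gr))
legend("topleft", legend = paste("Cluster", cl), pch = cl,
       col = "black", title = "Cluster", cex = 0.8)

# Legend 2: one entry per ecosystem level, colors only
legend("bottomleft", legend = levels(eco), pch = 16,
       col = seq_len(nlevels(eco)), title = "Moisture", cex = 0.8)
```

With two legends the number of entries is groups + ecosystems rather than groups*ecosystems, at the cost of the reader having to combine the two keys by eye.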