trichter at uni-bremen.de
2015-Sep-27 21:22 UTC
[R] How to find out if two cells in a dataframe belong to the same pre-specified factor-level
Dear list,

I really couldn't find a better way to describe my question, so please bear with me.

To illustrate my problem: I have a matrix of ecological distances (m1) and one of genetic distances (m2) for a number of biological species. I have merged both matrices and want to plot the two distances against each other, as illustrated in this example:

library(reshape)
library(ggplot2)
library(dplyr)

dist1 <- matrix(runif(16),4,4)
dist2 <- matrix(runif(16),4,4)
rownames(dist1) <- colnames(dist1) <- paste0("A",1:4)
rownames(dist2) <- colnames(dist2) <- paste0("A",1:4)

m1 <- melt(dist1)
m2 <- melt(dist2)

final <- full_join(m1,m2, by=c("Var1","Var2"))
ggplot(final, aes(value.x,value.y)) + geom_point()

Here is the twist: the biological species belong to certain groups, which are given in the dataframe `species`, for example:

species <- data.frame(spcs=as.character(paste0("A",1:4)),
                      grps=as.factor(c(rep("cat",2),(rep("dog",2)))))

I want to check whether an x,y pair in final (as in `final$Var1`, `final$Var2`) belongs to the same group of species (here "cat" or "dog"), and then color each group specifically in the x,y scatterplot. Thus, I need an R translation for:

final$group <- If (final$Var1 and final$Var2) belong to the same group as specified in species, then assign the species group here, else do nothing or assign NA

so I can proceed with

ggplot(final, aes(value.x,value.y, col=group)) + geom_point()

So, in the example, the pairs A1-A1, A1-A2, A2-A1, A2-A2 should be identified as "both cats", hence should get the factor "cat".

Thank you very much!

Tim
Adams, Jean
2015-Sep-28 18:15 UTC
[R] How to find out if two cells in a dataframe belong to the same pre-specified factor-level
Here's one approach that works. I made some changes to the code you provided. Full working example code is given below.

library(reshape)
library(ggplot2)
library(dplyr)

dist1 <- matrix(runif(16), 4, 4)
dist2 <- matrix(runif(16), 4, 4)
rownames(dist1) <- colnames(dist1) <- paste0("A", 1:4)
rownames(dist2) <- colnames(dist2) <- paste0("A", 1:4)
m1 <- melt(dist1)
m2 <- melt(dist2)
# I changed the by= argument here: melt() from reshape names the
# index columns X1 and X2 (Var1 and Var2 are the reshape2 names)
final <- full_join(m1, m2, by=c("X1", "X2"))

# I made some changes to keep spcs character and grps factor
species <- data.frame(spcs=paste0("A", 1:4),
                      grps=as.factor(c(rep("cat", 2), (rep("dog", 2)))),
                      stringsAsFactors=FALSE)

# define new variables for final indicating group membership
final$g1 <- species$grps[match(final$X1, species$spcs)]
final$g2 <- species$grps[match(final$X2, species$spcs)]
final$group <- as.factor(with(final, ifelse(g1==g2, as.character(g1), "dif")))

# plot just the rows with matching groups
ggplot(final[final$group!="dif", ], aes(value.x, value.y, col=group)) + geom_point()
# plot all the rows
ggplot(final, aes(value.x, value.y, col=group)) + geom_point()

Jean
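The original question asked for NA, rather than a separate "dif" level, when the two species fall into different groups. A minimal base-R sketch of that variant follows. It assumes the same object names as in the thread, builds the long table by hand so that no packages are needed, and uses a helper called `same_group` that is hypothetical (it is not from any package).

```r
# Sketch only: same data layout as in the thread, built without reshape/dplyr.
set.seed(1)  # assumption: fixed seed, only to make the random distances reproducible
ids <- paste0("A", 1:4)
dist1 <- matrix(runif(16), 4, 4, dimnames = list(ids, ids))
dist2 <- matrix(runif(16), 4, 4, dimnames = list(ids, ids))

# Long format: as.vector() walks a matrix column by column,
# so the row name varies fastest and the column name slowest.
final <- data.frame(
  X1 = rep(ids, times = 4),
  X2 = rep(ids, each = 4),
  value.x = as.vector(dist1),
  value.y = as.vector(dist2),
  stringsAsFactors = FALSE
)

species <- data.frame(
  spcs = ids,
  grps = c("cat", "cat", "dog", "dog"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the group of each member of a pair and
# return the shared group, or NA when the groups differ or are unknown.
same_group <- function(a, b, lookup) {
  ga <- lookup$grps[match(a, lookup$spcs)]
  gb <- lookup$grps[match(b, lookup$spcs)]
  ifelse(ga == gb, ga, NA_character_)
}

final$group <- factor(same_group(final$X1, final$X2, species))
table(final$group, useNA = "ifany")
```

With this variant, `ggplot(final, aes(value.x, value.y, col=group)) + geom_point()` should still plot every row; as far as I know, ggplot2's default discrete colour scale draws the NA points in grey, and `final[!is.na(final$group), ]` keeps only the within-group pairs.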
Tim Richter-Heitmann
2015-Sep-29 09:27 UTC
[R] How to find out if two cells in a dataframe belong to the same pre-specified factor-level
Thank you, that turned out to work very well.

If you like, you can also post your answer here:
http://stackoverflow.com/questions/32809249/how-to-find-out-if-two-cells-in-a-dataframe-belong-to-the-same-pre-specified-fac/

The question wasn't properly answered there, which is why I switched to the R-help list.

-- 
Tim Richter-Heitmann (M.Sc.)
PhD Candidate
International Max-Planck Research School for Marine Microbiology
University of Bremen
Microbial Ecophysiology Group (AG Friedrich)
FB02 - Biologie/Chemie
Leobener Straße (NW2 A2130)
D-28359 Bremen
Tel.: 0049(0)421 218-63062
Fax: 0049(0)421 218-63069