Adrian Johnson
2021-May-26 21:16 UTC
[R] Decompose df1 into another df2 based on values in df1
Hello,

I am trying to convert a df (given below as d1) into df2 (given below as res).

I tried using loops for each row. I cannot get it right. Moreover the df is 250000 x 500 in dimension and I cannot get it to work.

Could anyone help me here please.

Thanks.
Adrian.

    d1 <- structure(list(
        S1 = c("a1|a2", "b1|b3", "w"),
        S2 = c("w", "b1", "c2"),
        S3 = c("a2", "b3|b4|b1", "c1|c4"),
        S4 = c("w", "b4", "c4"),
        S5 = c("a2/a3", "w", "w")),
      class = "data.frame", row.names = c("A", "B", "C"))

    res <- structure(list(
        S1 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L),
        S2 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L),
        S3 = c(0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L),
        S4 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L),
        S5 = c(0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)),
      class = "data.frame",
      row.names = c("a1", "a2", "a3", "b1", "b2", "b3", "b4", "c1", "c2", "c4"))
Bert Gunter
2021-May-27 00:28 UTC
[R] Decompose df1 into another df2 based on values in df1
Thank you for the reprex. However, your specification was too vague for me to know exactly what your data are like, so I assumed the most general possibility, with the consequence that I may be giving you an answer to the wrong question. Hopefully you can adjust as needed to get what you want.

I should also warn you that I am nearly certain there are more elegant, cleverer, faster ways to do this. I just used simple tools, so you may wish to wait a bit to see whether others can improve on my attempt.

First, I assumed the "a2/a3" in S5 of d1 is a typo and should be "a2|a3". If it is not a typo, substitute "\\||\\/" for "\\|" in the strsplit() call in the code that follows.

Second, I assumed that your identifiers, "a1" for example, could occur more than once in a column of your data. If the only possibilities are 0 or 1 times, then the code I provided, in particular the last sapply(), is more complicated than it needs to be. A faster approach in that case might be to use R's outer() function; I leave that as an exercise for you, or for someone else to help you with.

Here is my code for your reprex:

    ## split every entry of a column on "|" and drop the placeholder "w"
    getall <- function(x) {
      ul <- unlist(strsplit(x, "\\|"))
      ul[ul != "w"]
    }
    allvals <- lapply(d1, getall)
    ## all distinct identifiers, sorted; these become the rows of the result
    uneeks <- sort(unique(unlist(allvals)))
    ## count each identifier within each column
    sapply(allvals, function(x) table(factor(x, levels = uneeks)))

which gives (with "a2/a3" in d1 corrected to "a2|a3"):

       S1 S2 S3 S4 S5
    a1  1  0  0  0  0
    a2  1  0  1  0  1
    a3  0  0  0  0  1
    b1  1  1  1  0  0
    b3  1  0  1  0  0
    b4  0  0  1  1  0
    c1  0  0  1  0  0
    c2  0  1  0  0  0
    c4  0  0  1  1  0

Bert
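Bert's reply leaves the 0/1 (presence/absence) case as an exercise and suggests outer(). The following is a minimal sketch of that case, not Bert's own code. It assumes "a2/a3" is intentional, so it splits on both "|" and "/"; the names get_ids, ids_by_col, all_ids, is_present, res01 and via_outer are made up for illustration. It shows a plain %in% test per column first, then the same result through outer(), which needs a function that is vectorised over the expanded (identifier, column) pairs.

```r
# Reprex data from the original post, with "a2/a3" kept as posted.
d1 <- data.frame(
  S1 = c("a1|a2", "b1|b3", "w"),
  S2 = c("w", "b1", "c2"),
  S3 = c("a2", "b3|b4|b1", "c1|c4"),
  S4 = c("w", "b4", "c4"),
  S5 = c("a2/a3", "w", "w"),
  row.names = c("A", "B", "C")
)

# Assumption: both "|" and "/" are separators; "w" is a placeholder to drop.
get_ids <- function(x) {
  ids <- unlist(strsplit(x, "[|/]"))
  unique(ids[ids != "w"])
}

ids_by_col <- lapply(d1, get_ids)
all_ids <- sort(unique(unlist(ids_by_col)))

# Presence/absence per column: 1 if the identifier occurs in the column.
res01 <- sapply(ids_by_col, function(x) as.integer(all_ids %in% x))
rownames(res01) <- all_ids

# The same table via outer(). outer() calls the function once with the
# identifiers and column names expanded to all pairs, so the function
# must return one value per pair; mapply() takes care of that here.
is_present <- function(id, col) {
  mapply(function(i, k) i %in% ids_by_col[[k]], id, col, USE.NAMES = FALSE)
}
via_outer <- outer(all_ids, names(d1), is_present)
via_outer <- matrix(as.integer(via_outer), nrow = length(all_ids),
                    dimnames = list(all_ids, names(d1)))

# Both routes should agree cell by cell.
stopifnot(all(res01 == via_outer))
```

The %in% version does one vectorised lookup per column and is probably the one to try first on a 250000 x 500 data frame; the outer() version is included only because it was mentioned, and the mapply() inside it works pair by pair.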
Eik Vettorazzi
2021-May-27 10:16 UTC
[R] Decompose df1 into another df2 based on values in df1
A tidyverse-ish solution would be

    library(dplyr)
    library(tidyr)
    library(tibble)

    # max cols to split values into
    seps <- max(stringr::str_count(unlist(d1), "[/|]")) + 1

    d1 %>%
      pivot_longer(S1:S5, names_to = "S") %>%
      mutate(value = na_if(value, "w")) %>%
      separate(value, into = LETTERS[1:seps], sep = "[/|]", fill = "right") %>%
      pivot_longer(-S, names_to = NULL, values_to = "rownames") %>%
      filter(!is.na(rownames)) %>%
      mutate(index = 1L) %>%
      pivot_wider(names_from = S, values_from = index) %>%
      # fill absent combinations with 0; the character key column is left alone
      mutate(across(-rownames, ~ replace_na(.x, 0L))) %>%
      column_to_rownames(var = "rownames")

Best,
Eik
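Neither reply above produces the all-zero "b2" row that appears in the posted res, because "b2" never occurs in d1. The following is a rough base R sketch of the same long-then-wide idea that supplies the identifier set explicitly so that such rows are kept. It is an illustration under assumptions, not either poster's method: it treats both "|" and "/" as separators, takes the identifier levels from the row names of res, and the names pieces, long, ids, tab and out are made up here.

```r
# Reprex data and target from the original post.
d1 <- data.frame(
  S1 = c("a1|a2", "b1|b3", "w"),
  S2 = c("w", "b1", "c2"),
  S3 = c("a2", "b3|b4|b1", "c1|c4"),
  S4 = c("w", "b4", "c4"),
  S5 = c("a2/a3", "w", "w"),
  row.names = c("A", "B", "C")
)
res <- data.frame(
  S1 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L),
  S2 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L),
  S3 = c(0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L),
  S4 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L),
  S5 = c(0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  row.names = c("a1", "a2", "a3", "b1", "b2", "b3", "b4", "c1", "c2", "c4")
)

# Long format: one row per (identifier, column) pair.
# Assumption: "|" and "/" are both separators.
pieces <- lapply(d1, function(x) unlist(strsplit(x, "[|/]")))
long <- stack(pieces)                 # columns: values (id), ind (column name)
long <- long[long$values != "w", ]    # drop the placeholder

# Assumption: the full identifier set is known up front; here it is taken
# from the row names of the posted `res`, which include the unused "b2".
ids <- rownames(res)

# Wide format: cross-tabulate identifier by column; unused levels give 0 rows.
tab <- table(factor(long$values, levels = ids), long$ind)
out <- as.data.frame.matrix(tab)

# On the reprex this reproduces the posted target cell by cell.
stopifnot(all(as.matrix(out) == as.matrix(res)))
```

The cross-tabulation returns counts, so an identifier that occurs twice in one column would give 2 rather than 1; wrapping the table in a comparison with 0 would turn it into presence/absence if that is what is wanted.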