Yuan, Keming (CDC/DDNID/NCIPC/DVP)
2019-Mar-25 16:10 UTC
[R] loop through columns in a data frame
Hi All,

I have a data frame with variable names like A_le, A_me, B_le, B_me, C_le, C_me, ... If A_le = 1 or A_me = 1, then I need to create a new column A_new = 1. The same operation creates columns B_new, C_new, ... Does anyone know how to use a loop (or other methods) to create the new columns? In SAS, I can use an array to get it done, but I don't know how to do it in R.

Thanks,
Keming Yuan
CDC
You forgot to provide what your test data looks like. For example, are all the columns a single letter followed by "_" as the name, or are there longer names? Are there always matched pairs ("le" and "me"), or can singles occur?

    library(tidyverse)

    # create some data
    test <- tibble(a_le = sample(3, 10, TRUE),
                   a_me = sample(3, 10, TRUE),
                   b_le = sample(3, 10, TRUE),
                   b_me = sample(3, 10, TRUE),
                   long_le = sample(3, 10, TRUE),
                   long_me = sample(3, 10, TRUE),
                   short_le = sample(3, 10, TRUE)
    )

So get the names of the columns that contain "le" or "me" and group them together for processing:

    col_names <- grep("_(le|me)$", names(test), value = TRUE)
    group <- tibble(id = str_remove(col_names, "_.*"), col = col_names)

    result <- group %>%
      group_by(id) %>%
      do(tibble(x = rowSums(test[, .$col] == 1)))

    # add new columns back
    for (i in split(result, result$id)){
      test[, paste0(i$id[1], "_new")] <- as.integer(i$x > 0)
    }

    test

    a_le a_me b_le b_me long_le long_me short_le a_new b_new long_new
       3    1    2    3       1       2        2     1     0        1
       2    3    3    2       1       1        1     0     0        1
       3    2    3    2       1       3        3     0     0        1
       2    3    1    3       3       1        2     0     1        1
       1    1    2    1       1       2        2     1     1        1
       3    3    3    1       1       1        1     0     1        1
       1    2    1    2       2       2        2     1     1        0
       1    3    2    3       1       1        3     1     0        1
       3    1    1    1       3       3        2     1     1        0
       1    1    1    2       3       3        3     1     1        0

    (all columns <int>; 10 rows, 10 of 11 columns shown)

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
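The same grouping idea can also be sketched in base R without the tidyverse. This is only an illustration under stated assumptions: the small fixed data frame below is hypothetical (not from the thread), a group may contain one or two columns, and rows with no 1 in the group are set to 0, as in the reply above.

```r
# Hypothetical fixed data so the result is reproducible (not from the thread)
test <- data.frame(
  a_le     = c(1, 2, 3),
  a_me     = c(3, 2, 1),
  b_le     = c(2, 2, 1),
  b_me     = c(3, 3, 3),
  short_le = c(1, 3, 2)   # a group with only one column
)

# Columns ending in _le or _me, grouped by the prefix before that suffix
col_names <- grep("_(le|me)$", names(test), value = TRUE)
groups <- split(col_names, sub("_(le|me)$", "", col_names))

# For each prefix, flag rows where any column in the group equals 1.
# drop = FALSE keeps a one-column group as a data frame so rowSums() works.
for (id in names(groups)) {
  hits <- rowSums(test[, groups[[id]], drop = FALSE] == 1)
  test[[paste0(id, "_new")]] <- as.integer(hits > 0)
}

print(test)
```

Note that if the data contain NA values, `rowSums()` returns NA for those rows unless `na.rm = TRUE` is passed; whether that is the desired behavior depends on the data.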
"Does anyone know how to use loop (or other methods) to create new columns? In SAS, I can use array to get it done. But I don't know how to do it in R." Yup. Practically all users of R know how, as this is entirely elementary. You will too if you make the effort to go through a basic R tutorial, of which there are many on the web (and one shipped with R). Cheers, Bert On Mon, Mar 25, 2019 at 10:08 AM Yuan, Keming (CDC/DDNID/NCIPC/DVP) via R-help <r-help at r-project.org> wrote:> Hi All, > > I have a data frame with variable names like A_le, A_me, B_le, B_me, C_le, > C_me.... > if A_le=1 or A_me=1 then I need to create a new column A_new=1. Same > operation to create columns B_new, C_new... > Does anyone know how to use loop (or other methods) to create new columns? > In SAS, I can use array to get it done. But I don't know how to do it in R. > > Thanks, > > Keming Yuan > CDC > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Yuan, Keming (CDC/DDNID/NCIPC/DVP)
2019-Mar-25 18:26 UTC
[R] loop through columns in a data frame
Thank you so much, Jim. That's exactly what I need. Sorry for not providing the data frame. But you created the correct data structure. Thanks again!