Yuan, Keming (CDC/DDNID/NCIPC/DVP)
2019-Mar-25 16:10 UTC
[R] loop through columns in a data frame
Hi All,

I have a data frame with variable names like A_le, A_me, B_le, B_me, C_le, C_me....
If A_le=1 or A_me=1 then I need to create a new column A_new=1. The same operation creates columns B_new, C_new...
Does anyone know how to use a loop (or other methods) to create the new columns? In SAS, I can use an array to get it done, but I don't know how to do it in R.

Thanks,
Keming Yuan
CDC
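For reference, the operation described in the question can be sketched with a plain base R loop, which is roughly what a SAS array would do. The data frame `df`, its values, and the `prefixes` vector are made up for illustration, and setting the new column to 0 when neither source column equals 1 is an assumption, since the question only states the 1 case.

```r
# Hypothetical example data using the column names from the question.
df <- data.frame(
  A_le = c(1, 0, 2),
  A_me = c(0, 0, 1),
  B_le = c(0, 0, 0),
  B_me = c(1, 3, 0)
)

# Prefixes to process; assumed to be known in advance here.
prefixes <- c("A", "B")

for (p in prefixes) {
  le <- df[[paste0(p, "_le")]]
  me <- df[[paste0(p, "_me")]]
  # 1 if either source column equals 1, otherwise 0 (assumed default)
  df[[paste0(p, "_new")]] <- as.integer(le == 1 | me == 1)
}

df
```

Building the column name with `paste0()` and indexing with `[[` is what lets one loop body serve every prefix.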
You forgot to provide what your test data looks like. For example, are all
the columns a single letter followed by "_" as the name, or are there
longer names? Are there always matched pairs ("le" and "me") or can singles
occur?
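One way to answer the matched-pairs question for a given data frame is to count how many columns share each prefix. This is only a sketch: the names in `nms` are hypothetical, and it assumes the prefix is everything before a trailing `_le` or `_me`.

```r
# Hypothetical column names: two matched pairs and one single.
nms <- c("a_le", "a_me", "long_le", "long_me", "short_le")

# Keep only names ending in _le or _me, then strip that suffix.
cols <- grep("_(le|me)$", nms, value = TRUE)
prefix <- sub("_(le|me)$", "", cols)

# Count columns per prefix: 2 means a matched pair, 1 means a single.
counts <- table(prefix)
singles <- names(counts)[counts == 1]
pairs <- names(counts)[counts == 2]

singles
pairs
```

In practice `nms` would be `names()` of the real data frame.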
library(tidyverse)

# create some data
test <- tibble(a_le = sample(3, 10, TRUE),
               a_me = sample(3, 10, TRUE),
               b_le = sample(3, 10, TRUE),
               b_me = sample(3, 10, TRUE),
               long_le = sample(3, 10, TRUE),
               long_me = sample(3, 10, TRUE),
               short_le = sample(3, 10, TRUE)
)
So get the names of the columns that end in "le" or "me" and group them
together for processing
col_names <- grep("_(le|me)$", names(test), value = TRUE)
group <- tibble(id = str_remove(col_names, "_.*"), col = col_names)
result <- group %>%
  group_by(id) %>%
  do(tibble(x = rowSums(test[, .$col] == 1)))

# add new columns back
for (i in split(result, result$id)){
  test[, paste0(i$id[1], "_new")] <- as.integer(i$x > 0)
}
test
   a_le  a_me  b_le  b_me  long_le  long_me  short_le  a_new  b_new  long_new
  <int> <int> <int> <int>    <int>    <int>     <int>  <int>  <int>     <int>
      3     1     2     3        1        2         2      1      0         1
      2     3     3     2        1        1         1      0      0         1
      3     2     3     2        1        3         3      0      0         1
      2     3     1     3        3        1         2      0      1         1
      1     1     2     1        1        2         2      1      1         1
      3     3     3     1        1        1         1      0      1         1
      1     2     1     2        2        2         2      1      1         0
      1     3     2     3        1        1         3      1      0         1
      3     1     1     1        3        3         2      1      1         0
      1     1     1     2        3        3         3      1      1         0
(10 rows; columns 1-10 of 11 shown)
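The same grouping can also be done without the tidyverse. The following is a minimal base R sketch, not the method used above; the small `test` data frame is made up so that the result is deterministic, and the prefix is assumed to be everything before a trailing `_le` or `_me` (so a prefix that itself contains an underscore stays intact).

```r
# Hypothetical data with the same kind of columns as the example above.
test <- data.frame(
  a_le = c(1L, 2L, 3L),
  a_me = c(3L, 2L, 1L),
  long_le = c(2L, 2L, 1L),
  long_me = c(3L, 3L, 3L),
  short_le = c(1L, 3L, 2L)
)

cols <- grep("_(le|me)$", names(test), value = TRUE)
# Group the column names by prefix (everything before the _le / _me suffix).
groups <- split(cols, sub("_(le|me)$", "", cols))

for (id in names(groups)) {
  # drop = FALSE keeps a one-column group as a data frame for rowSums()
  hits <- rowSums(test[, groups[[id]], drop = FALSE] == 1)
  test[[paste0(id, "_new")]] <- as.integer(hits > 0)
}

test
```

Because `rowSums()` counts the 1s across however many columns a group has, a single such as `short_le` needs no special handling.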
Jim Holtman
*Data Munger Guru*
*What is the problem that you are trying to solve? Tell me what you want to
do, not how you want to do it.*
"Does anyone know how to use loop (or other methods) to create new columns? In SAS, I can use array to get it done. But I don't know how to do it in R."

Yup. Practically all users of R know how, as this is entirely elementary. You will too if you make the effort to go through a basic R tutorial, of which there are many on the web (and one shipped with R).

Cheers,
Bert
Yuan, Keming (CDC/DDNID/NCIPC/DVP)
2019-Mar-25 18:26 UTC
[R] loop through columns in a data frame
Thank you so much, Jim. That's exactly what I need. Sorry for not providing the
data frame. But you created the correct data structure. Thanks again!
From: jim holtman <jholtman at gmail.com>
Sent: Monday, March 25, 2019 2:07 PM
To: Yuan, Keming (CDC/DDNID/NCIPC/DVP) <vrm4 at cdc.gov>
Cc: R-help at r-project.org
Subject: Re: [R] loop through columns in a data frame