DIGHE, NILESH [AG/2362]
2017-Jul-20 15:55 UTC
[R] dynamically create columns using a function
Hi,
I am writing a function to dynamically create column names and fill those
columns with some basic calculations. My function "demo_fn" takes an
argument "blup_datacut", and I would like to use the contents of that
argument to dynamically create new columns in my dataset. Please note that
"demo_fn" calls another function, "calc_gg". Both functions are pasted below.
I have a for loop within my function, and it appears to create a new column
only for the last value in "blup_datacut", which makes me think that I am not
storing the values coming out of the for loop correctly. I have pasted the
dataset, the functions, and the expected results below to reproduce my
problem.
Any help will be greatly appreciated.
# dataset
dem <- structure(list(
  id = c("L1", "L2", "L3", "M1", "M2", "M3"),
  TEST_SET_NAME = c("A", "A", "A", "B", "B", "B"),
  YLD_BE_REG1 = c(1467L, 1455L, 1382L, 1463L, 1466L, 1455L),
  YLD_BE_REG2 = c(1501L, 1441L, 1421L, 1482L, 1457L, 1490L),
  IS_GG = c("NO", "NO", "YES", "NO", "NO", "YES")
), .Names = c("id", "TEST_SET_NAME", "YLD_BE_REG1", "YLD_BE_REG2", "IS_GG"),
class = "data.frame", row.names = c(NA, -6L))
# function demo_fn
demo_fn <- function (dat, blup_datacut = c("REG1", "REG2"))
{
  for (i in seq_along(blup_datacut)) {
    col_name_gg <- paste("GG", blup_datacut[i], sep = "_")
    col_mean_gg <- paste("YLD_BE", blup_datacut[i], sep = "_")
    dat2 <- calc_gg(dataset = dat, col = col_mean_gg, col_name = col_name_gg)
  }
  dat2
}
# function calc_gg
library(dplyr)  # provides %>%, group_by(), mutate_() and ungroup()
calc_gg <- function (dataset, col, col_name)
{
  # percent difference from the mean of the IS_GG == "YES" rows in each group
  mutate_call = lazyeval::interp(
    ~round(((a - mean(a[IS_GG == "YES"], na.rm = TRUE)) /
              mean(a[IS_GG == "YES"], na.rm = TRUE)) * 100, 1),
    a = as.name(col))
  dataset %>%
    group_by(TEST_SET_NAME) %>%
    mutate_(.dots = setNames(list(mutate_call), col_name)) %>%
    ungroup()
}
# run function
results_demo<- demo_fn(dat = dem)
# expected results
structure(list(
  id = c("L1", "L2", "L3", "M1", "M2", "M3"),
  TEST_SET_NAME = c("A", "A", "A", "B", "B", "B"),
  YLD_BE_REG1 = c(1467L, 1455L, 1382L, 1463L, 1466L, 1455L),
  YLD_BE_REG2 = c(1501L, 1441L, 1421L, 1482L, 1457L, 1490L),
  IS_GG = c("NO", "NO", "YES", "NO", "NO", "YES"),
  GG_REG1 = c(6.2, 5.3, 0, 0.5, 0.8, 0),
  GG_REG2 = c(5.6, 1.4, 0, -0.5, -2.2, 0)
), .Names = c("id", "TEST_SET_NAME", "YLD_BE_REG1", "YLD_BE_REG2", "IS_GG",
              "GG_REG1", "GG_REG2"),
row.names = c(NA, -6L), class = "data.frame")
Thanks.
Nilesh
This email and any attachments were sent from a Monsanto email account and may
contain confidential and/or privileged information. If you are not the intended
recipient, please contact the sender and delete this email and any attachments
immediately. Any unauthorized use, including disclosing, printing, storing,
copying or distributing this email, is prohibited. All emails and attachments
sent to or from Monsanto email accounts may be subject to monitoring, reading,
and archiving by Monsanto, including its affiliates and subsidiaries, as
permitted by applicable law. Thank you.
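A note on the symptom described in the question: in "demo_fn", every pass of
the loop calls "calc_gg" on the unchanged "dat" and overwrites "dat2", so only
the column from the last pass is kept. The following is a minimal base-R
sketch of that behaviour. The names "df" and "add_col" are hypothetical
stand-ins and are not taken from the post.

```r
# Hypothetical toy data and a step that adds one column
df <- data.frame(x = 1:3)
add_col <- function(d, nm) {
  d[[nm]] <- d$x * 2
  d
}

# Pattern from the question: each pass starts again from the original df,
# so the result of the previous pass is thrown away
for (nm in c("a", "b")) {
  out_wrong <- add_col(df, nm)
}
names(out_wrong)  # "x" "b"

# Accumulating pattern: feed the result of each pass into the next one
out_ok <- df
for (nm in c("a", "b")) {
  out_ok <- add_col(out_ok, nm)
}
names(out_ok)     # "x" "a" "b"
```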
Hi,
I don't know the lazyeval package or exactly what you are trying to do, but
to answer the main question, "How to create columns dynamically using a
function?", I would do something like this:
# dataset
dem <- structure(list(
  id = c("L1", "L2", "L3", "M1", "M2", "M3"),
  TEST_SET_NAME = c("A", "A", "A", "B", "B", "B"),
  YLD_BE_REG1 = c(1467L, 1455L, 1382L, 1463L, 1466L, 1455L),
  YLD_BE_REG2 = c(1501L, 1441L, 1421L, 1482L, 1457L, 1490L),
  IS_GG = c("NO", "NO", "YES", "NO", "NO", "YES")
), .Names = c("id", "TEST_SET_NAME", "YLD_BE_REG1", "YLD_BE_REG2", "IS_GG"),
class = "data.frame", row.names = c(NA, -6L))
# apply f once per name, passing the updated data on to the next pass
demo_fn <- function (data, f, names) {
  for (i in names) {
    data <- f(data, i)
  }
  data
}
# add one result column for the given name
f <- function(data, name) {
  col_work <- paste("YLD_BE", name, sep = "_")
  col_name_result <- paste("GG", name, sep = "_")
  # do something interesting, here I am simply copying the column
  data[col_name_result] <- data[col_work]
  data
}
demo_fn(dem, f, c("REG1", "REG2"))
If you are working with large datasets it might not be the best solution as
my understanding is that this method involves a lot of copying.
Hope it helps,
Elie Canonici Merle
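The loop in "demo_fn" above is a left fold: each call receives the result of
the previous one. As a sketch of an equivalent formulation, base R's Reduce()
can express the same thing without an explicit loop. The names "toy" and
"add_gg" are hypothetical, and the column copy is only a placeholder
calculation, as in the reply.

```r
# Hypothetical toy data with the same column naming scheme as the thread
toy <- data.frame(YLD_BE_REG1 = c(10, 20), YLD_BE_REG2 = c(30, 40))

# One step: add a GG_<name> column derived from YLD_BE_<name>
add_gg <- function(data, name) {
  col_work <- paste("YLD_BE", name, sep = "_")
  col_name_result <- paste("GG", name, sep = "_")
  data[col_name_result] <- data[col_work]  # placeholder: plain copy
  data
}

# Reduce() calls add_gg(toy, "REG1"), then add_gg(<that result>, "REG2");
# the third argument is the initial value of the fold
out <- Reduce(add_gg, c("REG1", "REG2"), toy)
names(out)  # "YLD_BE_REG1" "YLD_BE_REG2" "GG_REG1" "GG_REG2"
```

Whether this reads better than the for loop is a matter of taste; both pass
the updated data frame from one step to the next.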
DIGHE, NILESH [AG/2362]
2017-Jul-21 13:00 UTC
[R] dynamically create columns using a function
Hi Elie,
Thanks for your time and effort. I plugged the calculation I wanted to do into
the code you provided and got exactly what I wanted. Below is the solution to
my original problem.
# dataset
dem <- structure(list(
  id = c("L1", "L2", "L3", "M1", "M2", "M3"),
  TEST_SET_NAME = c("A", "A", "A", "B", "B", "B"),
  YLD_BE_REG1 = c(1467L, 1455L, 1382L, 1463L, 1466L, 1455L),
  YLD_BE_REG2 = c(1501L, 1441L, 1421L, 1482L, 1457L, 1490L),
  IS_GG = c("NO", "NO", "YES", "NO", "NO", "YES")
), .Names = c("id", "TEST_SET_NAME", "YLD_BE_REG1", "YLD_BE_REG2", "IS_GG"),
class = "data.frame", row.names = c(NA, -6L))
# function calc_gg
library(dplyr)  # provides %>%, group_by(), mutate_() and ungroup()
calc_gg <- function (dataset, col, col_name)
{
  # percent difference from the mean of the IS_GG == "YES" rows in each group
  mutate_call = lazyeval::interp(
    ~round(((a - mean(a[IS_GG == "YES"], na.rm = TRUE)) /
              mean(a[IS_GG == "YES"], na.rm = TRUE)) * 100, 1),
    a = as.name(col))
  dataset %>%
    group_by(TEST_SET_NAME) %>%
    mutate_(.dots = setNames(list(mutate_call), col_name)) %>%
    ungroup()
}
# function f
f <- function (dat, blup_datacut)
{
  col_name_gg <- paste("GG", blup_datacut, sep = "_")
  col_mean_gg <- paste("YLD_BE", blup_datacut, sep = "_")
  dat2 <- calc_gg(dataset = dat, col = col_mean_gg, col_name = col_name_gg)
  dat2
}
# function demo_fn
demo_fn <- function (dat, f, blup_datacut)
{
  for (i in blup_datacut) {
    dat <- f(dat, i)
  }
  dat
}
# get expected results by applying functions
demo_fn(dem, f, c("REG1", "REG2"))
Best Regards,
Nilesh
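For readers who would rather not depend on lazyeval or mutate_() (dplyr has
since deprecated the underscore verbs in favour of tidy evaluation), the same
grouped calculation can be sketched in base R. This is only an illustration
under the assumptions of the example data; "calc_gg_base" is a hypothetical
name, not a function from the thread or from any package.

```r
# Same example data as in the thread
dem <- data.frame(
  id = c("L1", "L2", "L3", "M1", "M2", "M3"),
  TEST_SET_NAME = c("A", "A", "A", "B", "B", "B"),
  YLD_BE_REG1 = c(1467L, 1455L, 1382L, 1463L, 1466L, 1455L),
  YLD_BE_REG2 = c(1501L, 1441L, 1421L, 1482L, 1457L, 1490L),
  IS_GG = c("NO", "NO", "YES", "NO", "NO", "YES"),
  stringsAsFactors = FALSE
)

# Hypothetical base-R counterpart of calc_gg()
calc_gg_base <- function(dataset, col, col_name) {
  out <- numeric(nrow(dataset))
  # loop over the row indices of each TEST_SET_NAME group
  for (rows in split(seq_len(nrow(dataset)), dataset$TEST_SET_NAME)) {
    a <- dataset[[col]][rows]
    # mean of the IS_GG == "YES" rows within this group
    gg_mean <- mean(a[dataset$IS_GG[rows] == "YES"], na.rm = TRUE)
    out[rows] <- round((a - gg_mean) / gg_mean * 100, 1)
  }
  dataset[[col_name]] <- out
  dataset
}

# Accumulate one GG_* column per region, as demo_fn() does
res <- dem
for (reg in c("REG1", "REG2")) {
  res <- calc_gg_base(res,
                      col = paste("YLD_BE", reg, sep = "_"),
                      col_name = paste("GG", reg, sep = "_"))
}
res[, c("id", "GG_REG1", "GG_REG2")]
```

On the example data the GG_REG1 and GG_REG2 columns should agree with the
expected results posted in the original question.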
From: Elie Canonici Merle [mailto:elie.canonicimerle at gmail.com]
Sent: Friday, July 21, 2017 3:44 AM
To: DIGHE, NILESH [AG/2362] <nilesh.dighe at monsanto.com>
Cc: r-help at r-project.org
Subject: Re: [R] dynamically create columns using a function