Dear R users,
My data frame has four groups (variable "groups"): A1, B2, C3, and D4.
Each group has 12 rows (variable "plotno"). I would like to randomly
sample one "plotno" within each level of "groups", label it "CONTROL",
and label the others "TEST" in a new variable called "entry". I am
trying to do this by looping over the group variable and sampling
"plotno" within each group. I end up with four "CONTROL" plots, but they
are drawn by sampling over all the groups rather than within each group.
I need exactly one random "plotno" assigned as "CONTROL" per group
(A1, B2, C3, D4). I would appreciate any help in modifying my function
"funa", or suggestions for a better way to do this task. The dataset and
function I am working with are below.
# dataset (df)
structure(list(plotno = 1:48, groups = c(
"A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1",
"B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2",
"C3", "C3", "C3", "C3", "C3", "C3", "C3", "C3", "C3", "C3", "C3", "C3",
"D4", "D4", "D4", "D4", "D4", "D4", "D4", "D4", "D4", "D4", "D4", "D4"
)), .Names = c("plotno", "groups"), row.names = c(NA, -48L),
class = "data.frame")
# function (funa)
function (dataset)
{
    set.seed(1)
    bay <- unique(dataset$groups)
    IND <- c()
    df2 <- dataset
    for (i in bay) {
        IND[i] <- which(plotno %in% sample(plotno, 1))
        df2$entry <- ifelse(df2$plotno %in% IND, "CONTROL", "TEST")
    }
    df2
}
# session info
R version 3.2.1 (2015-06-18)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.2.1
Thanks.
Nilesh
Nilesh Dighe
(806)-252-7492 (Cell)
(806)-741-2019 (Office)
This e-mail message may contain privileged and/or confidential information, and
is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please
notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of
this e-mail by you is strictly prohibited.
All e-mails and attachments sent and received are subject to monitoring, reading
and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking
for the presence of "Viruses" or other "Malware".
Monsanto, along with its subsidiaries, accepts no liability for any damage
caused by any such code transmitted by or accompanying
this e-mail or any attachment.
The information contained in this email may be subject to the export control
laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and
sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC). As a recipient of this
information you are obligated to comply with all
applicable U.S. export laws and regulations.
I would change strategies.
Create a new variable, say,
num.in.grp <- rep(1:12, 4)
Then sample from 1:12, and add appropriate amounts so that they become row
numbers within the four sets of 12 rows
ctrls <- sample(1:12, 4, replace=TRUE) + c(0,12,24,36)
Now that we have four random row numbers, assign entry appropriately
entry <- rep('TEST', 48)
entry[ctrls] <- 'CONTROL'
The above is not tested, and makes several assumptions, particularly that
the data frame is sorted by groups and that there are four groups of 12
each. Thus it does not generalize, not without some work.
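A sketch of how the same idea might be generalized, assuming only that the data frame has a "groups" column. It does not need the rows to be sorted or the groups to be the same size. The data frame is rebuilt here so the snippet stands alone, and the names rows_by_group and ctrl_rows are made up for the illustration.

```r
# Rebuild the example data: four groups of 12 plots each.
df <- data.frame(plotno = 1:48,
                 groups = rep(c("A1", "B2", "C3", "D4"), each = 12),
                 stringsAsFactors = FALSE)

set.seed(1)  # only so the illustration is reproducible

# Split the row numbers (not plotno) by group.
rows_by_group <- split(seq_len(nrow(df)), df$groups)

# Pick one row number per group. Indexing with sample.int() avoids the
# surprise that sample(x, 1) draws from 1:x when x is a single number.
ctrl_rows <- vapply(rows_by_group,
                    function(idx) idx[sample.int(length(idx), 1)],
                    integer(1))

# Everything is TEST except the one chosen row in each group.
df$entry <- "TEST"
df$entry[ctrl_rows] <- "CONTROL"

# Quick check: each group should show exactly one CONTROL.
table(df$groups, df$entry)
```

Because the sampling is done on row numbers within each group, the offsets c(0,12,24,36) are no longer needed.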
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 3/17/16, 10:18 AM, "R-help on behalf of DIGHE, NILESH [AG/2362]"
<r-help-bounces at r-project.org on behalf of nilesh.dighe at
monsanto.com>
wrote:
> [...]
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
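Hi,
you can try
df1 <- split(df, df$groups)
lapply(df1, function(x)
{
  # placeholder column; it becomes character once the labels are assigned
  x <- cbind(x, entry = 0)
  # draw one plot number at random within this group
  sam <- sample(x$plotno, 1)
  x$entry[which(x$plotno == sam)] <- "CONTROL"
  x$entry[which(x$plotno != sam)] <- "TEST"
  x
}
)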
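The lapply() call above returns a list with one data frame per group. If a single data frame is wanted, one possible way to put the pieces back together is unsplit(), sketched below. The data frame is rebuilt so the snippet stands alone; pieces and df2 are illustrative names, and ifelse() is used here in place of the two assignments.

```r
# Rebuild the example data: four groups of 12 plots each.
df <- data.frame(plotno = 1:48,
                 groups = rep(c("A1", "B2", "C3", "D4"), each = 12),
                 stringsAsFactors = FALSE)

set.seed(1)  # only so the illustration is reproducible

# Label one randomly chosen plot per group, as in the reply above.
pieces <- lapply(split(df, df$groups), function(x) {
  sam <- sample(x$plotno, 1)  # fine here: every group has 12 plots
  x$entry <- ifelse(x$plotno == sam, "CONTROL", "TEST")
  x
})

# unsplit() reverses split(), putting rows back in their original order.
df2 <- unsplit(pieces, df$groups)

# Quick check: each group should show exactly one CONTROL.
table(df2$groups, df2$entry)
```

do.call(rbind, pieces) would also work, but it orders the rows by group and rewrites the row names.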
Tanvir Ahamed
Göteborg, Sweden | mashranga at yahoo.com
________________________________
From: "DIGHE, NILESH [AG/2362]" <nilesh.dighe at monsanto.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, 17 March 2016, 18:18
Subject: [R] sample within a loop
[...]
Tanvir & Don: Thanks a lot for your solutions. Both solutions work great.
I really appreciate your help.
Regards,
Nilesh
-----Original Message-----
From: Mohammad Tanvir Ahamed [mailto:mashranga at yahoo.com]
Sent: Thursday, March 17, 2016 1:24 PM
To: DIGHE, NILESH [AG/2362]; r-help at r-project.org
Subject: Re: [R] sample within a loop
[...]