AbouEl-Makarim Aboueissa
2021-Sep-03 01:30 UTC
[R] Splitting a data column randomly into 3 groups
Dear All:

How can I split a data column *randomly* into three groups? Please see the attached data. I need to split column #2, titled "Data".

with many thanks
abou
______________________
*AbouEl-Makarim Aboueissa, PhD*
*Professor, Statistics and Data Science*
*Graduate Coordinator*
*Department of Mathematics and Statistics*
*University of Southern Maine*

Attachment: data_example.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20210902/064b4219/attachment.txt>
What is stopping you, Abou?

Some of us here start wondering if we have better things to do than homework for others. Help is supposed to come after people try and run into issues that we may help with.

So think about your problem. You supplied data in a file that is NOT in CSV format but in tab-separated format. You need to get it into your program and store it in something.

It looks like you have 204 items, so 1/3 of those would be exactly 68. If your data is in an object like a vector or data.frame, you want to choose random numbers between 1 and 204, without repeats. How do you do that? You need 1/3 of the length of the object, in your case 68 of them.

Now extract the items with those indices into, say, A1. Extract all the rest into a temporary item. Draw another 68 random indices from that remainder, so there is no overlap with the first set, and copy those items into A2. The ones that were not picked go into A3, and you are sort of done, other than some cleanup or whatever.

There are many ways to do the above, and I am sure packages too. But since you have made no visible effort, I personally am not going to pick anything in particular. Had you shown some text and code along the lines of the above and just wanted to know how to copy just the ones that were not selected, we could easily ...
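The steps described above could be sketched roughly as follows. This is only one possible reading, not code from the thread: the data frame `dat`, its `ID` column, the seed, and the simulated values are all assumptions standing in for the attached file, and `read.delim()` is merely one way the tab-separated attachment might be loaded.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable

# Stand-in for the real data; these values are made up for illustration.
# dat <- read.delim("data_example.txt")  # assumed way to load the attachment
dat <- data.frame(ID = 1:204, Data = round(rnorm(204, mean = 50, sd = 10), 1))

n <- nrow(dat)   # 204 rows
k <- n %/% 3     # 68 items per group

# Step 1: draw 68 distinct random positions for the first group.
idx1 <- sample(seq_len(n), k)

# Step 2: everything else goes into a temporary pool.
rest <- setdiff(seq_len(n), idx1)

# Step 3: draw 68 more positions from the pool, so there is no overlap.
idx2 <- rest[sample.int(length(rest), k)]

# Step 4: whatever was never picked forms the third group.
idx3 <- setdiff(rest, idx2)

A1 <- dat$Data[idx1]
A2 <- dat$Data[idx2]
A3 <- dat$Data[idx3]
```

Indexing the pool with `sample.int(length(rest), k)` rather than calling `sample(rest, k)` sidesteps the special case where `sample()` treats a single number as `1:x`.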
Hi Abou,

One way is to shuffle the original data frame using sample() and split up the result into three equal parts. I was going to provide example code, but Avi's response popped up and I kind of agree with him.

Jim
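For readers following along, the shuffle-and-split idea might look something like the sketch below. It is not code from the thread: `dat`, its `ID` column, and the random values are invented placeholders, and it assumes the number of rows is a multiple of 3, as with the 204 rows mentioned earlier.

```r
set.seed(1)  # arbitrary seed for repeatability

# Placeholder data frame; the real data would come from the attached file.
dat <- data.frame(ID = 1:204, Data = runif(204))

# sample(n) with a single integer n returns a random permutation of 1..n,
# so this reorders the rows at random.
shuffled <- dat[sample(nrow(dat)), ]

# Label the first third 1, the second third 2, the last third 3.
grp <- rep(1:3, each = nrow(dat) / 3)

# split() returns a list of three data frames, one per label.
parts <- split(shuffled, grp)

sapply(parts, nrow)
```

Because the rows are shuffled before the labels are attached, each row is equally likely to land in any of the three parts, and whole rows stay together rather than just the "Data" column.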
Your question is ambiguous. One reading is

    n <- length(table$Data)
    m <- n %/% 3
    s <- sample(1:n, n)
    X <- table$Data[s[1:m]]
    Y <- table$Data[s[(m+1):(2*m)]]
    Z <- table$Data[s[(m*2+1):(3*m)]]
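Since this reading uses `n %/% 3`, up to two values would be left out of X, Y, and Z whenever `n` is not a multiple of 3. One alternative, offered only as a sketch under a different reading of the question, is to shuffle a vector of group labels and let `split()` do the rest. The vector `x` and its length of 205 are hypothetical, chosen to show the uneven case.

```r
set.seed(7)  # arbitrary seed for repeatability

# Hypothetical data whose length (205) is not divisible by 3.
x <- seq_len(205)

# Build labels 1, 2, 3, 1, 2, 3, ... of the same length as x,
# then shuffle them so group membership is random.
labels <- sample(rep(1:3, length.out = length(x)))

# Every element of x ends up in exactly one group.
groups <- split(x, labels)

lengths(groups)  # group sizes: 69, 68, 68
```

Here the group sizes differ by at most one and no value is dropped; which group receives the extra item is fixed by the label pattern (group 1), while which values end up in it is random.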