Hi,

I need help sub-sampling my data in R. I have two tables: one with a global dataset and another with a list of codes for sub-sampling the global dataset.

Download link:
https://www.wetransfer.com/downloads/a53aa3f722b2d2887b60b09239f7c4e620151018152100/ad852e

Both tables have a column called alloc_key. I want to use the list of codes in reg6id_subsample_key.csv to extract all rows from reg6idGlobal.csv with matching/similar codes in the alloc_key column of reg6idGlobal.csv.

Does anybody have an idea how I can go about doing this?

Thanks for your help,
John
Dear John,

I suspect ?merge will help here. You would need to tidy up afterwards (or before) if you do not want the other columns in your data frame of keys to be included in the result.

On 19/10/2015 11:19, John Wasige wrote:
> Hi, I need help on how to sub-sample my data in R. [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Michael
http://www.dewey.myzen.co.uk/home.html
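A minimal sketch of the merge approach, using small made-up data frames in place of the two CSV files. Only the alloc_key column name comes from the thread; the values and the `value` and `note` columns are assumptions for illustration. Reducing the key table to its alloc_key column before merging is one way to do the tidying up mentioned in the reply.

```r
# Toy stand-ins for reg6idGlobal.csv and reg6id_subsample_key.csv (made-up values)
reg6idGlobal <- data.frame(
  alloc_key = c("A1", "B2", "C3", "D4"),
  value     = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
reg6id_subsample_key <- data.frame(
  alloc_key = c("B2", "D4"),
  note      = c("x", "y"),
  stringsAsFactors = FALSE
)

# Keep only the key column (deduplicated) so that no extra columns
# or repeated rows end up in the result
keys <- unique(reg6id_subsample_key["alloc_key"])

# Inner join on alloc_key: only rows of the global table with a matching key remain
my_subset <- merge(reg6idGlobal, keys, by = "alloc_key")
print(my_subset)
```

Note that merge sorts the result by the key column by default, so the row order may differ from that of the global table.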
If all you are doing is matching a single column and then extracting the rows that match, then the '%in%' operator should work:

    indx <- reg6idGlobal$alloc_key %in% reg6id_subsample_key$alloc_key
    my_subset <- reg6idGlobal[indx, ]

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Oct 19, 2015 at 6:19 AM, John Wasige <johnwasige at gmail.com> wrote:
> Hi, I need help on how to sub-sample my data in R. [...]
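For illustration, here is the '%in%' idea run on made-up data. The values and the `value` column are assumptions; in practice the two data frames would presumably be loaded from the CSV files with read.csv().

```r
# Made-up stand-ins for the two tables; in practice they would presumably be
# loaded with read.csv("reg6idGlobal.csv") and read.csv("reg6id_subsample_key.csv")
reg6idGlobal <- data.frame(
  alloc_key = c(101, 102, 103, 104, 102),
  value     = c(1.5, 2.5, 3.5, 4.5, 5.5)
)
reg6id_subsample_key <- data.frame(alloc_key = c(102, 104, 104))

# TRUE for every row of the global table whose key appears in the subsample list
indx <- reg6idGlobal$alloc_key %in% reg6id_subsample_key$alloc_key

# Logical row indexing keeps the matching rows in their original order
my_subset <- reg6idGlobal[indx, ]
print(my_subset)
```

Unlike merge, a code that is repeated in the key list (104 in this toy example) does not produce repeated rows in the result, and the rows keep the order they had in the global table.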