Ranjeet Kumar Jha
2022-Jul-25 13:02 UTC
[R] Need to insert various rows of data from a data frame after particular rows from another dataframe
Hello Everyone, I have a dataset in a particular format in the file "dacnet_yield_update till 2019.xlsx". I need to insert rows of data for 2018-2019 and 2019-2020 for those districts whose data are available in "Kharif crops yield_18-19.xlsx". For every district, if data are available in the latter Excel file, these two rows should go just after that district's data for the particular crop group. I have put the data files at the given link: https://drive.google.com/drive/u/0/folders/1dNmGTI8_c9PK1QqmfIjnpbyzuiCXgxFC Please help me solve this problem. Regards and thanks, Ranjeet
CALUM POLWART
2022-Jul-27 06:22 UTC
[R] Need to insert various rows of data from a data frame after particular rows from another dataframe
Not very clear what you are trying to do, but I'd have thought dplyr's left_join() might be a solution for you. The base R equivalent is merge(). rbind() or cbind() might do it too.
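A minimal sketch of the merge() route mentioned in this reply, using made-up data. The column names (district, crop, yield_2017, yield_2018) and values are assumptions for illustration and are not taken from the poster's files.

```r
# Hypothetical toy data; names and values are illustrative only.
old <- data.frame(
  district   = c("A", "A", "B"),
  crop       = c("rice", "maize", "rice"),
  yield_2017 = c(2.1, 1.4, 2.5),
  stringsAsFactors = FALSE
)
new <- data.frame(
  district   = c("A", "B"),
  crop       = c("rice", "rice"),
  yield_2018 = c(2.3, 2.6),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of `old` (a left join).
# Rows of `old` with no match in `new` get NA in the new column.
joined <- merge(old, new, by = c("district", "crop"), all.x = TRUE)
print(joined)
```

Note that a join like this adds columns next to matching rows; it does not add new rows for new years. Whether that suits the question depends on how the two files are laid out.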
PIKAL Petr
2022-Jul-27 06:30 UTC
[R] Need to insert various rows of data from a data frame after particular rows from another dataframe
Hi. From what you say, plain rbind() could be used, if the columns in both sets are the same and in the same order. After that you can reorder the resulting data frame as you wish with order(). AFAIK, for most functions the row order in a data frame does not matter. Cheers, Petr
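The rbind() plus order() approach described here can be sketched as follows. The data are invented, and the long layout (one row per district and year) is an assumption about how the combined table would look.

```r
# Hypothetical toy data in long format: one row per district and year.
old <- data.frame(
  district = c("A", "A", "B", "B"),
  year     = c(2016, 2017, 2016, 2017),
  yield    = c(2.0, 2.1, 2.4, 2.5),
  stringsAsFactors = FALSE
)
new <- data.frame(
  district = c("A", "B"),
  year     = c(2018, 2018),
  yield    = c(2.3, 2.6),
  stringsAsFactors = FALSE
)

# Stack the two data frames, then sort so that each district's
# new rows follow its existing rows.
combined <- rbind(old, new)
combined <- combined[order(combined$district, combined$year), ]
rownames(combined) <- NULL
print(combined)
```

Sorting after stacking is what places the 2018 rows directly after each district's earlier years, so no manual row insertion is needed.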
Ebert,Timothy Aaron
2022-Jul-27 12:34 UTC
[R] Need to insert various rows of data from a data frame after particular rows from another dataframe
I think what you want is full_join() from the dplyr package. https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/join

The only requirement is that both data frames have a column in common in which the data are entered the same way. So the column labeled "state" needs to have "ANANTAPUR" in both data frames, rather than "ANANTAPUR" in one data frame and "Anantapur" in the other.

Reformat your Excel spreadsheet to remove the headers. Your first column should become three columns (State, then Crop, then District) rather than headings. The first row can contain variable names. I would make variable names simple (like "Area" and "Production"), but some like more information, so "Area_Ha" and "Production_TN_per_Ha" would also work. It is best not to use special symbols in variable names, and to keep each variable name as one string of characters (no spaces). Including such will eventually cause problems. https://www.w3schools.com/r/r_variables.asp#:~:text=Variable%20Names&text=Rules%20for%20R%20variables%20are,be%20followed%20by%20a%20digit.

One exception to the rules in the link is that you can name a variable T or F. R defaults to interpreting these as TRUE and FALSE. The problem arises when the programmer reassigns them and then tries to use T or F as Boolean elsewhere in the program.

Tim
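To illustrate the point about key columns having to match exactly, here is a small sketch with invented data. It uses base R merge(..., all = TRUE), which keeps unmatched rows from both sides much as a full join does, so that the example runs without dplyr. The column names and values are assumptions.

```r
# Hypothetical toy data: the same district is spelled differently in each table.
a <- data.frame(
  district   = c("ANANTAPUR", "CHITTOOR"),
  yield_2017 = c(1.2, 1.5),
  stringsAsFactors = FALSE
)
b <- data.frame(
  district   = c("Anantapur", "Kurnool"),
  yield_2018 = c(1.3, 1.1),
  stringsAsFactors = FALSE
)

# Without normalising, "ANANTAPUR" and "Anantapur" are treated as different keys.
naive <- merge(a, b, by = "district", all = TRUE)

# Normalise case and surrounding whitespace before joining.
b$district <- toupper(trimws(b$district))
full <- merge(a, b, by = "district", all = TRUE)
print(full)
```

Here `naive` ends up with four rows and `full` with three, because the two spellings of the same district only line up after normalisation.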
Richard O'Keefe
2022-Jul-28 06:41 UTC
[R] Need to insert various rows of data from a data frame after particular rows from another dataframe
I'm retired, and I had an hour on my hands while tea cooked and my granddaughter did her homework, and I just *love* showing off how helpful I am.

Good news: someone finally looked at your data. (That would be me.)
Bad news: it's going to be a lot of work to do what you want to, and YOU SHOULDN'T EVEN TRY because it won't make any sense, as we would all have known at once had you been clear about the structure of your data in the beginning.

You have two files. dacnet_yield_update till 2019.csv is a straightforward "data" file with structure

  crop           factor(arhar,bajra,cotton,gram,maize,moong,mustard,
                        potato,rice,soyabean,sugarcane,urad,wheat)
  season         factor(kharif,rabi),
  state.id       integer(1201..1235),
  state.name     factor, # 34 states
  district.id    integer(15001..15648),
  district.name  factor,
  year           integer(1998..2017),
  yield          decimal(0.001 .. 314.736, precision=3)

The one problem is that the file name is misleading. It says "till 2019", but includes no data for 2019 or 2018.

Ah, but the other file! That's not a "data" file intended for machine use at all. It's a "display" file intended for human beings to look at and go "wow, gosh, lookit them numbahs". It's the kind of thing that gets included as an appendix in an official report which seems as if perversely designed to impede the development of insight as much as possible.

Amongst other difficulties:
- the same column contains state names, district names, crop names, and assorted junk;
- state names are not coded the same way in the two files;
- district names are in UPPERCASE in the .xls files and have numbers prefixed to them for no apparent reason;
- crop names are not coded the same way in the two files;
- yields are not coded the same way in the two files (3 digit precision in one, 2 digit precision in the other) and I have some doubt as to whether they are measuring the same thing;
- above all, years appear to be CALENDAR years in the .csv file (e.g., 2017) but FINANCIAL years in the .xls file (e.g., 2018-19).

Now I could wrangle the .xls file into something closer to the .csv file easily enough. I'd do it by converting the .xls to .csv, then writing a script in AWK. *BUT* my uncertainty that "yield" means the same thing in the two files and my certainty that "year" does NOT mean the same thing make it unrewarding to do so.

The .xls file is the end product of some process that derived it from data better structured for computation. It seems like a better use of your time to go and look for the original data.

It also seems like a good use of your time to make certain you know what the fields of the .csv file actually mean. ARE those calendar-year yields, or just part of a financial year? Why are only the yields of interest and not the area planted? How are the yields computed?
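For readers who still want to reshape labels of the kind described in this message, here is a rough sketch in R rather than the AWK script the reply mentions. The prefix format ("1. ANANTAPUR"), the column names, and the values are all hypothetical, since the thread does not show the actual cell contents. The sketch only tidies labels; it does nothing to settle whether financial years and calendar years, or the two yield measures, are actually comparable.

```r
# Hypothetical rows resembling a "display" style sheet; formats are assumed.
raw <- data.frame(
  district = c("1. ANANTAPUR", "2. CHITTOOR"),
  year     = c("2018-19", "2019-20"),
  yield    = c(1.23, 1.45),
  stringsAsFactors = FALSE
)

# Drop a leading number and any punctuation or spaces that follow it,
# e.g. "1. ANANTAPUR" becomes "ANANTAPUR".
raw$district <- sub("^[0-9]+[[:punct:][:space:]]*", "", raw$district)

# Keep the first four digits of a financial-year label such as "2018-19".
# This records the starting year only; it does not turn it into a calendar year.
raw$start_year <- as.integer(substr(raw$year, 1, 4))
print(raw)
```

The original financial-year label is kept in its own column so that the distinction raised above is not lost in the cleaned table.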