Dear friends,

I have one dataframe which contains 378 observations, and another one containing 362 observations.

Both dataframes have two columns: a date column and a column with the number of transits.

I want to write code that fills in the dates that are missing from one of the dataframes and sets the transits value for those dates to NA.

I have tried several things, but R complains that the dataframes have different lengths.

How can I solve this?

Any guidance will be greatly appreciated,

Best regards,

Paul
Make some small dataframes of just a few rows that illustrate the problem structure. Make a third that has the result you want. You will get an answer very quickly. Without a self-contained reproducible problem, results vary.

Mark
R. Mark Sharp, Ph.D.
msharp at TxBiomed.org
Dear friend Mark,

Great suggestion! Thank you for replying.

I have two dataframes, dataframe1 and dataframe2.

dataframe1 has two columns, one with the dates in YYYY-MM-DD format and the other with the number of transits (all of which are set to NA). dataframe1 starts on 1985-10-01 (October 1, 1985) and ends on 2017-03-01 (March 1, 2017).

dataframe2 has the same two columns, one with the dates in YYYY-MM-DD format and the other with the number of transits. dataframe2 has the same start and end dates; however, it is missing some dates in between, so it has fewer observations.

dataframe1 has a total of 378 observations and dataframe2 has a total of 362 observations.

I would like to write code that does the following: get the dates of dataframe1 that are missing in dataframe2 and add them as records to dataframe2, but with NA values.

dataframe1:

      Date        Transits
      1985-10-01  NA
      1985-11-01  NA
      1985-12-01  NA
      1986-01-01  NA
      1986-02-01  NA
      2017-03-01  NA

dataframe2:

      Date        Transits
      1985-10-01  15
      1986-01-01  20
      1986-02-01  5

I would like to fill in the missing dates in dataframe2, with NA as the value for the missing transits, so that I end up with a dataframe3 looking as follows:

dataframe3:

      Date        Transits
      1985-10-01  15
      1985-11-01  NA
      1985-12-01  NA
      1986-01-01  20
      1986-02-01  5
      2017-03-01  NA

This is what I want to accomplish.

Thanks in advance for your help,

Best regards,

Paul
You could use merge() or %in%.

Best,
Ulrik
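The merge() and %in% suggestions can be sketched on the small example from the thread. This is only an illustration, not code posted by any of the participants: the column names Date and Transits and the six example rows are taken from Paul's message, while the object names missing_dates and dataframe3b are made up here. It also assumes the Date columns are of class Date (or at least of the same type in both dataframes).

```r
# Small stand-ins for the two dataframes described in the thread.
dataframe1 <- data.frame(
  Date = as.Date(c("1985-10-01", "1985-11-01", "1985-12-01",
                   "1986-01-01", "1986-02-01", "2017-03-01")),
  Transits = NA_integer_
)
dataframe2 <- data.frame(
  Date = as.Date(c("1985-10-01", "1986-01-01", "1986-02-01")),
  Transits = c(15L, 20L, 5L)
)

# Option 1: merge(). Keep every date from dataframe1 (all.x = TRUE) and
# take Transits from dataframe2; dates without a match get NA.
dataframe3 <- merge(dataframe1["Date"], dataframe2, by = "Date", all.x = TRUE)

# Option 2: %in%. Find the dates that dataframe2 lacks, append them with
# NA transits, then sort by date.
missing_dates <- dataframe1$Date[!(dataframe1$Date %in% dataframe2$Date)]
dataframe3b <- rbind(
  dataframe2,
  data.frame(Date = missing_dates,
             Transits = rep(NA_integer_, length(missing_dates)))
)
dataframe3b <- dataframe3b[order(dataframe3b$Date), ]
rownames(dataframe3b) <- NULL

print(dataframe3)
```

Neither option requires the two dataframes to have the same number of rows, which is what the loop-based attempts ran into. On the real data the result should have 378 rows, provided every date in dataframe2 also occurs in dataframe1 and no date is duplicated.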
Anthoni, Peter (IMK)
2017-Mar-28 05:42 UTC
[R] Looping Through DataFrames with Differing Lenghts
Hi Paul,

match might help, but without a real data sample it is hard to check whether the following will work.

    mm = match(df.col378[,"Date"], df.col362[,"Date"])
    # mm will have NAs where there is no matching date in df.col362
    # and the index of the match where the two dates match
    new.df = cbind(df.col378, "transits.col362" = df.col362[mm,"transits"])

cheers
Peter
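Applied to the example Paul posted, the match() idea might look like the sketch below. It is a hedged illustration, not Peter's code: it uses the column names Date and Transits from Paul's message (R is case-sensitive, so "transits" would not find a column named Transits), it assumes the dates are monthly and stored as class Date, and it builds the full calendar with seq() rather than typing it out. Only the three dataframe2 rows shown in the thread are used.

```r
# Full monthly calendar from 1985-10-01 to 2017-03-01 (378 months),
# standing in for dataframe1.
dataframe1 <- data.frame(
  Date = seq(as.Date("1985-10-01"), as.Date("2017-03-01"), by = "month"),
  Transits = NA_integer_
)

# The three dataframe2 rows given in the thread.
dataframe2 <- data.frame(
  Date = as.Date(c("1985-10-01", "1986-01-01", "1986-02-01")),
  Transits = c(15L, 20L, 5L)
)

# For each date in dataframe1, the row index of the same date in
# dataframe2, or NA when dataframe2 has no such date.
mm <- match(dataframe1$Date, dataframe2$Date)

# Indexing with NA yields NA, so missing dates get NA transits.
dataframe3 <- data.frame(
  Date = dataframe1$Date,
  Transits = dataframe2$Transits[mm]
)

head(dataframe3)
```

As a sanity check, sum(is.na(mm)) counts the dates that had to be filled in; on the real data that should be 378 - 362 = 16 if every date in dataframe2 appears exactly once in dataframe1.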