Thanks, I meant if there are missing data at the beginning and end of a
dataframe, how to interpolate according to available data?

For example, the A column has missing values at the beginning and end, how
to interpolate linearly between 10 and 12 for the missing values?

df <- data.frame(A=c(NA, NA,10,11,12, NA),B=c(5,5,4,3,4,5),C=c(3.3,4,3,1.5,2.2,4),time=as.Date(c("1990-01-01","1990-02-07","1990-02-14","1990-02-28","1990-03-01","1990-03-20")))

On Thu, Jul 21, 2016 at 4:48 PM, William Dunlap <wdunlap at tibco.com> wrote:

> Try approx(), as in:
>
> df <- data.frame(A=c(10,11,12),B=c(5,5,4),C=c(3.3,4,3),time=as.Date(c("1990-01-01","1990-02-07","1990-02-14")))
> with(df, approx(x=time, y=C, xout=seq(min(time), max(time), by="days")))
>
> Do you notice how one can copy and paste that example out of the
> mail and into R to see how it works? It would help if your questions
> had that same property - show how the example data could be created.
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Jul 21, 2016 at 3:34 PM, lily li <chocold12 at gmail.com> wrote:
>
>> I have a question about interpolating missing values in a dataframe. The
>> dataframe is in the following, Column C has no data before 2009-01-05 and
>> after 2009-12-31, how to interpolate data for the blanks? That is to say,
>> interpolate linearly between these two gaps using 5.4 and 6.1? Thanks.
>>
>>
>> df
>> time        A    B    C
>> 2009-01-01  3    4.5
>> 2009-01-02  4    5
>> 2009-01-03  3.3  6
>> 2009-01-04  4.1  7
>> 2009-01-05  4.4  6.2  5.4
>> ...
>>
>> 2009-11-20  5.1  5.5  6.1
>> 2009-11-21  5.4  4
>> ...
>> 2009-12-31  4.5  6
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
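For reference, approx() returns a list with components x and y rather than a data frame. The following is only a sketch of how the call in Bill Dunlap's example could be turned into a daily table; the names days, res and daily are illustrative and do not come from the thread, and the dates are passed through as.numeric() so that the example does not rely on how approx() handles Date input.

```r
# Bill Dunlap's example data (complete, no NAs)
df <- data.frame(A = c(10, 11, 12), B = c(5, 5, 4), C = c(3.3, 4, 3),
                 time = as.Date(c("1990-01-01", "1990-02-07", "1990-02-14")))

# One output point per day between the first and last observation
days <- seq(min(df$time), max(df$time), by = "day")

# Linear interpolation of C on the daily grid
res <- approx(x = as.numeric(df$time), y = df$C, xout = as.numeric(days))

# Collect the result as a data frame with proper Date values
daily <- data.frame(time = days, C = res$y)
head(daily)
```

The observed values are reproduced exactly on their own dates, and every day in between lies on the straight line joining its two neighbouring observations.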
> On 22 Jul 2016, at 01:54, lily li <chocold12 at gmail.com> wrote:
>
> Thanks, I meant if there are missing data at the beginning and end of a
> dataframe, how to interpolate according to available data?
>
> For example, the A column has missing values at the beginning and end, how
> to interpolate linearly between 10 and 12 for the missing values?
>
> df <- data.frame(A=c(NA, NA,10,11,12, NA),B=c(5,5,4,3,4,5),C=c(3.3,4,3,1.5,2.2,4),time=as.Date(c("1990-01-01","1990-02-07","1990-02-14","1990-02-28","1990-03-01","1990-03-20")))
>

As William answered,

with(df, approx(x=time, y=A, xout=seq(min(time, na.rm=TRUE), max(time, na.rm=TRUE), by="days")))

will help you interpolate linearly between known values even if the column has NAs.
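A small check of what this does at the two ends may be useful. The sketch below uses the data frame from the question, drops the NA rows explicitly (the helper names ok, x, inside and held are illustrative, not from the thread), and evaluates approx() at the original dates. With the default rule = 1, approx() returns NA for points outside the range of the known values, so the leading and trailing gaps in A stay empty; rule = 2 instead repeats the nearest known value. Neither option continues the line beyond the data.

```r
df <- data.frame(A = c(NA, NA, 10, 11, 12, NA),
                 B = c(5, 5, 4, 3, 4, 5),
                 C = c(3.3, 4, 3, 1.5, 2.2, 4),
                 time = as.Date(c("1990-01-01", "1990-02-07", "1990-02-14",
                                  "1990-02-28", "1990-03-01", "1990-03-20")))

ok <- !is.na(df$A)          # rows where A is known
x  <- as.numeric(df$time)   # dates as day counts

# Default rule = 1: NA outside the range of the known points
inside <- approx(x[ok], df$A[ok], xout = x)$y
inside   # NA NA 10 11 12 NA

# rule = 2: the nearest known value is carried outwards (constant, not linear)
held <- approx(x[ok], df$A[ok], xout = x, rule = 2)$y
held     # 10 10 10 11 12 12
```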
Hi lili,
The problem may lie in the fact that I think you are using "interpolate"
when you mean "extrapolate". In that case, the best you can do is spread
values beyond the points that you have. Find the slope of the line, put a
point at each end of your time data (2009-01-01 and 2009-12-31) and use
"approx" on all three gaps. Note that this slope is a slippery one indeed
and few will accept that the values so generated mean anything.

Jim

On Fri, Jul 22, 2016 at 9:38 AM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>
> As William answered,
>
> with(df, approx(x=time, y=A, xout=seq(min(time, na.rm=TRUE), max(time, na.rm=TRUE), by="days")))
>
> will help you interpolate linearly between known values even if the column has NAs.
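The steps Jim describes could be sketched as follows, using the small data frame from the follow-up question rather than the 2009 data. This is one possible reading only: "the line" is assumed here to be the straight line through the first and last known values of A (a regression line would be another choice), and all helper names (ok, xk, yk, slope, x_ends, y_ends, filled) are illustrative. The sketch also assumes there really is a gap at both ends; otherwise the added end points would duplicate existing x values.

```r
df <- data.frame(A = c(NA, NA, 10, 11, 12, NA),
                 B = c(5, 5, 4, 3, 4, 5),
                 C = c(3.3, 4, 3, 1.5, 2.2, 4),
                 time = as.Date(c("1990-01-01", "1990-02-07", "1990-02-14",
                                  "1990-02-28", "1990-03-01", "1990-03-20")))

ok <- !is.na(df$A)
x  <- as.numeric(df$time)   # all dates as day counts
xk <- x[ok]                 # dates with a known A
yk <- df$A[ok]              # known A values
n  <- length(xk)

# Assumed slope: line through the first and last known points (per day)
slope <- (yk[n] - yk[1]) / (xk[n] - xk[1])

# Put one extrapolated point at each end of the time axis
x_ends <- range(x)
y_ends <- c(yk[1] + slope * (x_ends[1] - xk[1]),
            yk[n] + slope * (x_ends[2] - xk[n]))

# Interpolate over the leading gap, the known stretch and the trailing gap
filled <- approx(c(x_ends[1], xk, x_ends[2]),
                 c(y_ends[1], yk, y_ends[2]),
                 xout = x)$y
round(filled, 2)   # 4.13 9.07 10.00 11.00 12.00 14.53
```

The known values 10, 11 and 12 are left untouched; only the first two rows and the last row are filled, and those filled values carry all the caveats Jim mentions about extrapolation.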