Hi,

I have a data frame with two columns of data, one an indexing column and the other a data column. My issue is, this data frame is incomplete and there are missing lines. I want to know how I can find and add data into these missing lines. See example below.

## Example data

data <- data.frame(index = c(1:4, 6:10),
                   data = c(1.5, 4.3, 5.6, 6.7, 7.1, 12.5, 14.5, 16.8, 3.4))

  index data
1     1  1.5
2     2  4.3
3     3  5.6
4     4  6.7
5     6  7.1
6     7 12.5
7     8 14.5
8     9 16.8
9    10  3.4

## note: index number 5 is missing

## What I want

   index data
1      1  1.5
2      2  4.3
3      3  5.6
4      4  6.7
5      5   NA
6      6  7.1
7      7 12.5
8      8 14.5
9      9 16.8
10    10  3.4

I'm running R 2.6.0 on Mac OS X.

Thanks in advance,

Andrew Hoskins
PhD Candidate
Deakin University, Australia
Email: ajhos@deakin.edu.au

On Thu, 2007-11-15 at 09:41 +1100, Andrew Hoskins wrote:

> Hi,
>
> I have a data frame with two columns of data, one an indexing column
> and the other a data column. My issue is, this data frame is
> incomplete and there are missing lines. I want to know how I can
> find and add data into these missing lines. See example below
>
> ## Example data
>
> data <- data.frame(index=c(1:4,6:10), data=c
> (1.5,4.3,5.6,6.7,7.1,12.5,14.5,16.8,3.4))
>
>   index data
> 1     1  1.5
> 2     2  4.3
> 3     3  5.6
> 4     4  6.7
> 5     6  7.1
> 6     7 12.5
> 7     8 14.5
> 8     9 16.8
> 9    10  3.4
>
> ## note: index number 5 is missing
>
> ## What I want
>
>    index data
> 1      1  1.5
> 2      2  4.3
> 3      3  5.6
> 4      4  6.7
> 5      5   NA
> 6      6  7.1
> 7      7 12.5
> 8      8 14.5
> 9      9 16.8
> 10    10  3.4
>
> I'm running R2.6.0 on Mac OSX.

How about this:

> DF
  index data
1     1  1.5
2     2  4.3
3     3  5.6
4     4  6.7
5     6  7.1
6     7 12.5
7     8 14.5
8     9 16.8
9    10  3.4

DF.NEW <- data.frame(index = seq(max(DF$index)))

> DF.NEW
   index
1      1
2      2
3      3
4      4
5      5
6      6
7      7
8      8
9      9
10    10

DF.NEW <- merge(DF.NEW, DF, all.x = TRUE)

> DF.NEW
   index data
1      1  1.5
2      2  4.3
3      3  5.6
4      4  6.7
5      5   NA
6      6  7.1
7      7 12.5
8      8 14.5
9      9 16.8
10    10  3.4

See ?merge for more information.

HTH,

Marc Schwartz
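
The merge approach above can be wrapped up so that it also reports which index values were absent, which covers the "find" half of the question. What follows is only a minimal sketch in base R. The helper name fill_missing_index is hypothetical and does not appear in the thread, and it assumes the index column holds integers whose full range runs from the smallest to the largest observed value (the reply above starts the grid at 1 instead).

```r
# Hypothetical helper (not from the original thread): pad a data frame so that
# every integer index between the observed min and max is present, leaving NA
# in the other columns for rows that had to be added.
fill_missing_index <- function(df, index_col = "index") {
  full <- seq(min(df[[index_col]]), max(df[[index_col]]))
  missing <- setdiff(full, df[[index_col]])   # index values absent from df
  grid <- data.frame(full)
  names(grid) <- index_col
  # all.x = TRUE keeps every grid row, even those with no match in df
  filled <- merge(grid, df, by = index_col, all.x = TRUE)
  list(missing = missing, filled = filled)
}

DF <- data.frame(index = c(1:4, 6:10),
                 data = c(1.5, 4.3, 5.6, 6.7, 7.1, 12.5, 14.5, 16.8, 3.4))
res <- fill_missing_index(DF)
res$missing   # the index values that were not in DF
res$filled    # 10 rows, with NA in the data column at index 5
```

If the index is expected to start at 1 regardless of the data, replacing min(df[[index_col]]) with 1 reproduces the grid used in the reply.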

On Thu, 15 Nov 2007, Andrew Hoskins wrote:

> Hi,
>
> I have a data frame with two columns of data, one an indexing column
> and the other a data column. My issue is, this data frame is
> incomplete and there are missing lines. I want to know how I can
> find and add data into these missing lines. See example below

You could use a "zoo" series (from the "zoo" package), which provides infrastructure for indexed observations. With your example data:

  data <- data.frame(index = c(1:4, 6:10),
                     data = c(1.5, 4.3, 5.6, 6.7, 7.1, 12.5, 14.5, 16.8, 3.4))

you can create a series

  library("zoo")
  z <- zoo(data$data, data$index)

and extend it to the grid 1:10

  z <- merge(zoo(, 1:10), z)

which then has an NA at index 5. Then you could use linear interpolation to replace that NA

  na.approx(z)

or replace it with some other number

  z[is.na(z)] <- 42

See vignette("zoo", package = "zoo") for more details.
Z
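
If installing zoo is not an option, the linear interpolation step can be approximated in base R with approx(). This is a sketch of an alternative technique, not the zoo method from the reply above, and the names grid, filled and result are chosen here for illustration only.

```r
# Base R alternative (an assumption, not the zoo approach from the reply):
# interpolate the data column linearly onto the complete grid 1:10.
data <- data.frame(index = c(1:4, 6:10),
                   data = c(1.5, 4.3, 5.6, 6.7, 7.1, 12.5, 14.5, 16.8, 3.4))
grid <- 1:10

# approx() evaluates the piecewise-linear curve through the observed points
# at every position in xout; observed positions keep their original values.
filled <- approx(x = data$index, y = data$data, xout = grid)
result <- data.frame(index = filled$x, data = filled$y)

result$data[5]   # midpoint of 6.7 and 7.1, i.e. 6.9
```

Unlike the merge-based answers, this fills the gap directly and never exposes the NA, so it only fits cases where an interpolated value is actually wanted.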