Dear help list -

I have light data with 5-min time-stamps. I would like to insert four 1-min time-stamps between each row and interpolate the light data on each new row. To do this I have come up with the following code:

lightdata <- read.table("Test_light_data.csv", header = TRUE, sep = ",") # read data file into object "lightdata"
library(chron)
mins <- data.frame(times(1:1439/1440)) # generate a dataframe of 24 hours of 1-min timestamps
Nth.delete <- function(dataframe, n) dataframe[-(seq(n, to=nrow(dataframe), by=n)),] # function for deleting nth row
empty <- data.frame("1/9/13", Nth.delete(mins, 5), "NA") # delete all 5-min timestamps in a new dataframe
colnames(empty) <- c("date", "time", "light") # add correct column names to empty timestamp dataframe
newdata <- rbind(lightdata, empty)

I get the following warning message:

Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(0.000694444444444444, 0.00138888888888889, :
  invalid factor level, NAs generated

Digging into this a little, I can see that the two time columns are doing what I need and APPEAR to be similar in format:

> head(lightdata)
    date    time       light
1 1/9/13 0:00:00 -0.00040925
2 1/9/13 0:05:00 -0.00023386
3 1/9/13 0:10:00 -0.00032155
4 1/9/13 0:15:00 -0.00017539
5 1/9/13 0:20:00 -0.00029232
6 1/9/13 0:25:00 -0.00038002

> head(empty)
    date     time light
1 1/9/13 00:01:00    NA
2 1/9/13 00:02:00    NA
3 1/9/13 00:03:00    NA
4 1/9/13 00:04:00    NA
5 1/9/13 00:06:00    NA
6 1/9/13 00:07:00    NA

but they clearly are not as far as R is concerned, as shown by str:

> str(lightdata)
'data.frame': 288 obs. of 3 variables:
 $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ...
 $ time : Factor w/ 288 levels "0:00:00","0:05:00",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ light: num -0.000409 -0.000234 -0.000322 -0.000175 -0.000292 ...

> str(empty)
'data.frame': 1152 obs. of 3 variables:
 $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ...
 $ time :Class 'times' atomic [1:1152] 0.000694 0.001389 0.002083 0.002778 0.004167 ...
 .. ..- attr(*, "format")= chr "h:m:s"
 $ light: Factor w/ 1 level "NA": 1 1 1 1 1 1 1 1 1 1 ...

In the first (original) dataframe, time is a factor, while in the dataframe of generated timestamps, the timestamps are actually still in fractions of a day.

Presumably this is why rbind is not working? Can anyone help? By the way, I know I can use na.approx in zoo to do the eventual interpolation of the light data. It's getting there that has me stumped for now.

Many thanks,
Jon (new R user).
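The type mismatch visible in the str() output can be reproduced in isolation. The following is only a small sketch with made-up values (time_factor and one_minute are illustrative names, not objects from the post): it shows what happens when a fraction-of-a-day number is written into a factor, which is roughly what rbind() attempts here, and why the quoted string "NA" is not a missing value.

```r
# Toy time column stored as a factor, like lightdata$time in the str() output
time_factor <- factor(c("0:00:00", "0:05:00", "0:10:00"))

# A fraction-of-a-day value, like those chron::times() stores internally
one_minute <- 1 / 1440

# Writing a number into a factor: it is not one of the existing levels,
# so R would store NA and raise a warning. Capture the warning text.
result <- tryCatch(
  {
    time_factor[2] <- one_minute
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(result)

# The string "NA" is ordinary text; only the constant NA is a missing value
is.na("NA")  # FALSE
is.na(NA)    # TRUE
```

Under these assumptions, both data frames need a shared, genuine date-time type for the time column (and a numeric light column with real NA values) before they can be combined.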
Hi

Why not convert date and time to a POSIX object? It is simple and saves you a lot of frustration when merging two data frames.

First combine the date and time of lightdata into a new column:

# the format below reads 1/9/13 as day/month/year; use "%m/%d/%y" if it is month/day/year
lightdata$newdate <- as.POSIXct(strptime(paste(lightdata$date, lightdata$time, sep=" "), format = "%d/%m/%y %H:%M:%S"))

Then generate the empty data frame, where firstdate and lastdate are the first and last date-times you need:

empty <- data.frame(newdate = seq(firstdate, lastdate, by="min"), light=NA)

See ?seq.POSIXt for details.

new <- merge(lightdata, empty, by="newdate", all=TRUE)

should then give you the merged data frame.

Regards
Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Benstead, Jonathan
> Sent: Tuesday, February 12, 2013 12:19 AM
> To: r-help at r-project.org
> Subject: [R] Inserting rows of interpolated data
>
> Dear help list - I have light data with 5-min time-stamps. I would like
> to insert four 1-min time-stamps between each row and interpolate the
> light data on each new row.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
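The POSIX-and-merge suggestion can be tried end to end on a small example. This is a sketch under stated assumptions, not the poster's exact method: the lightdata below is synthetic (one hour of made-up readings), the month/day/year format and the UTC time zone are guesses about the real file, the name grid is illustrative, and base R's approx() is swapped in for the interpolation step so that nothing beyond base R is needed.

```r
# Synthetic stand-in for lightdata: one hour of 5-min readings (made-up values)
lightdata <- data.frame(
  date  = "1/9/13",
  time  = sprintf("0:%02d:00", 5L * 0:11),
  light = (0:11) / 10,
  stringsAsFactors = FALSE
)

# Combine date and time into a single POSIXct column.
# Assumptions: dates are month/day/year and times are in UTC.
lightdata$newdate <- as.POSIXct(paste(lightdata$date, lightdata$time),
                                format = "%m/%d/%y %H:%M:%S", tz = "UTC")

# Full 1-min grid from the first to the last observation
grid <- data.frame(newdate = seq(min(lightdata$newdate),
                                 max(lightdata$newdate), by = "min"))

# Join the readings onto the grid; minutes without a reading get NA.
# The grid has no light column, so the result has one light column only.
merged <- merge(grid, lightdata[, c("newdate", "light")],
                by = "newdate", all.x = TRUE)

# Linear interpolation between the 5-min observations
merged$light <- approx(x = as.numeric(lightdata$newdate),
                       y = lightdata$light,
                       xout = as.numeric(merged$newdate))$y

head(merged)
```

Leaving the light column out of the grid avoids ending up with two suffixed light columns after merge(), which is what happens when both data frames carry a column of that name.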
Hi Jon,

zoo is great for tasks like this, not just for na.approx. :)

I would approach the problem like this:

library(zoo)
# put lightdata into a zoo object
z <- with(lightdata, zoo(light,
  as.POSIXct(paste(date, time), format="%m/%d/%y %H:%M:%S")))
# merge the above zoo object with an "empty" zoo
# object that has all the index values you want
Z <- merge(z, zoo(,seq(start(z),end(z),by="1 min")))
# interpolate between the 5-min observations
Z <- na.approx(Z)

HTH,
--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2013: Applied Finance with R | www.RinFinance.com

On Mon, Feb 11, 2013 at 5:19 PM, Benstead, Jonathan <jbenstead at as.ua.edu> wrote:
> Dear help list - I have light data with 5-min time-stamps. I would like to insert four 1-min time-stamps between each row and interpolate the light data on each new row.
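For anyone who needs a plain data frame again after the zoo steps, the following sketch runs the same zoo calls on synthetic data and then rebuilds date, time and light columns. The input values, the UTC time zone and the name newdata are assumptions made for illustration only.

```r
library(zoo)

# Synthetic stand-in for lightdata: one hour of 5-min readings (made-up values)
lightdata <- data.frame(
  date  = "1/9/13",
  time  = sprintf("0:%02d:00", 5L * 0:11),
  light = (0:11) / 10,
  stringsAsFactors = FALSE
)

# Index the light values by date-time (assumes month/day/year, UTC)
idx <- as.POSIXct(paste(lightdata$date, lightdata$time),
                  format = "%m/%d/%y %H:%M:%S", tz = "UTC")
z <- zoo(lightdata$light, idx)

# Merge with an empty zoo object carrying the full 1-min index,
# then interpolate linearly across the NA gaps
Z <- na.approx(merge(z, zoo(, seq(start(z), end(z), by = "1 min"))))

# Rebuild a data frame; tz is passed explicitly so the formatted
# times do not depend on the local time zone of the session
newdata <- data.frame(
  date  = format(index(Z), "%m/%d/%y", tz = "UTC"),
  time  = format(index(Z), "%H:%M:%S", tz = "UTC"),
  light = as.numeric(coredata(Z)),
  stringsAsFactors = FALSE
)

head(newdata)
```

Note that format() zero-pads the fields, so the rebuilt date and time strings will not match the unpadded style of the input file exactly.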