Mike Lawrence
2007-Jul-19 18:14 UTC
[R] linear interpolation of multiple random time series
Hi all, Looking for tips on how I might more optimally solve this. I have time series data (samples from a force sensor) that are not guaranteed to be sampled at the same time values across trials. ex. trial time x 1 1 1 1 5 4 1 7 9 1 12 20 2 1 0 2 3 5 2 9 10 2 13 14 2 19 22 2 24 32 Within each trial I'd like to use linear interpolation between each successive time sample to fill in intermediary timepoints and x- values, ex. trial time x 1 1 1 1 2 1.75 1 3 2.5 1 4 3.25 1 5 4 1 6 6.5 1 7 9 1 8 11.2 1 9 13.4 1 10 15.6 1 11 17.8 1 12 20 2 1 0 2 2 2.5 2 3 5 2 4 5.83333333333333 2 5 6.66666666666667 2 6 7.5 2 7 8.33333333333333 2 8 9.16666666666667 2 9 10 2 10 11 2 11 12 2 12 13 2 13 14 2 14 15.3333333333333 2 15 16.6666666666667 2 16 18 2 17 19.3333333333333 2 18 20.6666666666667 2 19 22 2 20 24 2 21 26 2 22 28 2 23 30 2 24 32 The solution I've coded (below) involves going through the original data frame line by line and is thus very slow (indeed, I had to resort to writing to file as with a large data set I started running into memory issues if I tried to create the new data frame in memory). Any suggestions on a faster way to achieve what I'm trying to do? #assumes the first data frame above is stored as 'a' arows = (length(a$x)-1) write('', 'temp.txt') for(i in 1:arows){ if(a$time[i+1] > a$time[i]){ write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) x1 = a$time[i] x2 = a$time[i+1] dx = x2-x1 if(dx != 1){ y1 = a$x[i] y2 = a$x[i+1] dy = y2-y1 slope = dy/dx int = -slope*x1+y1 temp=a[i,] for(j in (x1+1):(x2-1)){ temp$time = j temp$x = slope*j+int write.table(temp, 'temp.txt', row.names = F, col.names = F, append = T) } } }else{ write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) } } i=i+1 write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) b=read.table('temp.txt',skip=1) names(b)=names(a)
This should do it for you:> x <- read.table(textConnection("trial time x+ 1 1 1 + 1 5 4 + 1 7 9 + 1 12 20 + 2 1 0 + 2 3 5 + 2 9 10 + 2 13 14 + 2 19 22 + 2 24 32"), header=TRUE)> # compute for each trial > trial.list <- lapply(split(x, x$trial), function(set){+ .xval <- seq(min(set$time), max(set$time)) + .yval <- approx(set$time, set$x, xout=.xval)$y + cbind(trial=set$trial[1], time=.xval, x=.yval) + })> do.call('rbind', trial.list)trial time x [1,] 1 1 1.000000 [2,] 1 2 1.750000 [3,] 1 3 2.500000 [4,] 1 4 3.250000 [5,] 1 5 4.000000 [6,] 1 6 6.500000 [7,] 1 7 9.000000 [8,] 1 8 11.200000 [9,] 1 9 13.400000 [10,] 1 10 15.600000 [11,] 1 11 17.800000 [12,] 1 12 20.000000 [13,] 2 1 0.000000 [14,] 2 2 2.500000 [15,] 2 3 5.000000 [16,] 2 4 5.833333 [17,] 2 5 6.666667 [18,] 2 6 7.500000 [19,] 2 7 8.333333 [20,] 2 8 9.166667 [21,] 2 9 10.000000 [22,] 2 10 11.000000 [23,] 2 11 12.000000 [24,] 2 12 13.000000 [25,] 2 13 14.000000 [26,] 2 14 15.333333 [27,] 2 15 16.666667 [28,] 2 16 18.000000 [29,] 2 17 19.333333 [30,] 2 18 20.666667 [31,] 2 19 22.000000 [32,] 2 20 24.000000 [33,] 2 21 26.000000 [34,] 2 22 28.000000 [35,] 2 23 30.000000 [36,] 2 24 32.000000>On 7/19/07, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:> Hi all, > > Looking for tips on how I might more optimally solve this. I have > time series data (samples from a force sensor) that are not > guaranteed to be sampled at the same time values across trials. ex. > > trial time x > 1 1 1 > 1 5 4 > 1 7 9 > 1 12 20 > 2 1 0 > 2 3 5 > 2 9 10 > 2 13 14 > 2 19 22 > 2 24 32 > > Within each trial I'd like to use linear interpolation between each > successive time sample to fill in intermediary timepoints and x- > values, ex. > > trial time x > 1 1 1 > 1 2 1.75 > 1 3 2.5 > 1 4 3.25 > 1 5 4 > 1 6 6.5 > 1 7 9 > 1 8 11.2 > 1 9 13.4 > 1 10 15.6 > 1 11 17.8 > 1 12 20 > 2 1 0 > 2 2 2.5 > 2 3 5 > 2 4 5.83333333333333 > 2 5 6.66666666666667 > 2 6 7.5 > 2 7 8.33333333333333 > 2 8 9.16666666666667 > 2 9 10 > 2 10 11 > 2 11 12 > 2 12 13 > 2 13 14 > 2 14 15.3333333333333 > 2 15 16.6666666666667 > 2 16 18 > 2 17 19.3333333333333 > 2 18 20.6666666666667 > 2 19 22 > 2 20 24 > 2 21 26 > 2 22 28 > 2 23 30 > 2 24 32 > > > The solution I've coded (below) involves going through the original > data frame line by line and is thus very slow (indeed, I had to > resort to writing to file as with a large data set I started running > into memory issues if I tried to create the new data frame in > memory). Any suggestions on a faster way to achieve what I'm trying > to do? > > #assumes the first data frame above is stored as 'a' > arows = (length(a$x)-1) > write('', 'temp.txt') > for(i in 1:arows){ > if(a$time[i+1] > a$time[i]){ > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append > = T) > x1 = a$time[i] > x2 = a$time[i+1] > dx = x2-x1 > if(dx != 1){ > y1 = a$x[i] > y2 = a$x[i+1] > dy = y2-y1 > slope = dy/dx > int = -slope*x1+y1 > temp=a[i,] > for(j in (x1+1):(x2-1)){ > temp$time = j > temp$x = slope*j+int > write.table(temp, 'temp.txt', row.names = F, col.names = F, > append = T) > } > } > }else{ > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append > = T) > } > } > i=i+1 > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) > > b=read.table('temp.txt',skip=1) > names(b)=names(a) > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Gabor Grothendieck
2007-Jul-19 19:01 UTC
[R] linear interpolation of multiple random time series
Thsi can be done compactly using the zoo package. The first statement after library converts the rows for each trial into a separate zoo object and then uses by to merge these into a single zoo object with one column per trial. The second statement converts it from zoo to ts which has the effect of filling in all missing times. na.approx does the actual linear interpolation. The result is an mts object (or you could use as.zoo to convert that to a zoo object if you prefer). library(zoo) z <- do.call("merge", by(a, a$trial, function(DF) zoo(DF$x, DF$time))) na.approx(as.ts(z), na.rm = FALSE) On 7/19/07, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:> Hi all, > > Looking for tips on how I might more optimally solve this. I have > time series data (samples from a force sensor) that are not > guaranteed to be sampled at the same time values across trials. ex. > > trial time x > 1 1 1 > 1 5 4 > 1 7 9 > 1 12 20 > 2 1 0 > 2 3 5 > 2 9 10 > 2 13 14 > 2 19 22 > 2 24 32 > > Within each trial I'd like to use linear interpolation between each > successive time sample to fill in intermediary timepoints and x- > values, ex. > > trial time x > 1 1 1 > 1 2 1.75 > 1 3 2.5 > 1 4 3.25 > 1 5 4 > 1 6 6.5 > 1 7 9 > 1 8 11.2 > 1 9 13.4 > 1 10 15.6 > 1 11 17.8 > 1 12 20 > 2 1 0 > 2 2 2.5 > 2 3 5 > 2 4 5.83333333333333 > 2 5 6.66666666666667 > 2 6 7.5 > 2 7 8.33333333333333 > 2 8 9.16666666666667 > 2 9 10 > 2 10 11 > 2 11 12 > 2 12 13 > 2 13 14 > 2 14 15.3333333333333 > 2 15 16.6666666666667 > 2 16 18 > 2 17 19.3333333333333 > 2 18 20.6666666666667 > 2 19 22 > 2 20 24 > 2 21 26 > 2 22 28 > 2 23 30 > 2 24 32 > > > The solution I've coded (below) involves going through the original > data frame line by line and is thus very slow (indeed, I had to > resort to writing to file as with a large data set I started running > into memory issues if I tried to create the new data frame in > memory). Any suggestions on a faster way to achieve what I'm trying > to do? > > #assumes the first data frame above is stored as 'a' > arows = (length(a$x)-1) > write('', 'temp.txt') > for(i in 1:arows){ > if(a$time[i+1] > a$time[i]){ > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append > = T) > x1 = a$time[i] > x2 = a$time[i+1] > dx = x2-x1 > if(dx != 1){ > y1 = a$x[i] > y2 = a$x[i+1] > dy = y2-y1 > slope = dy/dx > int = -slope*x1+y1 > temp=a[i,] > for(j in (x1+1):(x2-1)){ > temp$time = j > temp$x = slope*j+int > write.table(temp, 'temp.txt', row.names = F, col.names = F, > append = T) > } > } > }else{ > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append > = T) > } > } > i=i+1 > write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) > > b=read.table('temp.txt',skip=1) > names(b)=names(a) > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >