Cesar Terrer
2013-Mar-04 23:13 UTC
[R] Automatically fix big jumps in one variable due to anomalies
Hi,

I am attaching a plot where you can see a few "jumps" (plots 1, 4, 5 and 6) caused by incidents with the measuring sensors (basically someone touching the sensor). I need to revert those changes to get a plot without spurious measurements, i.e. make those fragments go back to their original pattern before the jump.

I have used the function cpt.mean {changepoint} to identify the jumps and the mean of each segment. Now I don't know how to revert the jumps automatically, probably by subtracting from the higher fragment the difference between its mean and the mean of the previous one. Does that make sense?

Example of data set:

              TIMESTAMP variable diameter
38  2012-06-21 13:45:00     r4_3       NA
86  2012-06-21 14:00:00     r4_3       NA
134 2012-06-21 14:15:00     r4_3      246
182 2012-06-21 14:30:00     r4_3      251
230 2012-06-21 14:45:00     r4_3      250
278 2012-06-21 15:00:00     r4_3      255
326 2012-06-21 15:15:00     r4_3     5987
374 2012-06-21 15:30:00     r4_3     5991
422 2012-06-21 15:45:00     r4_3     5994
470 2012-06-21 16:00:00     r4_3     5999

As an example, this is the current diameter data:
NA-NA-246-251-250-255-5987-5991-5994-5999

I would need this series without the big jump, i.e. skipping the jump but following the same increase/decrease pattern, for example:
NA-NA-246-251-250-255-255-259-262-267

Any other idea is welcome.
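For illustration, the mean-difference idea in the question can be sketched in base R as below. This is only a sketch under stated assumptions: the index `cp` (last observation before the jump) is typed in by hand as a hypothetical stand-in for whatever a changepoint method reports, and only a single jump is handled.

```r
# Example series from the post
x <- c(NA, NA, 246, 251, 250, 255, 5987, 5991, 5994, 5999)

# Hypothetical: index of the last observation before the jump,
# supplied by hand here instead of taken from a changepoint fit
cp <- 6

before <- x[1:cp]
after  <- x[(cp + 1):length(x)]

# Difference between the two segment means
shift <- mean(after, na.rm = TRUE) - mean(before, na.rm = TRUE)

# Move the later segment down by that difference
x_adj <- x
x_adj[(cp + 1):length(x)] <- after - shift

shift   # 5742.25
x_adj   # NA NA 246 251 250 255 244.75 248.75 251.75 256.75
```

Aligning the segment means gives both segments the same average, so in a trending series the corrected segment restarts slightly below 255 rather than continuing from it. That differs from the target series in the post, which continues from the last good value.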
Duncan Mackay
2013-Mar-05 03:18 UTC
[R] Automatically fix big jumps in one variable due to anomalies
Hi Cesar,

Not sure what you actually want to accomplish. ?rle may give you some ideas, e.g. (I have added some values to return to the good section):

    x = c(246,251,250,255,5987,5991,5994,599,255,259,262,267)
    xdiff = diff(x)
    xdiff
     [1]     5    -1     5  5732     4     3 -5395  -344     4     3     5
    rle(xdiff)
    Run Length Encoding
      lengths: int [1:11] 1 1 1 1 1 1 1 1 1 1 ...
      values : num [1:11] 5 -1 5 5732 4 3 -5395 -344 4 3 ...
    which(abs(rle(xdiff)[[2]]) > 50)
    [1] 4 7 8
    rle(xdiff)[[2]][abs(rle(xdiff)[[2]]) > 50]

It is then a matter of removing the required sequences, applying a function to them, or substituting values; see ?zoo::na.approx (from memory).

HTH
Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mackay at northnet.com.au
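The "remove the sequence and substitute values" step suggested in the reply above could look roughly like the following. It is a minimal sketch, not a definitive implementation: it assumes a single anomalous excursion bounded by the first and last large difference, uses the same threshold of 50, and swaps in base R `approx()` for the `zoo::na.approx` the reply mentions from memory.

```r
# Series from the reply (anomalous run followed by a return to good values)
x <- c(246, 251, 250, 255, 5987, 5991, 5994, 599, 255, 259, 262, 267)

threshold <- 50                           # same cut-off as in the reply
jumps <- which(abs(diff(x)) > threshold)  # positions of large differences

# Assumption: everything between the first and last large difference is bad
bad <- (min(jumps) + 1):max(jumps)

x_clean <- x
x_clean[bad] <- NA

# Linear interpolation across the gap using the surrounding good values
idx <- seq_along(x_clean)
ok  <- !is.na(x_clean)
x_filled <- approx(idx[ok], x_clean[ok], xout = idx)$y

x_filled
# 246 251 250 255 255 255 255 255 255 259 262 267
```

Interpolation discards whatever pattern the sensor recorded during the excursion, so it suits cases where the shifted readings themselves are not trusted.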
PIKAL Petr
2013-Mar-07 08:00 UTC
[R] Automatically fix big jumps in one variable due to anomalies
Hi,

Not sure if it solves every possible sensor misbehaviour, but setting the differences at the jump starts to NA or 0, summing the differences, and adding the cumulative sum to the starting value can help you polish your data:

    > x
     [1]   NA   NA  246  251  250  255 5987 5991 5994 5999
    > xd <- diff(x)
    > xd[xd > 10] <- NA     # flag upward jumps larger than 10
    > xd[is.na(xd)] <- 0    # treat flagged jumps and missing differences as no change
    > cumsum(xd)
    [1]  0  0  5  4  9  9 13 16 21
    > 246 + cumsum(xd)      # 246 is the first non-missing value
    [1] 246 246 251 250 255 255 259 262 267

Regards
Petr
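The same differencing idea can be wrapped in a small function that keeps the original length and leading NAs and treats downward jumps too. This is a sketch under assumptions: the function name `remove_jumps` and the default threshold of 10 are illustrative choices, and a jump that coincides with a missing value would go undetected.

```r
# Hypothetical helper: subtract the accumulated size of all large jumps
remove_jumps <- function(x, threshold = 10) {
  d <- diff(x)
  # A jump is a non-missing difference larger than the threshold in absolute value
  jump <- !is.na(d) & abs(d) > threshold
  step <- d
  step[!jump] <- 0             # keep only the jump sizes (NA differences become 0)
  offset <- c(0, cumsum(step)) # total shift accumulated up to each observation
  x - offset
}

x <- c(NA, NA, 246, 251, 250, 255, 5987, 5991, 5994, 5999)
remove_jumps(x)
# NA NA 246 251 250 255 255 259 262 267
```

On the example data this reproduces the series requested in the original post. The observation at the jump is given the last good value, so any real change during that one interval is lost.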