Ramon Hofer
2013-Jun-06 14:12 UTC
[R] Autocorrelation and normal distribution of gaps for ping requests in an unstable network
Hi all,

I'm investigating a powerline network connection. The test network contains several nodes which I ping from one host. The source host is always the same, and I split the data to get one file per connection. A lot of ping requests get lost, and I'm trying to plot an autocorrelation of the data.

Here's an example log:
http://people.ee.ethz.ch/~hoferr/download/data-20130603-192.168.72.33.csv

I tried to plot the autocorrelation graph:

  acf(A$pingRTT.ms.)

which didn't work because of the missing ping values. I found a post at stackoverflow [1] where they suggest using coredata, which didn't work for me. They also suggest using "na.action = na.omit" or "na.action = na.pass". The second option works for me. With these two commands I can draw an autocorrelation graph:

  A <- read.csv('data-20130603-192.168.72.33.csv')
  acf(A$pingRTT.ms., na.action = na.pass)

But they also warn that:

"acf works on regularly spaced data so acf first expands the time series to a regularly spaced one inserting NAs as needed to make it regularly spaced."

This seems to me as if it introduces new periods of time where there's no ping value and thus no connection, which would mean the autocorrelation graph I get is nonsense. Is my fear unfounded, or is there a way to get a meaningful plot?

I'd also like to plot a histogram with a normal curve like the example from statmethods [2]. In their example the data is directly available. In my case I need to prepare my data to get a list of gaps, e.g.

  TimestampStart,GapLength
  2013-06-03_15:20:25.374096766,16.2s
  2013-06-03_15:22:13.944293504,37.5s
  ...

My plan is to program a loop like

  A$Timestamp <- strptime(as.character(A$Timestamp), "%Y-%m-%d_%H:%M:%S")
  B <- matrix(nrow = 0, ncol = 2)
  colnames(B) <- c("TimestampStart","GapLength[s]")
  j <- 1
  gap.start <- A$Timestamp[0]
  for(i in 2:length(A$Timestamp)) { #For all rows
    if(is.na(A$pingRTT.ms.[i])) { #Currently no connection
      if(!is.na(A$pingRTT.ms.[i-1])) { #Connection lost now
        gap.start <- i
      } else if(!is.na(A$pingRTT.ms.[i+1])) { # Connection restores next time
        gap.end <- i+1
        B <- rbind( B, c( A$Timestamp[gap.start], A$Timestamp[gap.end]-A$Timestamp[gap.start] ) )
      }
    }
  }
  x <- B$GapLength
  h<-hist(x, xlab="Gap Length [s?]",

There's a problem with the rbind function which I'm using wrong. Is this the right approach and could you please give me a hint on how to add the line? Or is there a better way to achieve this?

Best
Ramon

[1] http://stackoverflow.com/questions/7309411/how-to-calculate-autocorrelation-in-r-zoo-object
[2] http://www.statmethods.net/graphs/images/histogram3.jpg
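
The gap table described above can probably be built without an explicit loop or rbind. The following is only a minimal sketch, not a tested solution for the linked log. It assumes the columns are named Timestamp and pingRTT.ms. as in the post, that rows are in time order, that a lost ping shows up as NA, and that a gap runs from the first lost ping to the next successful one. The helper name find_gaps and the sample data are made up for illustration. It uses rle() to find runs of consecutive NAs and "%OS" so that fractional seconds are kept when parsing.

```r
# Hypothetical helper: one row per run of consecutive lost pings.
# Assumes rows are in time order and a lost ping has NA in rtt.
find_gaps <- function(timestamp, rtt) {
  runs   <- rle(is.na(rtt))
  ends   <- cumsum(runs$lengths)      # last row index of each run
  starts <- ends - runs$lengths + 1   # first row index of each run
  # Keep only runs of lost pings that are followed by a successful ping;
  # a gap still open at the end of the log is dropped.
  lost <- runs$values & ends < length(rtt)
  data.frame(
    TimestampStart = timestamp[starts[lost]],
    GapLength = as.numeric(difftime(timestamp[ends[lost] + 1],
                                    timestamp[starts[lost]],
                                    units = "secs"))
  )
}

# Made-up sample data, not taken from the linked log.
A <- data.frame(
  Timestamp = c("2013-06-03_15:20:20.0", "2013-06-03_15:20:21.0",
                "2013-06-03_15:20:22.0", "2013-06-03_15:20:23.5",
                "2013-06-03_15:20:24.5", "2013-06-03_15:20:25.5",
                "2013-06-03_15:20:26.5"),
  pingRTT.ms. = c(3.1, NA, NA, 2.9, NA, 3.0, NA),
  stringsAsFactors = FALSE
)

# "%OS" reads seconds including the fractional part; POSIXct is easier
# to store in a data frame than the POSIXlt returned by strptime().
A$Timestamp <- as.POSIXct(strptime(A$Timestamp, "%Y-%m-%d_%H:%M:%OS",
                                   tz = "UTC"))

# Rows 2-3 and row 5 form the two gaps that end with a successful ping.
gaps <- find_gaps(A$Timestamp, A$pingRTT.ms.)
print(gaps)
```

Because the result is a data frame rather than a matrix, gaps$GapLength is a plain numeric vector in seconds and can be passed to hist() directly. Whether a normal curve is a sensible overlay for gap lengths is a separate question that the sketch does not address.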