Staples, Angela Dawn
2008-Jun-15 16:10 UTC
[R] Help finding the mode and maximum for a specified 'window' of time series data
I am relatively new to R, so I apologize up front for my long question, particularly if there is too much or too little information.

I have a large time series data set where each subject's behavior was originally coded on .25s intervals for a 3min task. I am trying to determine if the findings are different depending on the coding interval (i.e., compare .25s to 1s to 5s to 10s).

I also need to see if it makes a difference if I use the maximal value within the interval or the modal value within the interval. In other words, I do not need the average value.

Here is a sample of my data with time as the first column:

sample <- cbind((1:20)*.25, c(1,1,1,2,2,3,3,3,4,5,6,4,4,3,3,3,2,1,1,1))

      [,1] [,2]
 [1,] 0.25    1
 [2,] 0.50    1
 [3,] 0.75    1
 [4,] 1.00    2
 [5,] 1.25    2
 [6,] 1.50    3
 [7,] 1.75    3
 [8,] 2.00    3
 [9,] 2.25    4
[10,] 2.50    5
[11,] 2.75    6
[12,] 3.00    4
[13,] 3.25    4
[14,] 3.50    3
[15,] 3.75    3
[16,] 4.00    3
[17,] 4.25    2
[18,] 4.50    1
[19,] 4.75    1
[20,] 5.00    1

I need help returning the maximum and mode of a specified "window" such that the function/loop/etc would return:

maximum (window=4, or 1s) would be (2, 3, 6, 4, 2)
mode (window=4, or 1s) would be (1, 3, 4, 3, 2)

Given the coding conventions in this research area, I need the values to be from adjacent, as opposed to overlapping, windows. There are likely to be situations where there is no clear mode (1,1,2,2). In those cases it would be fine to have 1.5 or 2 returned, but not NA. The data file is in the long format with each subject having 720 rows of data. I've tried playing with the row indices, but I cannot figure out how to 'move' the window.

I would appreciate any help/suggestions. Since I'm new to this, the suggestion doesn't have to be pretty, I just need it to work.

Sincerely,

Angela

~~~~~~~~~~~~~~~~~~~~~~
Angela Staples
Doctoral Candidate
Psychological and Brain Sciences
Indiana University
1101 E. 10th St.
Bloomington, IN 47405
http://www.indiana.edu/~batessdl/

The plural of anecdote is not data.
~ Roger Brinner
jim holtman
2008-Jun-15 18:05 UTC
[R] Help finding the mode and maximum for a specified 'window' of time series data
This will give you an idea of how you might want to approach the problem:

> sample <- cbind((1:20)*.25, c(1,1,1,2,2,3,3,3,4,5,6,4,4,3,3,3,2,1,1,1))
> sample
      [,1] [,2]
 [1,] 0.25    1
 [2,] 0.50    1
 [3,] 0.75    1
 [4,] 1.00    2
 [5,] 1.25    2
 [6,] 1.50    3
 [7,] 1.75    3
 [8,] 2.00    3
 [9,] 2.25    4
[10,] 2.50    5
[11,] 2.75    6
[12,] 3.00    4
[13,] 3.25    4
[14,] 3.50    3
[15,] 3.75    3
[16,] 4.00    3
[17,] 4.25    2
[18,] 4.50    1
[19,] 4.75    1
[20,] 5.00    1
> # range of timing
> s.r <- range(sample[,1])
> # create groupings for 1 second intervals
> s.cut <- seq(from=floor(s.r[1]), to=ceiling(s.r[2]), by=1)
> s.cut
[1] 0 1 2 3 4 5
> # split the data
> s.split <- split(sample[,2], cut(sample[,1], s.cut))
> s.split
$`(0,1]`
[1] 1 1 1 2

$`(1,2]`
[1] 2 3 3 3

$`(2,3]`
[1] 4 5 6 4

$`(3,4]`
[1] 4 3 3 3

$`(4,5]`
[1] 2 1 1 1

> # determine maximum in interval
> sapply(s.split, max, na.rm=TRUE)
(0,1] (1,2] (2,3] (3,4] (4,5] 
    2     3     6     4     2 
> # mode (maximum # of occurrences)
> sapply(s.split, function(x) {
+     .tab <- table(x)
+     as.numeric(names(.tab)[which.max(.tab)])
+ })
(0,1] (1,2] (2,3] (3,4] (4,5] 
    1     3     4     3     1 
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
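
The question also asks for windows defined by row count (window=4) and for ties such as (1,1,2,2) to return 1.5 rather than NA. The following is only a minimal sketch of one way to do that, not something posted in the thread: the helper names window_id and tie_mode are hypothetical, and it assumes the rows are already in time order for a single subject. Note that which.max(), as used above, returns the first (smallest) value when counts tie, whereas this sketch averages the tied values.

```r
# Sample data from the question: time in column 1, code in column 2.
sample <- cbind((1:20) * 0.25,
                c(1,1,1,2,2,3,3,3,4,5,6,4,4,3,3,3,2,1,1,1))

# Hypothetical helper: label each of n rows with the index of its
# adjacent, non-overlapping window of `width` rows (1,1,1,1,2,2,2,2,...).
window_id <- function(n, width) (seq_len(n) - 1) %/% width + 1

# Hypothetical helper: most frequent value; if several values tie
# for most frequent, return their mean instead of picking one.
tie_mode <- function(x) {
  tab <- table(x)
  mean(as.numeric(names(tab)[tab == max(tab)]))
}

win      <- window_id(nrow(sample), width = 4)   # 4 rows = 1 s
win_max  <- tapply(sample[, 2], win, max)
win_mode <- tapply(sample[, 2], win, tie_mode)

win_max    # 2 3 6 4 2
win_mode   # 1 3 4 3 1
```

Changing width to 20 or 40 would give the 5s and 10s windows. The last window holds (2,1,1,1), so its mode is 1, in agreement with the sapply() output above. For the full long-format file, the window index would presumably need to be computed within each subject (for example, by splitting on a subject identifier first); that column is not shown in the thread, so it is left out here.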