I would like to use R to determine the number of extreme events in a high-frequency (20-minute) dataset of wave heights that spans 25 years (657,432 data points).

I require the number, spacing and duration of the extreme events as output.

I have briefly used the clusters function in the evd package.

Can anyone suggest a more appropriate package for such a large dataset?

Thanks,

Doug

--
View this message in context: http://r.789695.n4.nabble.com/Clustering-tp3017056p3017056.html
Sent from the R help mailing list archive at Nabble.com.
I have worked with seismic data measured at 100 Hz, and had no trouble locating events in "long" records (several times the size of your dataset). Is 20 minutes high frequency? What kind of waves are these? What is the wavelength? Some details would help.

albyn

On Thu, Oct 28, 2010 at 05:00:10AM -0700, dpender wrote:
> I am looking to use R in order to determine the number of extreme events for
> a high frequency (20 minutes) dataset of wave heights that spans 25 years
> (657,432) data points.
>
> I require the number, spacing and duration of the extreme events as an
> output.
>
> I have briefly used the clusters function in evd package.
>
> Can anyone suggest a more appropriate package to use for such a large
> dataset?
>
> Thanks,
>
> Doug

--
Albyn Jones
Reed College
jones at reed.edu
On Oct 28, 2010, at 8:00 AM, dpender wrote:
> I am looking to use R in order to determine the number of extreme events for
> a high frequency (20 minutes) dataset of wave heights that spans 25 years
> (657,432) data points.
>
> I require the number, spacing and duration of the extreme events as an
> output.

If you create a logical "test" vector and then use rle() on it, you may get what you want. This yields the intervals between "events" (values greater than 0.9):

    > wave <- runif(100)
    > test <- wave > 0.9
    > rle(test)
    Run Length Encoding
      lengths: int [1:11] 74 1 5 1 1 1 6 1 4 1 ...
      values : logi [1:11] FALSE TRUE FALSE TRUE FALSE TRUE ...
    > rle(test)$lengths[ !rle(test)$values ]
    [1] 74  5  1  6  4  5

You can also get the "duration" of each extreme event by not using the negation of the values, i.e. by indexing with rle(test)$values instead of !rle(test)$values. (Sorry for the double-negative.)

--
David.

> I have briefly used the clusters function in evd package.
>
> Can anyone suggest a more appropriate package to use for such a large
> dataset?

David Winsemius, MD
West Hartford, CT
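
The rle() idea above can be extended to return all three quantities asked for in the original post (number, spacing and duration of events). What follows is only a minimal sketch under stated assumptions: the toy `wave` vector, the 0.9 threshold and the variable names are made up for illustration, and an "event" is taken to be any unbroken run of samples above the threshold, which may not match the definition a wave-climate study would use.

```r
# Hypothetical toy series; in practice `wave` would hold the 20-minute wave heights.
wave <- c(0.2, 0.95, 0.97, 0.1, 0.3, 0.92, 0.4, 0.5, 0.99, 0.98, 0.96, 0.2)
threshold <- 0.9   # assumed threshold, chosen only for illustration
step_min  <- 20    # sampling interval in minutes, taken from the original post

# Run-length encode the exceedance indicator once and reuse the result
runs <- rle(wave > threshold)

# First and last sample index of every run (events and quiet periods alike)
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# Keep only the runs that lie above the threshold: these are the events
event_start <- run_start[runs$values]
event_end   <- run_end[runs$values]
event_len   <- runs$lengths[runs$values]   # duration in samples
n_events    <- length(event_len)

# Spacing: number of quiet samples between one event and the next
spacing <- event_start[-1] - head(event_end, -1) - 1

events <- data.frame(start        = event_start,
                     end          = event_end,
                     duration_min = event_len * step_min)
print(n_events)
print(events)
print(spacing * step_min)   # spacing in minutes
```

Since rle() does a single pass over the vector, a series of roughly 650,000 points should not be a problem in terms of size. Missing values would need separate handling first, because rle() treats each NA as its own run.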