The small bit of script below is an example of what I'm attempting to do - find the day on which the 'center of mass' occurs. In case that is the wrong term, I'd like to know the day that essentially cuts the area under the curve in to two equal parts: set.seed(4004) Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day') hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) + seq(45,1,length.out=30)) + rnorm(30)*8) - 800 # View the example curve plot(Date, hyd, las=1) # By trial-and-error, the day on which the center of mass occurs is the 11th day: # Add up the area under the curve for the first 11 days and compare # with the last 19 days: sum(hyd[1:11]) # 3546.364 sum(hyd[12:30]) # 3947.553 # Add up the area under the curve for the first 12 days and compare # with the last 18 days: sum(hyd[1:12]) # 3875.753 sum(hyd[13:30]) # 3618.164 By day 12, the halfway point has already been passed, so the answer that would be returned would be: Date[11] # "2000-09-11" For the larger problem, it'd be handy if the proposed function could process a multi-year time series (a runoff hydrograph) and return the day of the center of mass for each year in the time series. I appreciate any pointers...Eric [[alternative HTML version deleted]]
Hi Eric, How about match( TRUE, cumsum(hyd/sum(hyd)) > .5 ) - 1 HTH, Eric On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric <emorway at usgs.gov> wrote:> The small bit of script below is an example of what I'm attempting to do - > find the day on which the 'center of mass' occurs. In case that is the > wrong term, I'd like to know the day that essentially cuts the area under > the curve in to two equal parts: > > set.seed(4004) > Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day') > hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) + > seq(45,1,length.out=30)) + rnorm(30)*8) - 800 > > # View the example curve > plot(Date, hyd, las=1) > > # By trial-and-error, the day on which the center of mass occurs is the > 11th day: > # Add up the area under the curve for the first 11 days and compare > # with the last 19 days: > > sum(hyd[1:11]) > # 3546.364 > sum(hyd[12:30]) > # 3947.553 > > # Add up the area under the curve for the first 12 days and compare > # with the last 18 days: > > sum(hyd[1:12]) > # 3875.753 > sum(hyd[13:30]) > # 3618.164 > > By day 12, the halfway point has already been passed, so the answer that > would be returned would be: > > Date[11] > # "2000-09-11" > > For the larger problem, it'd be handy if the proposed function could > process a multi-year time series (a runoff hydrograph) and return the day > of the center of mass for each year in the time series. > > I appreciate any pointers...Eric > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/ > posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Thierry Onkelinx
2017-Dec-16 16:32 UTC
[R] Finding center of mass in a hydrologic time series
Slightly faster: sum(cumsum(hyd) <= .5 * sum(hyd)) Best regards, ir. Thierry Onkelinx Statisticus / Statistician Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Kliniekstraat 25, B-1070 Brussel www.inbo.be /////////////////////////////////////////////////////////////////////////////////////////// To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey /////////////////////////////////////////////////////////////////////////////////////////// Van 14 tot en met 19 december 2017 verhuizen we uit onze vestiging in Brussel naar het Herman Teirlinckgebouw op de site Thurn & Taxis. Vanaf dan ben je welkom op het nieuwe adres: Havenlaan 88 bus 73, 1000 Brussel. /////////////////////////////////////////////////////////////////////////////////////////// 2017-12-16 14:32 GMT+01:00 Eric Berger <ericjberger at gmail.com>:> Hi Eric, > How about > > match( TRUE, cumsum(hyd/sum(hyd)) > .5 ) - 1 > > HTH, > Eric > > > On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric <emorway at usgs.gov> wrote: > >> The small bit of script below is an example of what I'm attempting to do - >> find the day on which the 'center of mass' occurs. In case that is the >> wrong term, I'd like to know the day that essentially cuts the area under >> the curve in to two equal parts: >> >> set.seed(4004) >> Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day') >> hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) + >> seq(45,1,length.out=30)) + rnorm(30)*8) - 800 >> >> # View the example curve >> plot(Date, hyd, las=1) >> >> # By trial-and-error, the day on which the center of mass occurs is the >> 11th day: >> # Add up the area under the curve for the first 11 days and compare >> # with the last 19 days: >> >> sum(hyd[1:11]) >> # 3546.364 >> sum(hyd[12:30]) >> # 3947.553 >> >> # Add up the area under the curve for the first 12 days and compare >> # with the last 18 days: >> >> sum(hyd[1:12]) >> # 3875.753 >> sum(hyd[13:30]) >> # 3618.164 >> >> By day 12, the halfway point has already been passed, so the answer that >> would be returned would be: >> >> Date[11] >> # "2000-09-11" >> >> For the larger problem, it'd be handy if the proposed function could >> process a multi-year time series (a runoff hydrograph) and return the day >> of the center of mass for each year in the time series. >> >> I appreciate any pointers...Eric >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/ >> posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Eric B's response provided just the kind of quick & simple solution I was hoping for (appears as the function com below). However, I once again failed to take advantage of the power of R and have reverted back to using a for loop for the next step of the processing. The example below (which requires the library EGRET for pulling an example dataset) works, but probably can be replaced with some version of the apply functionality? So far, I've been unable to figure out how to enter the arguments to the apply function. The idea is this: for each unique water year (variable 'wyrs' in example below) in a 27 year continuous time series of daily values, find the date of the 'center of mass', and build a vector of those dates. Thanks, -Eric M library(EGRET) StartDate <- "1990-10-01" EndDate <- "2017-09-30" siteNumber <- "10310000" QParameterCd <- "00060" Daily <- readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate) # Define 'center of mass' function com <- function(x) { match(TRUE, cumsum(x/sum(x)) > 0.5) - 1 } wyrs <- unique(Daily$waterYear) for(i in (1:length(wyrs))){ OneYr <- Daily[Daily$waterYear==wyrs[i], ] mid <- com(OneYr$Q) if(i==1){ midPts <- as.Date(OneYr$Date[mid]) } else { midPts <- c(midPts, as.Date(OneYr$Date[mid])) } } Eric Morway Research Hydrologist Nevada Water Science Center U.S. Geological Survey 2730 N. Deer Run Rd. Carson City, NV 89701 (775) 887-7668 *orcid*: 0000-0002-8553-6140 <http://orcid.org/0000-0002-8553-6140> On Sat, Dec 16, 2017 at 5:32 AM, Eric Berger <ericjberger at gmail.com> wrote:> Hi Eric, > How about > > match( TRUE, cumsum(hyd/sum(hyd)) > .5 ) - 1 > > HTH, > Eric > > > On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric <emorway at usgs.gov> wrote: > >> The small bit of script below is an example of what I'm attempting to do - >> find the day on which the 'center of mass' occurs. In case that is the >> wrong term, I'd like to know the day that essentially cuts the area under >> the curve in to two equal parts: >> >> set.seed(4004) >> Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day') >> hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) + >> seq(45,1,length.out=30)) + rnorm(30)*8) - 800 >> >> # View the example curve >> plot(Date, hyd, las=1) >> >> # By trial-and-error, the day on which the center of mass occurs is the >> 11th day: >> # Add up the area under the curve for the first 11 days and compare >> # with the last 19 days: >> >> sum(hyd[1:11]) >> # 3546.364 >> sum(hyd[12:30]) >> # 3947.553 >> >> # Add up the area under the curve for the first 12 days and compare >> # with the last 18 days: >> >> sum(hyd[1:12]) >> # 3875.753 >> sum(hyd[13:30]) >> # 3618.164 >> >> By day 12, the halfway point has already been passed, so the answer that >> would be returned would be: >> >> Date[11] >> # "2000-09-11" >> >> For the larger problem, it'd be handy if the proposed function could >> process a multi-year time series (a runoff hydrograph) and return the day >> of the center of mass for each year in the time series. >> >> I appreciate any pointers...Eric >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posti >> ng-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > >[[alternative HTML version deleted]]