Morway, Eric
2016-Jun-01 16:03 UTC
[R] Trimming time series to only include complete years
Hello Jeff, thank you very much for following up with me on this. It
definitely helped me get on my way with my analysis. It figures you're
from UC Davis (I'm guessing from your email address); I've been helped out
by them often!

-Eric

Eric Morway
Hydrologist
2730 N. Deer Run Rd.
Carson City, NV 89701
(775) 887-7668

On Mon, May 30, 2016 at 3:15 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

> Sorry, I put too many bugs (opportunities for excellence!) in this on my
> first pass to leave it alone :-(
>
> isPartialWaterYear2 <- function( d ) {
>   dtl <- as.POSIXlt( d )
>   wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
>   # any 0 in wy1 corresponds to first partial water year
>   result <- 0 == wy1
>   # if last day is not Sep 30, mark last water year as partial
>   if (  8 != dtl$mon[ length( d ) ]
>       | 30 != dtl$mday[ length( d ) ] ) {
>     result[ wy1[ length( d ) ] == wy1 ] <- TRUE
>   }
>   result
> }
>
> dat2 <- dat[ !isPartialWaterYear2( dat$Date ), ]
>
> On Sat, 28 May 2016, Jeff Newmiller wrote:
>
>> # read about POSIXlt at ?DateTimeClasses
>> # note that the "mon" element is 0-11
>> isPartialWaterYear <- function( d ) {
>>   dtl <- as.POSIXlt( dat$Date )
>>   wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
>>   (  0 == wy1 # first partial year
>>   | (  8 != dtl$mon[ nrow( dat ) ] # end partial year
>>     & 30 != dtl$mday[ nrow( dat ) ]
>>     ) & wy1[ nrow( dat ) ] == wy1
>>   )
>> }
>>
>> dat2 <- dat[ !isPartialWaterYear( dat$Date ), ]
>>
>> The above assumes that, as you said, the data are continuous at one-day
>> intervals, such that the only partial years will occur at the beginning and
>> end. The "diff" function could be used to identify irregular data within
>> the data interval if needed.
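A minimal sketch of the "diff" check mentioned above, assuming the dates
are of class Date and sorted in ascending order. The helper name
isDailyContinuous is hypothetical and does not appear in the thread.

```r
# Hypothetical helper (not from the thread): TRUE when every step between
# consecutive dates is exactly one day, i.e. no gaps and no duplicates.
isDailyContinuous <- function( d ) {
  d <- as.Date( d )
  # diff() on Date values returns a difftime in days; compare as numbers
  all( as.numeric( diff( d ) ) == 1 )
}

dat <- data.frame( Date = seq( as.Date( "2010-01-01" ),
                               as.Date( "2011-11-05" ), by = "day" ) )

full_ok   <- isDailyContinuous( dat$Date )          # complete daily series
gapped_ok <- isDailyContinuous( dat$Date[ -100 ] )  # one day removed

# positions after which a gap (or duplicate) occurs in the gapped series
gap_at <- which( as.numeric( diff( dat$Date[ -100 ] ) ) != 1 )
```

A record that fails this check would need its gaps handled before the
water-year trimming is applied, since that logic only looks at the two
ends of the series.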
>>
>> On Fri, 27 May 2016, Morway, Eric wrote:
>>
>>> In bulk processing streamflow data available from an online database, I'm
>>> wanting to trim the beginning and end of the time series so that daily
>>> data associated with incomplete "water years" (defined as extending from
>>> Oct 1st to the following September 30th) is trimmed off the beginning and
>>> end of the series.
>>>
>>> For a small reproducible example, the time series below starts on
>>> 2010-01-01 and ends on 2011-11-05. So the data between 2010-01-01 and
>>> 2010-09-30 and also between 2011-10-01 and 2011-11-05 is not associated
>>> with a complete set of data for their respective water years. With the
>>> real data, the initial date of collection is arbitrary, could be 1901 or
>>> 1938, etc. Because I'm cycling through potentially thousands of records,
>>> I need help in designing a function that is efficient.
>>>
>>> dat <- data.frame(Date=seq(as.Date("2010-01-01"),as.Date("2011-11-05"),by="day"))
>>> dat$Q <- rnorm(nrow(dat))
>>>
>>> dat$wyr <- as.numeric(format(dat$Date,"%Y"))
>>> is.nxt <- as.numeric(format(dat$Date,"%m")) %in% 1:9
>>> dat$wyr[!is.nxt] <- dat$wyr[!is.nxt] + 1
>>>
>>> function(dat) {
>>>   ...
>>>   returns a subset of dat such that dat$Date > xxxx-09-30 & dat$Date < yyyy-10-01
>>>   ...
>>> }
>>>
>>> where the years between xxxx-yyyy are "complete" (no missing days). In the
>>> example above, the returned dat would extend from 2010-10-01 to 2011-09-30.
>>>
>>> Any offered guidance is very much appreciated.
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
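
A sketch of the corrected logic wrapped in the function(dat) form asked
for in the original post; it is one way to arrange it, not a definitive
implementation. The wrapper name trimToCompleteWaterYears is hypothetical.
It assumes a column named Date of class Date, at least one row, and
continuous daily data in ascending order, as stated in the thread. The
scalar test uses || where the posted version used |.

```r
# Helper following the corrected version in the thread: TRUE for rows that
# fall in a partial water year (water year = Oct 1 through next Sep 30).
isPartialWaterYear2 <- function( d ) {
  dtl <- as.POSIXlt( d )   # "mon" is 0-11, so 9 is October, 8 is September
  n <- length( d )
  # counter that increments on every Oct 1; 0 marks a leading partial year
  wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
  result <- 0 == wy1
  # if the last day is not Sep 30, the final water year is partial too
  if ( 8 != dtl$mon[ n ] || 30 != dtl$mday[ n ] ) {
    result[ wy1[ n ] == wy1 ] <- TRUE
  }
  result
}

# Hypothetical wrapper (name not from the thread): keep complete years only
trimToCompleteWaterYears <- function( dat ) {
  dat[ !isPartialWaterYear2( dat$Date ), , drop = FALSE ]
}

# Example data from the original post
dat <- data.frame( Date = seq( as.Date( "2010-01-01" ),
                               as.Date( "2011-11-05" ), by = "day" ) )
dat$Q <- rnorm( nrow( dat ) )

dat2 <- trimToCompleteWaterYears( dat )
range( dat2$Date )   # 2010-10-01 to 2011-09-30 for this example
```

Because the work is a single cumsum over the date vector plus one logical
subset, the cost per record is linear in its length, which should suit
looping over many records.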