On Thu, 2 Sep 2021, Rich Shepard wrote:

> If I correctly understand the output of as.POSIXlt each date and time
> element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
> 03 12 00 (I've not read how the elements are separated). (The TZ is not
> important because all data are either PST or PDT.)

Using this script:

discharge <- read.csv('../data/water/discharge.dat', header = TRUE,
                      sep = ',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                 format = '%Y-%m-%d %H:%M',
                                 optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:

> head(discharge)
             sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate monthly
means and sds.

If dplyr::summarize() will do so please show me how to modify this command:

disc_monthly <- ( discharge
    %>% group_by(sampdate)
    %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
    )

because it produces daily means, not monthly means.

TIA,

Rich
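On the as.POSIXlt question above, here is a short sketch of what such an object holds. The two timestamps are made up, and tz = "UTC" is an assumption chosen only so the result is reproducible. The date and time parts are stored as named list components rather than as separated text, and format() can turn them into a year-month label.

```r
# Two illustrative timestamps (made up for this sketch, not from discharge.dat)
x <- as.POSIXlt(c("2016-03-03 12:00", "2016-04-01 00:10"),
                tz = "UTC", format = "%Y-%m-%d %H:%M")

# POSIXlt is a list of components; year counts from 1900 and mon from 0
x$year + 1900   # calendar years
x$mon + 1       # calendar months
x$hour          # hours
x$min           # minutes

# A year-month label that can serve as a grouping key
ym <- format(x, "%Y-%m")
ym
```

Grouping on a label like ym, instead of on the full timestamp, is what collapses the rows to one group per month.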
Andrew Simmons
2021-Sep-02 19:10 UTC
[R] Calculate daily means from 5-minute interval data
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
# one data frame per year-month, e.g. "2016-03"
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
# one row per month, one column per entry in cols
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm = TRUE)))

On Thu, Sep 2, 2021 at 3:01 PM Rich Shepard <rshepard at appl-ecosys.com> wrote:

> On Thu, 2 Sep 2021, Rich Shepard wrote:
>
> > If I correctly understand the output of as.POSIXlt each date and time
> > element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
> > 03 12 00 (I've not read how the elements are separated). (The TZ is not
> > important because all data are either PST or PDT.)
>
> Using this script:
> discharge <- read.csv('../data/water/discharge.dat', header = TRUE,
>                       sep = ',', stringsAsFactors = FALSE)
> discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
>                                  format = '%Y-%m-%d %H:%M',
>                                  optional = 'logical')
> discharge$cfs <- as.numeric(discharge$cfs, length = 6)
>
> I get this result:
> > head(discharge)
>              sampdate    cfs
> 1 2016-03-03 12:00:00 149000
> 2 2016-03-03 12:10:00 150000
> 3 2016-03-03 12:20:00 151000
> 4 2016-03-03 12:30:00 156000
> 5 2016-03-03 12:40:00 154000
> 6 2016-03-03 12:50:00 150000
>
> I'm completely open to suggestions on using this output to calculate
> monthly means and sds.
>
> If dplyr::summarize() will do so please show me how to modify this command:
> disc_monthly <- ( discharge
>     %>% group_by(sampdate)
>     %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
>     )
> because it produces daily means, not monthly means.
>
> TIA,
>
> Rich
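For the dplyr part of the question, a minimal sketch, assuming dplyr is installed. The data frame is made up for illustration, since the real discharge.dat is not shown in the thread, and the column names month, mean_cfs and sd_cfs are my own. sampdate is stored as POSIXct with tz = "UTC" here; both are assumptions on my part, the first because list-based POSIXlt columns have not always worked inside dplyr verbs. The only real change from the original command is grouping by a year-month string instead of by the full timestamp.

```r
library(dplyr)

# Made-up data in the same shape as head(discharge)
discharge <- data.frame(
  sampdate = c("2016-03-03 12:00", "2016-03-03 12:10",
               "2016-04-01 00:00", "2016-04-01 00:10"),
  cfs = c(149000, 151000, 100000, 104000),
  stringsAsFactors = FALSE
)

# Assumption: POSIXct rather than POSIXlt, and UTC for reproducibility
discharge$sampdate <- as.POSIXct(discharge$sampdate, tz = "UTC",
                                 format = "%Y-%m-%d %H:%M")

disc_monthly <- discharge %>%
  mutate(month = format(sampdate, "%Y-%m")) %>%   # e.g. "2016-03"
  group_by(month) %>%
  summarize(mean_cfs = mean(cfs, na.rm = TRUE),
            sd_cfs   = sd(cfs, na.rm = TRUE))

disc_monthly
```

Using "%Y-%m" rather than "%m" alone keeps the same calendar month in different years in separate groups, which matches the key used in the split() suggestion.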