On Thu, 2 Sep 2021, Rich Shepard wrote:

> If I correctly understand the output of as.POSIXlt each date and time
> element is separate, so input such as 2016-03-03 12:00 would now be
> 2016 03 03 12 00 (I've not read how the elements are separated). (The TZ
> is not important because all data are either PST or PDT.)

Using this script:

discharge <- read.csv('../data/water/discharge.dat', header = TRUE,
                      sep = ',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                 format = '%Y-%m-%d %H:%M',
                                 optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:

> head(discharge)
             sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate
monthly means and sds.

If dplyr::summarize() will do so please show me how to modify this command:

disc_monthly <- ( discharge
        %>% group_by(sampdate)
        %>% summarize(exp_value = mean(cfs, na.rm = TRUE)))

because it produces daily means, not monthly means.

TIA,

Rich

Andrew Simmons
2021-Sep-02 19:10 UTC
[R] Calculate daily means from 5-minute interval data
You could use 'split' to create a list of data frames, one per month, and
then apply a function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm = TRUE)))
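
To try the split approach without the original file, here is a minimal self-contained sketch. The data values, the `month` column, the `monthly` result frame, and the use of as.POSIXct with tz = "UTC" are all assumptions made for illustration, not part of the original data or script.

```r
# Synthetic stand-in for the discharge data (values are made up).
discharge <- data.frame(
  sampdate = c("2016-03-03 12:00", "2016-03-03 12:10", "2016-03-31 23:50",
               "2016-04-01 00:00", "2016-04-01 00:10", "2016-04-15 08:30"),
  cfs = c(149000, 150000, 151000, 156000, NA, 154000),
  stringsAsFactors = FALSE
)

# Parse the timestamps. POSIXct and tz = "UTC" are assumptions here;
# the original script uses as.POSIXlt with the local time zone.
discharge$sampdate <- as.POSIXct(discharge$sampdate, tz = "UTC",
                                 format = "%Y-%m-%d %H:%M")

# Year-month label used as the grouping key (hypothetical helper column).
discharge$month <- format(discharge$sampdate, "%Y-%m")

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], discharge$month)

# One row per month, one column per variable in 'cols'.
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm = TRUE)))

# Collect the results in a single data frame for printing.
monthly <- data.frame(month = rownames(means),
                      mean_cfs = unname(means[, "cfs"]),
                      sd_cfs = unname(sds[, "cfs"]),
                      stringsAsFactors = FALSE)
print(monthly)
```

The grouping key is the only thing that changes between daily and monthly summaries: "%Y-%m" groups by month, while "%Y-%m-%d" would group by day. The same idea should carry over to dplyr by grouping on format(sampdate, "%Y-%m") instead of on sampdate itself, though converting the column to POSIXct first is probably the safer choice there.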