thr3ads.net - R help - [R] Calculate daily means from 5-minute interval data [Sep 2021]

Rich Shepard

2021-Sep-02 18:16 UTC

[R] Calculate daily means from 5-minute interval data

On Mon, 30 Aug 2021, Richard O'Keefe wrote:
>> x <- rnorm(samples.per.day * 365)
>> length(x)
> [1] 105120
>
> Reshape the fake data into a matrix where each row represents one
> 24-hour period.
>
>> m <- matrix(x, ncol=samples.per.day, byrow=TRUE)

Richard,

Now I understand the need to keep the date and time as a single datetime
column; separately, dplyr's summarize() provides daily means (too many data
points to plot over 3-5 years). I reformatted the data to provide a
sampledatetime column and a values column.

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either PST or PDT.)
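
A minimal sketch for checking this, assuming a UTC time zone purely so the result is reproducible (the timestamp is the one from the example above). As far as base R documents it, a POSIXlt value is stored as a list of broken-down fields, while printing and formatting still show a single date-time:

```r
# tz = "UTC" is an assumption made only for reproducibility
x <- as.POSIXlt("2016-03-03 12:00", tz = "UTC", format = "%Y-%m-%d %H:%M")

# Formatting still yields one combined date-time string
format(x, "%Y-%m-%d %H:%M")

# The broken-down fields are available as list components
x$year + 1900   # years are stored as an offset from 1900
x$mon + 1       # months are stored as 0-11
x$mday          # day of the month
x$hour          # hour of the day
x$min           # minute of the hour
```

So the elements are separate internally, but they are reached with `$` rather than appearing as separate columns or space-separated text.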

> Now we can summarise the rows any way we want.
> The basic tool here is ?apply.
> ?rowMeans is said to be faster than using apply to calculate means,
> so we'll use that.  There is no *rowSds so we have to use apply
> for the standard deviation.  I use ?head because I don't want to
> post tens of thousands of meaningless numbers.

If I create a matrix using the above syntax the resulting rows contain all
recorded values for a specific day. What would be the syntax to collect all
values for each month?

This would result in 12 rows per year; the periods of record for the five
variables available from that gauge station vary in length.
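
One possible approach, sketched here in base R on fake data: because months differ in length, a fixed-width matrix does not fit, so the values can instead be grouped by a year-month key. The names `stamps`, `values`, `ym`, and `by_month` are made up for illustration, and the 5-minute spacing and UTC time zone are assumptions.

```r
# Fake data (an assumption for illustration): one value every 5 minutes
# for January and February 2016, in UTC
stamps <- seq(as.POSIXct("2016-01-01 00:00", tz = "UTC"),
              as.POSIXct("2016-02-29 23:55", tz = "UTC"),
              by = "5 min")
set.seed(1)
values <- rnorm(length(stamps))

# Year-month key such as "2016-01"; every sample in a month shares a key
ym <- format(stamps, "%Y-%m")

# A list with one element per month holding all values for that month
by_month <- split(values, ym)

# Monthly means and standard deviations, one entry per year-month
monthly_mean <- tapply(values, ym, mean, na.rm = TRUE)
monthly_sd   <- tapply(values, ym, sd, na.rm = TRUE)
```

Grouping on the key rather than on row position should also cope with variables whose periods of record differ, since each series carries its own timestamps.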

Regards,

Rich

Rich Shepard

2021-Sep-02 18:42 UTC

On Thu, 2 Sep 2021, Rich Shepard wrote:
> If I correctly understand the output of as.POSIXlt each date and time
> element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
> 03 12 00 (I've not read how the elements are separated). (The TZ is not
> important because all data are either PST or PDT.)

Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE,
                      sep = ',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
> head(discharge)
             sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate monthly
means and sds.

If dplyr::summarize() will do so, please show me how to modify this command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
         )
because it produces daily means, not monthly means.
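
A hedged sketch of one way the command might be adapted: derive a year-month key from the timestamp and group on that instead of on sampdate. The data frame below is a small fake stand-in for the real file, the column names `yr_mon`, `mean_cfs`, and `sd_cfs` are hypothetical, and the conversion to POSIXct with a UTC time zone is an assumption (dplyr generally works more smoothly with POSIXct than with POSIXlt columns).

```r
library(dplyr)

# Fake stand-in for the discharge data (an assumption for illustration)
discharge <- data.frame(
  sampdate = c("2016-03-03 12:00", "2016-03-03 12:10",
               "2016-04-01 00:00", "2016-04-01 00:10"),
  cfs = c(149000, 151000, 100000, 120000),
  stringsAsFactors = FALSE
)

# POSIXct stores each timestamp as a single number, unlike POSIXlt
discharge$sampdate <- as.POSIXct(discharge$sampdate, tz = "UTC",
                                 format = "%Y-%m-%d %H:%M")

disc_monthly <- discharge %>%
  mutate(yr_mon = format(sampdate, "%Y-%m")) %>%   # e.g. "2016-03"
  group_by(yr_mon) %>%
  summarize(mean_cfs = mean(cfs, na.rm = TRUE),
            sd_cfs   = sd(cfs, na.rm = TRUE))
```

Grouping on `format(sampdate, "%Y-%m-%d")` instead would give one row per day, so the same pattern covers both daily and monthly summaries.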

TIA,

Rich
