Hi:
Here are two ways to do it: one with ddply() in the plyr package and
another with the data.table package.
library(plyr)
# Toy data frame:
tsdf <- data.frame(year = rep(1960:1963, c(366, rep(365, 3))),
                   jday = c(1:366, rep(1:365, 3)),
                   y = rnorm(4 * 365 + 1))
# A function to output the maximum response and the day on which it occurs.
# For use in ddply(), f() must take a data frame df and return a data frame.
f <- function(df) data.frame(max_day = df$jday[which.max(df$y)],
                             ymax = max(df$y))
ddply(tsdf, .(year), f)
# In data.table, one can pass the core of f() in as a list instead:
library(data.table)
tsdt <- data.table(tsdf, key = 'year')
tsdt[, list(max_day = jday[which.max(y)], ymax = max(y)), by = 'year']
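The original question also asks for the minimum and its day. A minimal sketch of how both could be returned in one pass, using only base R: the helper name extremes() and the seed are my own choices for illustration, and the toy data frame is rebuilt so the block runs on its own.

```r
# Toy data with the same shape as tsdf; the seed is arbitrary, for reproducibility
set.seed(1)
tsdf <- data.frame(year = rep(1960:1963, c(366, rep(365, 3))),
                   jday = c(1:366, rep(1:365, 3)),
                   y = rnorm(4 * 365 + 1))

# Hypothetical helper: one row per year holding the min, the max, and their days
extremes <- function(df) {
  i_min <- which.min(df$y)  # index of the first minimum; NAs are skipped
  i_max <- which.max(df$y)  # index of the first maximum; NAs are skipped
  data.frame(year = df$year[1],
             min_day = df$jday[i_min], ymin = df$y[i_min],
             max_day = df$jday[i_max], ymax = df$y[i_max])
}

# Split by year, summarize each piece, and stack the one-row results
res <- do.call(rbind, lapply(split(tsdf, tsdf$year), extremes))
rownames(res) <- NULL
res
```

Because which.min() and which.max() return the first matching index, tied extremes within a year report the earliest day. A year with no non-missing values would need separate handling.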
If you intend to do a lot of data summarization, these two packages,
along with reshape2 and doBy, are worth being familiar with.
HTH,
Dennis
On Mon, Jun 13, 2011 at 1:30 PM, Kara Przeczek <przeczek at unbc.ca> wrote:
> Dear All,
> I have several sets of data such as this:
>
>   year jday  avg_m3s
> 1 1960    1 4.262307
> 2 1960    2 4.242308
> 3 1960    3 4.216923
> 4 1960    4 4.185385
> 5 1960    5 4.151538
> 6 1960    6 4.133846
> ...
>
> There is a value for each day of multiple years. In this particular data
> set it goes up to 1974. I am looking to obtain the minimum and maximum values
> for each year, but also to know on which Julian day ("jday") they occurred.
> I can get the maximum value for each year with:
>
>> mx = aggregate(ddat$avg_m3s, list(Year=ddat$year), max, na.rm=T)
>> colnames(mx) <- c("year","max_daily")
>
>   year max_daily
> 1 1960  60.24615
> 2 1961  73.90000
> 3 1962  56.40000
> ...
>
>
> But I want to output the max with the corresponding day on which it
> occurred, such as:
>   year jday  avg_m3s
> 1 1960  136 60.24615
> 2 1961  129 73.90000
> 3 1962  111 56.40000
>
>
> I haven't been able to determine how to keep those ties without
> aggregating by both year *and* day, which is what happened with:
> aggregate(ddat$avg_m3s, list(Year=ddat$year, Day = ddat$jday), max, na.rm=T),
> resulting in a value output for every single day of each year.
>
> Other attempts to get both columns to output failed.
>
> Any help would be greatly appreciated!
> Kara