I am trying to calculate quantiles of a data frame column split up by
two factors:

# Calculate the quantiles
quarts = tapply(gdf$tt, list(gdf$Runway, gdf$OnHour), FUN=quantile,
na.rm = TRUE)

This does not work:

> quarts
  04L       04R       15R       22L       22R       27        32        33L       33R
0 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL
1 NULL      Numeric,5 NULL      Numeric,5 NULL      NULL      NULL      Numeric,5 NULL
2 NULL      NULL      NULL      Numeric,5 NULL      NULL      NULL      NULL      NULL
3 NULL      NULL      NULL      NULL      NULL      NULL      NULL      Numeric,5 NULL
4 NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
5 NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
6 NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
7 NULL      Numeric,5 NULL      NULL      NULL      Numeric,5 NULL      Numeric,5 NULL
8 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL
. . .

But if I leave out either of the two factors, it does work:

> quarts = tapply(gdf$tt, list(gdf$Runway), FUN=quantile, na.rm = TRUE)
> quarts
$`04L`
  0%  25%  50%  75% 100%
   4    8    9   10   20

$`04R`
  0%  25%  50%  75% 100%
   0    9   10   11   28
. . . .

How can I get this to work?

Thanks,
Jim Rome

Hi James,

I don't know how to solve it with "tapply" (something with split, I
think), but you could use "plyr" (from Hadley Wickham).

library(plyr)

# Generate some data
set.seed(321)
myD <- data.frame(
  Place = sample(c("AWQ","DFR", "WEQ"), 10, replace=T),
  Light = sample(LETTERS[1:2], 15, replace=T),
  value=rnorm(30)
)
myD[c(3,12,29), "value"] <- NA

# data.frame to data.frame
ddply(myD, .(Place, Light), summarise, quan_value = quantile(value, na.rm=TRUE))

# data.frame to list
quant <- function(df) quantile(df$value, na.rm=TRUE)
dlply(myD, .(Place, Light), quant)

Cheers
Patrick

On 09.04.2010 03:24, James Rome wrote:
> I am trying to calculate quantiles of a data frame column split up by
> two factors:
> # Calculate the quantiles
> quarts = tapply(gdf$tt, list(gdf$Runway, gdf$OnHour), FUN=quantile,
> na.rm = TRUE)
> [...]
> How can I get this to work?
>
> Thanks,
> Jim Rome
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

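For readers who would rather stay in base R, the "something with split" idea mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: the data are made up in the same spirit as myD (with all columns drawn at length 30), and the names groups, q_list and q_mat are hypothetical, not from the thread.

```r
# Made-up data, similar in shape to the plyr example
set.seed(321)
myD <- data.frame(
  Place = sample(c("AWQ", "DFR", "WEQ"), 30, replace = TRUE),
  Light = sample(LETTERS[1:2], 30, replace = TRUE),
  value = rnorm(30)
)
myD[c(3, 12, 29), "value"] <- NA

# Split the values by both factors at once; drop = TRUE discards
# Place/Light combinations that never occur in the data
groups <- split(myD$value, list(myD$Place, myD$Light), drop = TRUE)

# One quantile vector (length 5) per group
q_list <- lapply(groups, quantile, na.rm = TRUE)

# Stack the vectors into a matrix: one row per group, one column per quantile
q_mat <- do.call(rbind, q_list)
print(q_mat)
```

The row names of q_mat have the form "Place.Light" (the default separator used by split), so the two factors can be recovered from them if a data frame is needed.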
On Thu, 8 Apr 2010, James Rome wrote:

> I am trying to calculate quantiles of a data frame column split up by
> two factors:
> # Calculate the quantiles
> quarts = tapply(gdf$tt, list(gdf$Runway, gdf$OnHour), FUN=quantile,
> na.rm = TRUE)
> This does not work:

It seems like it did work.

It returned a matrix list of the results, some of which are NULL and some
of which are numeric vectors of length 5.

Try

	str( quarts )

to get a sense of what is going on.

HTH,

Chuck

p.s. providing commented, minimal, self-contained, reproducible code (as
requested) will give you more informative answers.

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu             UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
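
To make the "matrix list" point concrete, here is a small sketch. The gdf below is a tiny hypothetical stand-in (the real data were not posted), and the names filled, cells and q_df are likewise invented for illustration. It shows how individual cells can be pulled out with [[ , and one possible way to flatten the non-empty cells into a data frame.

```r
# Hypothetical stand-in for the poster's gdf
gdf <- data.frame(
  Runway = c("04L", "04L", "04L", "04R", "04R", "04R"),
  OnHour = c(0, 0, 1, 0, 0, 0),
  tt     = c(4, 8, 9, 10, 11, 28)
)

quarts <- tapply(gdf$tt, list(gdf$Runway, gdf$OnHour),
                 FUN = quantile, na.rm = TRUE)

# A list with dimensions: each cell is either NULL (no data for that
# combination) or a numeric vector of five quantiles
str(quarts)

# Pull out one cell by its row and column names
quarts[["04L", "0"]]

# Flatten the non-empty cells into a data frame
filled <- which(!sapply(quarts, is.null))   # linear indices of filled cells
cells  <- arrayInd(filled, dim(quarts))     # matching row/column positions
q_df <- data.frame(
  Runway = rownames(quarts)[cells[, 1]],
  OnHour = colnames(quarts)[cells[, 2]],
  do.call(rbind, quarts[filled]),
  check.names = FALSE                        # keep "0%", "25%", ... as names
)
print(q_df)
```

The NULL cells in the printed matrix are simply runway/hour combinations with no observations, which is why the one-factor call looked "cleaner": with a single factor every level had data.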