thr3ads.net - similar to: "aggregating using several functions"

2009 Jul 28

aggregating strings

I am currently summarising a data set by collapsing data based on common identifiers in a column. I am using the 'aggregate' function to summarise numeric columns, i.e. "aggregate(dat[,3], list(dat$gene), mean)". I also wish to summarise text columns e.g. by concatenating values in a comma separated list, but the aggregate function can only return scalar values and so something
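A minimal sketch of one possible approach, not taken from the thread: aggregate() only needs one value per group, and a comma-joined string built with paste(..., collapse = ",") is a single value. The contents of `dat` and the column names `score` and `tag` are assumptions made up for illustration; only `dat` and `gene` appear in the post.

```r
# Hypothetical data: one numeric and one text column per gene.
dat <- data.frame(
  gene  = c("g1", "g1", "g2"),
  score = c(1, 3, 5),
  tag   = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Numeric column: summarise with the mean, as in the post.
num <- aggregate(dat["score"], by = list(gene = dat$gene), FUN = mean)

# Text column: collapse each group's values into one comma-separated string.
txt <- aggregate(dat["tag"], by = list(gene = dat$gene),
                 FUN = function(v) paste(v, collapse = ","))

# Combine both summaries on the grouping column.
res <- merge(num, txt, by = "gene")
print(res)
```

Running the two summaries separately and merging them keeps each column's summary function independent of the other.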

2009 Feb 27

testing two-factor anova effects using model comparison approach with lm() and anova()

I wonder if someone could explain the behavior of the anova() and lm() functions in the following situation: I have a standard 3x2 factorial design, factorA has 3 levels, factorB has 2 levels, they are fully crossed. I have a dependent variable DV. Of course I can do the following to get the usual anova table: > anova(lm(DV~factorA+factorB+factorA:factorB)) Analysis of Variance Table

2006 Apr 09

(IT WAS) Aggregating an its series

Just strip off the hours component of the dates, then take a subset of the data where the hour is <= 12. I did not execute this, so you might need to change it a bit:

    hours <- as.integer(format(dates(base), "%H"))
    new.data <- base[hours <= 12, ]
    aggregate(new.data, by = list(as.factor(format(dates(new.data), "%Y%m%d"))), mean, na.rm = TRUE)

2006 Apr 28

aggregating columns in a data frame in different ways

I would like to use aggregate() to combine statistics for several days in a data frame. My data frame looks similar to this:

  date       type count value
1 2006-04-01 A    10    99.6
2 2006-04-01 B    4     33.2
3 2006-04-02 A    22    43.2
4 2006-04-02 B    8     44.9
5 2006-04-03 A    12    12.4
6 2006-04-03 B    14    18.5

('date' is a factor, and my

2008 Mar 25

Behaviour of interactions in glm

Dear All, I'm struggling a little with the behaviour of R with GLM interactions. In particular, I have a dataset with two factors, call them factor A and factor B, where I would like to fit a GLM that is factor A + (grouped factor A):factor B. To try to isolate this, I've ignored the original "factor A" part, as I have this as a separate column in my data. So, it

2010 Aug 01

aggregating a daily zoo object to a weekly zoo object

Dear R People: I'm trying to convert a daily zoo object to a weekly zoo object:

    xdate <- seq(as.Date("2002-01-01"), as.Date("2010-07-10"), by = "day")
    library(zoo)
    length(xdate)
    xt <- zoo(rnorm(3113), order = xdate)
    xdat2 <- seq(index(xt)[1], index(xt)[3113], by = "week")
    xt.w <- aggregate(xt, by = xdat2, mean)
    Error: length(time(x)) ==

2008 Jul 23

Aggregating zoo object with NAs in multiple column

I would like to run an aggregation on a zoo object that has multiple series in it, with one or more series having NA values. The problem is that by default the aggregate function will produce an NA value in each aggregated period that contains an NA. For instance, if I run aggregate(x, as.yearmon(index(x)), mean) on the example object "x" which is printed below, I will just get a bunch

2009 Aug 18

aggregating values at discreet irregular time intervals into hourly values

Hello R users, I'm a newbie to R (and programming software at large) and I need some help summing event data recorded at discrete, irregular time intervals into an hourly frequency. Here is an example of my time series frame (an irregular time-series object, irts in the tseries package):

time                     value
2008-12-19 19:11:03 GMT  1
2008-12-19 19:12:00 GMT  0
2008-12-19

2010 Aug 18

Different way of aggregating

Hi. Usually "aggregate" is used to calculate things such as the sum of all data on the first day, the sum on the next day, and so on. But how can I calculate the mean of the first hour of all days, the mean of the second hour of all days, and so on? Most examples: today at 1am + today at 2am + today at 3am +.... -> sum today; tomorrow at 1am + tomorrow at
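A rough sketch of one way to group by hour of day instead of by date, assuming the timestamps are POSIXct values. The names `times`, `value`, and `hourly_mean`, and the sample data, are made up for illustration and do not come from the thread.

```r
# Hypothetical data: two days of hourly observations.
times <- seq(as.POSIXct("2010-08-01 00:00", tz = "UTC"),
             by = "hour", length.out = 48)
value <- c(1:24, 1:24 + 10)

# Grouping key: the hour of day ("00" .. "23"), ignoring the date.
hour <- format(times, "%H")

# Mean of each hour of day, taken across all days.
hourly_mean <- aggregate(value, by = list(hour = hour), FUN = mean)
print(head(hourly_mean))
```

The only change from the usual per-day aggregation is the grouping key: formatting with "%H" instead of a date format groups every 1am together, every 2am together, and so on.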

2008 Sep 08

How to preserve date format while aggregating

Hi, I have a data frame in which some subjects appear in more than one row. I want to extract the subject rows which have the minimum date per subject. I tried the following aggregate function: attach(dataframe.xy) aggregate(Date,list(SubjectID),min) Unfortunately, the format of the Date column changes to numeric when I apply this function. How can I preserve the date format? Thanks
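One possible way around the class loss, sketched under assumptions: rather than aggregating the dates, flag the rows whose date equals the per-subject minimum and subset the data frame, so the Date column is never rebuilt. The data frame `visits` and its contents are hypothetical; `SubjectID` and `Date` follow the post.

```r
# Hypothetical data: some subjects appear in more than one row.
visits <- data.frame(
  SubjectID = c("s1", "s1", "s2", "s2", "s3"),
  Date = as.Date(c("2008-03-01", "2008-01-15", "2008-02-10",
                   "2008-02-20", "2008-05-05"))
)

# Per-row copy of each subject's earliest date, computed on the numeric value.
min_per_subject <- ave(as.numeric(visits$Date), visits$SubjectID, FUN = min)

# Keep the rows that hold the earliest date; row subsetting keeps the Date class.
first_rows <- visits[as.numeric(visits$Date) == min_per_subject, ]
print(first_rows)
```

This also returns the whole row for each subject, which is what the question asks for, and it keeps ties if a subject has several rows on the earliest date.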

2011 Aug 05

Aggregating data

I aggregated my data: aggresults <-aggregate(results, by=list(results$a, results$b, results$c), FUN=mean, na.rm=TRUE) results has about 8000 lines of data, and aggresults has about 80 lines. I would like to create a separate variable for each of the 80 aggregates, each containing the 100 lines that were aggregated. I would also like to create plots for each of those 80 datasets. Is
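A small sketch of how the rows behind each aggregate might be kept together, assuming base R's split() is acceptable here. The sample contents of `results` and the column `value` are invented for illustration; the grouping columns `a`, `b`, and `c` follow the post.

```r
# Hypothetical stand-in for the 8000-line 'results' data frame.
results <- data.frame(
  a = c(1, 1, 2, 2, 2),
  b = c("x", "x", "y", "y", "y"),
  c = c("p", "p", "q", "q", "r"),
  value = c(10, 20, 30, 50, 70)
)

# One data frame per observed combination of a, b and c;
# drop = TRUE discards combinations that never occur.
groups <- split(results, list(results$a, results$b, results$c), drop = TRUE)

# Each list element holds the rows that went into one aggregate.
print(names(groups))
print(sapply(groups, nrow))
```

A named list is usually easier to work with than 80 separate variables: looping over names(groups) and calling a plotting function on each element would give one plot per group.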

2009 Apr 10

Problem with bargraph.CI in Sciplot package

Hi there, I wonder if anyone can help me. I'm trying to use bargraph.CI in the Sciplot package when there is a missing combination of the factor levels. Unfortunately the standard errors on the plot do not appear to be correct. Consider an analysis consisting of two factors A and B. When all factor level combinations are present all appears fine: library(sciplot) #all data

2007 Apr 13

Two basic data manipulation questions (counting and aggregating)

Dear R users, I have two basic data manipulation questions that I can't resolve. My data is a data frame which looks like the following:

id    type
10002 "7"
10061 "1"
10061 "1"
10061 "4"
10065 "7"
10114 "1"
10114 "1"
10114 "4"
10136 "7"
10136 "2"
10136 "2"

First, I

2012 Feb 28

aggregating specific parts in zoo index column to perform sliding average

Here's my code: http://pastebin.com/0yRxEVtm The important parts are uncommented and should be easy to find using the link above. For the following line of code, I plan on looking for a way to offset it up 7 rows so that the 15 minute timestamp would be considered the "median" of the subset being averaged to find the mean: avgCool = aggregate(intCool, trunc(time(intCool),

2008 Oct 20

aggregating along bins and bin-quantiles

Dear all, I would like to aggregate a data frame (consisting of 2 columns: one for the bins, say factors, and one for the values) along bins and quantiles within the bins. I have tried aggregate(data.frame$values, list(bin = data.frame$bin, Quantile = cut2(data.frame$bin, g = 10)), sum) but then the quantiles apply to the population as a whole and not the individual bins. Upon this

2002 Nov 06

Aggregating a List

Hi all, There must be a really obvious R solution to this, but I can't figure out how to aggregate a list. For instance, if I read.table the following from a file:

  Val1 Val2
A 3    4
A 5    6
B 4    4

I would like to take the mean (or median) across any/all rows of type "A" to end up with the structure:

  Val1 Val2
A 4    5
B 4    4

in this case. How would I go about doing that w/o doing a
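A minimal sketch of how this could be done with aggregate(), assuming the row labels are read into an ordinary column. The column name `type` and the object names `df` and `res` are assumptions for illustration; the values are the ones from the post.

```r
# The example from the post, with the row label held in a 'type' column.
df <- data.frame(
  type = c("A", "A", "B"),
  Val1 = c(3, 5, 4),
  Val2 = c(4, 6, 4)
)

# Mean of every value column within each type; swap in median if preferred.
res <- aggregate(df[c("Val1", "Val2")], by = list(type = df$type), FUN = mean)
print(res)
```

For the sample data this yields one row per type, with A averaged to 4 and 5 and B left at 4 and 4, matching the structure asked for in the post.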

2010 Feb 20

aggregating using 'with' function

Hi All, I am interested in aggregating a data frame based on 2 categories--mean effect size (r) for each 'id's' 'mod1'. The 'with' function works well when aggregating on one category (e.g., based on 'id' below) but doesn't work if I try 2 categories. How can this be accomplished?

    # sample data
    id<-c(1,1,1,rep(4:12))
    n<-c(10,20,13,22,28,12,12,36,19,12,

2006 Apr 07

Aggregating an its series

I'm using a very long irregular time-series of air temperature and relative humidity of this kind (this is an extract only):

    its.format("%Y%d%m %X")
    > base
                         T  H
    20020601 12.00.00 27.1 47
    20020601 15.00.00 29.1 39
    20020601 18.00.00 27.4 39
    20020601 21.00.00 24.0 40
    20020602 0.00.00  22.0 73
    20020602 3.00.00  19.2 49
    20020602 6.00.00  19.5 74
    20020602

2017 Nov 14

Aggregating Data

R-Help, please disregard, as I figured something out, unless there is a more elegant way ...

    myData.sum <- aggregate(x = myData[c("s72","s79","s82","s83","s116","s119")], FUN = sum, by = list(Group.date = myData$shortdate))
    > head(myData.sum)
      Group.date s72 s79 s82 s83 s116 s119
    1

2011 Oct 17

Aggregating Survey responses for weighting

I have about 27,000 survey responses from across about 150 Bus Routes, each with potentially 100 stops. I've recorded the total Ons and Offs for each stop on each bus run, as well as the stop pair each survey response corresponds to. I wish to create weights based on the On and Off stop for each line and direction. This will create a very sparse "half table" (observations by
