similar to: aggregating using several functions

Displaying 20 results from an estimated 20000 matches similar to: "aggregating using several functions"

2009 Jul 28
2
aggregating strings
I am currently summarising a data set by collapsing data based on common identifiers in a column. I am using the 'aggregate' function to summarise numeric columns, i.e. "aggregate(dat[,3], list(dat$gene), mean)". I also wish to summarise text columns e.g. by concatenating values in a comma separated list, but the aggregate function can only return scalar values and so something
2009 Feb 27
1
testing two-factor anova effects using model comparison approach with lm() and anova()
I wonder if someone could explain the behavior of the anova() and lm() functions in the following situation: I have a standard 3x2 factorial design, factorA has 3 levels, factorB has 2 levels, they are fully crossed. I have a dependent variable DV. Of course I can do the following to get the usual anova table: > anova(lm(DV~factorA+factorB+factorA:factorB)) Analysis of Variance Table
2006 Apr 09
0
(IT WAS) Aggregating an its series
Just strip off the hours component of the dates, then take a subset of the data where the hour is <= 12. I did not execute this, so you might need to change it a bit:
hours <- as.integer(format(dates(base),"%H"))
new.data <- base[hours <= 12,]
aggregate(new.data,by=list(as.factor(format(dates(new.data),"%Y%m%d"))),mean,na.rm=T)
-----Original Message-----
2006 Apr 28
3
aggregating columns in a data frame in different ways
I would like to use aggregate() to combine statistics for several days in a data frame. My data frame looks similar to this: date type count value 1 2006-04-01 A 10 99.6 2 2006-04-01 B 4 33.2 3 2006-04-02 A 22 43.2 4 2006-04-02 B 8 44.9 5 2006-04-03 A 12 12.4 6 2006-04-03 B 14 18.5 ('date' is a factor, and my
2008 Mar 25
0
Behaviour of interactions in glm
Dear All, I'm struggling a little with the behaviour of R with GLM interactions. In particular, I have a dataset with two factors - call them factor A and factor B, where I would like to fit a GLM that is factor A + (grouped factor A):factor B. To try to isolate this, I've ignored the original "factor A" part, as I have this as a separate column in my data. So, it
2010 Aug 01
1
aggregating a daily zoo object to a weekly zoo object
Dear R People: I'm trying to convert a daily zoo object to a weekly zoo object:
xdate <- seq(as.Date("2002-01-01"),as.Date("2010-07-10"),by="day")
library(zoo)
length(xdate)
xt <- zoo(rnorm(3113),order=xdate)
xdat2 <- seq(index(xt)[1],index(xt)[3113],by="week")
xt.w <- aggregate(xt,by=xdat2,mean)
Error: length(time(x)) ==
2008 Jul 23
1
Aggregating zoo object with NAs in multiple column
I would like to run an aggregation on a zoo object that has multiple series in it, with one or more series having NA values. The problem is that by default the aggregate function will produce an NA value in each aggregated period that contains an NA. For instance, if I run aggregate(x, as.yearmon(index(x)), mean) on the example object "x" which is printed below, I will just get a bunch
2009 Aug 18
1
aggregating values at discreet irregular time intervals into hourly values
Hello R users, I'm a newbie to R (and programming software at large) and I would need some help to sum up event data at discrete and irregular time intervals into an hourly frequency. Here is an example of my time series frame (irregular time-series object - irts in the tseries package): time value 2008-12-19 19:11:03 GMT 1 2008-12-19 19:12:00 GMT 0 2008-12-19
2010 Aug 18
2
Different way of aggregating
Hi, usually "aggregate" is used to calculate things such as the sum of all data on the first day, the sum on the next day, and so on. But how can I calculate the mean of the first hour of all days, the mean of the second hour of all days, and so on? Most examples do this: today at 1am + today at 2am + today at 3am +.... -> sum today tomorrow at 1am + tomorrow at
2008 Sep 08
2
How to preserve date format while aggregating
Hi, I have a data frame in which some subjects appear in more than one row. I want to extract the subject-rows which have the minimum date per subject. I tried the following aggregate function. attach(dataframe.xy) aggregate(Date,list(SubjectID),min) Unfortunately, the format of the Date column changes to numeric when I apply this function. How can I preserve the date format? Thanks
2011 Aug 05
2
Aggregating data
I aggregated my data: aggresults <-aggregate(results, by=list(results$a, results$b, results$c), FUN=mean, na.rm=TRUE) results has about 8000 lines of data, and aggresults has about 80 lines. I would like to create a separate variable for each of the 80 aggregates, each containing the 100 lines that were aggregated. I would also like to create plots for each of those 80 datasets. Is
2009 Apr 10
2
Problem with bargraph.CI in Sciplot package
Hi there, I wonder if anyone can help me. I'm trying to use bargraph.CI in the Sciplot package when there is a missing combination of the factor levels. Unfortunately the standard errors on the plot do not appear to be correct. Consider an analysis consisting of two factors A and B. When all factor level combinations are present all appears fine: library(sciplot) #all data
2007 Apr 13
2
Two basic data manipulation questions (counting and aggregating)
Dear R users, I have two basic data manipulation questions that I can't resolve. My data is a data frame which looks like the following: id type 10002 "7" 10061 "1" 10061 "1" 10061 "4" 10065 "7" 10114 "1" 10114 "1" 10114 "4" 10136 "7" 10136 "2" 10136 "2" First, I
2012 Feb 28
1
aggregating specific parts in zoo index column to perform sliding average
Here's my code: http://pastebin.com/0yRxEVtm The important parts are uncommented and should be easy to find using the link above. For the following line of code, I plan on looking for a way to offset it up 7 rows so that the 15 minute timestamp would be considered the "median" of the subset being averaged to find the mean: avgCool = aggregate(intCool, trunc(time(intCool),
2008 Oct 20
4
aggregating along bins and bin-quantiles
Dear all, I would like to aggregate a data frame (consisting of 2 columns - one for the bins, say factors, and one for the values) along bins and quantiles within the bins. I have tried aggregate(data.frame$values, list(bin = data.frame$bin,Quantile=cut2(data.frame$bin,g=10)),sum) but then the quantiles apply to the population as a whole and not the individual bins. Upon this
2002 Nov 06
1
Aggregating a List
Hi all, There must be a really obvious R solution to this, but I can't figure out how to aggregate a list. For instance, if I read.table the following from a file: Val1 Val2 A 3 4 A 5 6 B 4 4 I would like to take the mean (or median) across any/all rows of type "A" to end up with the structure: Val1 Val2 A 4 5 B 4 4 in this case. How would I go about doing that w/o doing a
2010 Feb 20
3
aggregating using 'with' function
Hi All, I am interested in aggregating a data frame based on 2 categories--mean effect size (r) for each 'id's' 'mod1'. The 'with' function works well when aggregating on one category (e.g., based on 'id' below) but doesn't work if I try 2 categories. How can this be accomplished? # sample data id<-c(1,1,1,rep(4:12)) n<-c(10,20,13,22,28,12,12,36,19,12,
2006 Apr 07
1
Aggregating an its series
I'm using a very long irregular time-series of air temperature and relative humidity of this kind (this is an extract only) its.format("% Y%d%m %X) > base T H 20020601 12.00.00 27.1 47 20020601 15.00.00 29.1 39 20020601 18.00.00 27.4 39 20020601 21.00.00 24.0 40 20020602 0.00.00 22.0 73 20020602 3.00.00 19.2 49 20020602 6.00.00 19.5 74 20020602
2017 Nov 14
0
Aggregating Data
R-Help, please disregard, as I figured something out, unless there is a more elegant way ... myData.sum <- aggregate(x = myData[c("s72","s79","s82","s83","s116","s119")], FUN = sum, by = list(Group.date = myData$shortdate)) > head(myData.sum) Group.date s72 s79 s82 s83 s116 s119 1
2011 Oct 17
0
Aggregating Survey responses for weighting
I have about 27,000 survey responses from across about 150 Bus Routes, each with potentially 100 stops. I've recorded the total Ons and Offs for each stop on each bus run, as well as the stop pair each survey response corresponds to. I wish to create weights based on the On and Off stop for each line and direction. This will create a very sparse "half table" (observations by