Displaying 20 results from an estimated 30000 matches similar to: "Aggregate issues with subset"
2008 Oct 02
2
Multiple hist(ograms) - One plot
Hello,
I am trying to plot multiple histograms with the same scales, etc., into one
plot. The commands below produce a 3-page PDF with each histogram occupying
the upper right quadrant, and they use slightly different scales on the X and Y
axes.
> s21 <- dat[dat$sc_recov=="21",]
> s21.ED <- subset(s21, select=(bbED))
> s31 <- all[all$sc_recov=="31",]
> s31.ED
2008 Dec 07
5
How to force aggregate to exclude NA ?
The aggregate function does "almost" all that I need to summarize a dataset, except that I can't specify exclusion of NAs without a little bit of hassle.
> set.seed(143)
> m <- data.frame(A=sample(LETTERS[1:5], 20, T), B=sample(LETTERS[1:10], 20, T), C=sample(c(NA, 1:4), 20, T), D=sample(c(NA,1:4), 20, T))
> m
A B C D
1 E I 1 NA
2 A C NA NA
3 D I NA 3
4 C I
2008 Sep 24
2
keep the row indexes/names when do aggregate
Hi, R-users,
If I have a data frame like this:
>x<-data.frame(g=c("g1","g2","g1","g1","g2"),v=c(1,7,3,2,8))
g v
1 g1 1
2 g2 7
3 g1 3
4 g1 2
5 g2 8
It contains two groups, g1 and g2. Now for each group I want the max v:
> aggregate(x$v,list(g=x$g),max)
g x
1 g1 3
2 g2 8
Beautiful. But what if I want to keep the row index of (g1
2010 Mar 15
2
aggregate without removing empty subset
Hi the list,
As it says in its doc, the aggregate function removes empty subsets. Is
it possible to NOT remove empty subsets?
--- 8< -------
m <- matrix(1:12,4)
part <- factor(c("A","B","A","B"),levels=c("A","B","C"))
aggregate(m,list(part),mean)
### I get:
# Group.1 V1 V2 V3
# 1 A 2 6 10
# 2 B 3 7
2009 Apr 30
1
Using 'aggregate' when dependent on row value increments
Dear all,
I have a data frame of three columns, which I have sorted by Latitude as follows:
> test2[60:80,]
Latitude Longitude Sim_1986
61948 85.25 -29.25 2.175345
61957 85.25 -28.75 8.750486
61967 85.25 -28.25 33.569305
61977 85.25 -27.75 23.702572
61988 85.25 -27.25 26.488602
62000 85.25 -26.75 23.915724
62012 85.25 -26.25 25.055082
62027
2012 Jul 03
2
Data manipulation with aggregate
Hi everyone.
I have these data :
myData = data.frame(Name = c('a', 'a', 'b', 'b'), length = c(1,2,3,4), type
= c('x','x','y','z'))
which gives me:
Name length type
1 a 1 x
2 a 2 x
3 b 3 y
4 b 4 z
I would like to group (mean) this DF using 'Name' as the grouping factor. However, I
have a
2007 Dec 16
2
question about the aggregate function with respect to order of levels of grouping elements
Hi,
I am using aggregate() to add up groups of data according to year and month.
It seems that the function aggregate() automatically sorts the levels of
factors of the grouping elements, even if the order of the levels of factors
is supplied. I am wondering if this is a bug, or if I missed something
important. Below is an example that shows what I mean. Does anyone know if
this is just the way
2008 Sep 15
4
getting data into correct format for summarizing ... reshape, aggregate, or...
I would like to reformat this data frame into something from which I can
produce some descriptive statistics. I have been playing around with
the reshape package and maybe this is not the best way to proceed. I
would like to use RiverMile and constituent as the grouping variables
to get the summary statistics:
198a 198b
mean mean
sd sd
... ...
etc. for all of these.
I have tried
2013 Jan 11
3
aggregate data.frame based on column class
Hi,
When using the aggregate function to aggregate a data.frame by one or more grouping variables, I often have the problem that I want the mean for some numeric variables but the unique value for factor variables.
So for example in this data-frame:
data <- data.frame(x = rnorm(10,1,2), group = c(rep(1,5), rep(2,5)), gender =c(rep('m',5), rep('f',5)))
aggregate(data,
2006 Oct 01
3
aggregate function with 'NA'
Dear r-help reader,
I have some problems with the aggregate function.
My data frame looks like
>frame
Day Time V1 V2
1 M 0 3 NA
2 M 0 4 NA
3 M 0 5 2
4 M 1 NA 4
5 M 1 10 6
6 T 0 4 45
7 T 1 4 3
8 T 1 3 2
9 T 1 6 1
I used the aggregate function to obtain the mean in V1 and V2 over the
grouping variables
Time and Day
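One common way to get group means despite missing values is to pass na.rm = TRUE through aggregate() to mean(). The sketch below rebuilds the data frame printed above (the name frame and its columns come from the post) and assumes the goal is per-group means that ignore NA. The object name res is introduced here only for illustration, and this is not the poster's own code.

```r
# Rebuild the data frame from the post (values copied from the printout above).
frame <- data.frame(
  Day  = c("M", "M", "M", "M", "M", "T", "T", "T", "T"),
  Time = c(0, 0, 0, 1, 1, 0, 1, 1, 1),
  V1   = c(3, 4, 5, NA, 10, 4, 4, 3, 6),
  V2   = c(NA, NA, 2, 4, 6, 45, 3, 2, 1)
)

# Extra arguments after FUN are forwarded to FUN, so na.rm = TRUE reaches
# mean() and NA values are dropped within each Day/Time group.
# 'res' is a hypothetical name used only in this sketch.
res <- aggregate(frame[, c("V1", "V2")],
                 by = list(Day = frame$Day, Time = frame$Time),
                 FUN = mean, na.rm = TRUE)
print(res)
```

Without na.rm = TRUE, any group containing an NA in a column would get NA as its mean for that column.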
2012 Sep 25
1
mean-aggregate – but use unique for factor variables
Hi,
I have a data.frame which I want to aggregate.
There are some grouping variables and some continuous variables for which I would like to have the mean.
However there are also some factor-variables in the data-frame that are not grouping variables and I actually would like to aggregate these variables with the unique() function.
Is that possible with the standard aggregate-function?
If I
2010 Jan 30
2
aggregate by factor
I have a data frame with two columns, a factor and a numeric. I want to create a data frame with the factor, its frequency, and the median of the numeric column.
> head(motifList)
events score
1 aeijm -0.25000000
2 begjm -0.25000000
3 afgjm -0.25000000
4 afhjm -0.25000000
5 aeijm -0.25000000
6 aehjm 0.08333333
To get the frequency table of events:
> motifTable <-
2013 Mar 11
2
aggregate(), tapply(): Why is the order of the grouping variables not kept?
Dear expeRts,
The question is rather simple: Why does aggregate (or similarly tapply()) not keep the order of the grouping variable(s)?
Here is an example:
x <- data.frame(group = rep(LETTERS[1:2], each=10),
year = rep(rep(2001:2005, each=2), 2),
value = rep(1:10, each=2))
## => sorted according to group, then year
aggregate(value ~ group + year, data=x,
2012 Jan 17
2
Using Aggregate() with FUN arguments, which require more than one input variable
Dear all,
I am trying to apply the aggregate() function to calculate correlations for
subsets of a dataframe. My argument x is supposed to consist of 2 numerical
vectors, which represent x and y for the cor() function.
The following error results when calling the aggregate function: Error in
FUN(X[[1L]], ...) : supply both 'x' and 'y' or a matrix-like 'x'. I think
the
2010 Jul 16
2
aggregate(...) with multiple functions
hi all - i'm just wondering what sort of code people write to
essentially perform an aggregate call, but with different functions
being applied to the various columns.
for example, if i have a data frame x and would like to marginalize by
a factor f for the rows, but apply mean() to col1 and median() to
col2.
if i wanted to apply mean() to both columns, i would call:
aggregate(x, list(f),
2009 Mar 24
1
aggregate() example fails
Hi R users and developers on Debian platforms.
I compiled R version 2.8.1 Patched (2009-03-18 r48193)
on my Ubuntu Linux distribution.
But when I ask for the aggregate example, it fails.
What am I missing?
example(aggregate)
aggrgt> ## Compute the averages for the variables in 'state.x77',grouped
aggrgt> ## according to the region (Northeast, South, North Central,West) that
2008 Feb 21
2
Problems with aggregate
Hello list,
I'm new to this list, so please forgive my ignorance. I have searched
R-help for some hints into what might be my problem, but I truly have no
idea where to go from here.
I have an object of approximately 15,000 rows and 2 columns. There are
many duplicates in the first column, all with different corresponding
values in the second column. For example (2 is duplicated):
2009 Mar 24
1
Is aggregate() function changing?
Hi R developers and Debian users:
I finally found how to work with the aggregate() function
on the latest patched version of R.
If you use this command, it fails:
aggregate(state.x77, list(Region = state.region), mean)
But if you modify it in this way, it works!:
aggregate(state.x77, list(Region = state.region), function(x) mean(x) )
Is it necessary to change the example?
What is changing in
2008 Jul 08
1
aggregate() function and na.rm = TRUE
All,
I've been using aggregate() to compute means and standard deviations at
time/treatment combinations for a longitudinal dataset, using na.rm = TRUE
for missing data.
This was working fine before, but now when I re-run some old code it isn't.
I've backtracked my steps and can't seem to find out why it was working
before but not now. In any event, below is a reproducible