Displaying 20 results from an estimated 30000 matches similar to: "aggregate data.frame based on column class"
2011 Jul 11
2
best way to aggregate / rearrange data.frame with different data types
Hi,
I have a data.frame that looks like this:
Subject <- c(rep(1,4), rep(2,4), rep(3,4))
y <- rnorm(12, 3, 2)
gender <- c(rep("w",4), rep("m",4), rep("w",4))
comment <- c(rep("comment A",4), rep("comment B",4), rep("comment C",4))
data <- data.frame(Subject,y,gender,comment)
data
Subject y gender
2010 Jan 04
4
function in aggregate applied to specific columns only
I want to use aggregate with the mean function on specific columns
gender <- factor(c("m", "m", "f", "f", "m"))
student <- c(0001, 0002, 0003, 0003, 0001)
score <- c(50, 60, 70, 65, 60)
basicSub <- data.frame(student, gender, score)
basicSubMean <- aggregate(basicSub, by=list(basicSub$student), FUN=mean, na.rm=TRUE)
This
2012 Sep 25
1
mean-aggregate – but use unique for factor variables
Hi,
I have a data.frame which I want to aggregate.
There are some grouping variables and some continuous variables for which I would like to have the mean.
However there are also some factor-variables in the data-frame that are not grouping variables and I actually would like to aggregate these variables with the unique() function.
Is that possible with the standard aggregate-function?
If I
2012 Dec 25
5
aggregate / collapse big data frame efficiently
Hi,
I need to aggregate rows of a data.frame by computing the mean for rows with the same factor-level on one factor-variable;
here is the sample code:
x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52))
aggregate(x, list(x[,1]), mean)
Now my problem is that the actual data set is much bigger (120 rows and approximately 100,000 columns), and it takes very very long
2010 Feb 05
2
sum a particular column by group
Dear all,
I have a table like this:
> eds
  R.ID Region Gender  Agegr  Time nvisits
1    1      A      F 60--64  1:00       1
2    2      O      F 55--59  1:20       1
3    3      O      F 55--59  3:45       3
4    4      S      M 60--64  1:10       3
5    5      W      F 55--59 12:30       1
6
2011 Aug 02
2
Help with aggregate syntax for a multi-column function please.
Dear R-experts:
I am using a function called AUC whose arguments are data, time, id, and
dv.
data is the name of the dataframe,
time is the independent variable column name,
id is the subject id and
dv is the dependent variable.
The function computes area under the curve by trapezoidal rule, for each
subject id.
I would like to embed this in aggregate to further subset by each
2010 Feb 11
1
aggregate function / custom column names?
This question is about column names returned by the aggregate function. Consider the following example
df <- data.frame(
id = c(rep('11',30),rep('22',30),rep('33',30)),
value = c(rnorm(30,2,0.5), rnorm(30,3,0.5), rnorm(30,6,0.5))
)
aggregate(df[,c("value"),drop=FALSE], by=list(id=df$id), max)
output:
  id    value
1 11 2.693528
2 22 3.868400
3 33
2006 Feb 24
1
(Newbie) Aggregate for NA values
Folks,
Sorry if this question has been answered before or is obvious (or
worse, statistically "bad"). I don't understand what was said in one
of the search results that seems somewhat related.
I use aggregate to get a quick summary of the data. Part of what I am
looking for in the summary is, how much influence might the NA's have
had, if they were included, and is excluding
2006 Dec 28
2
Aggregation using list with Hmisc summarize function
Hi All,
I'm using the Hmisc summarize function and used list instead of llist to
provide the by variables. It generated an error message. Is this a bug,
or do I misunderstand how Hmisc works with lists? The program below
demonstrates the error message.
Thanks,
Bob
x<-1:8
group <- c(1,1,1,1,2,2,2,2)
gender<- c(1,2,1,2,1,2,1,2)
mydata<-data.frame(x,group,gender)
2007 Dec 16
2
question about the aggregate function with respect to order of levels of grouping elements
Hi,
I am using aggregate() to add up groups of data according to year and month.
It seems that the function aggregate() automatically sorts the levels of
factors of the grouping elements, even if the order of the levels of factors
is supplied. I am wondering if this is a bug, or if I missed something
important. Below is an example that shows what I mean. Does anyone know if
this is just the way
2006 Oct 01
3
aggregate function with 'NA'
Dear r-help reader,
I have some problems with the aggregate function.
My data frame looks like
>frame
  Day Time V1 V2
1   M    0  3 NA
2   M    0  4 NA
3   M    0  5  2
4   M    1 NA  4
5   M    1 10  6
6   T    0  4 45
7   T    1  4  3
8   T    1  3  2
9   T    1  6  1
I used the aggregate function to obtain the mean in V1 and V2 over the
grouping variable
Time and Day
2010 Aug 13
2
Lattice xyplots plots with multiple lines per cell
Hello,
I need to plot the means of some outcome for two groups (control vs
intervention) over time (discrete) on the same plot, for various subsets
such as gender and grade level. What I have been doing is creating all
possible subsets first, using the aggregate function to create the means
over time, then plotting the means over time (as a simple line plot with
both control & intervention
2008 Jun 17
1
resultant column names from reshape::cast, with a fun.aggregate vector
try this:
scores.melt = data.frame(grade = floor(runif(100, 1,10)), variable =
'score', value = rnorm(100));
cast(scores.melt, grade ~ variable, fun.aggregate = c(mean, length))
it has the nice column names of:
  grade score_mean score_length
1     1 0.08788535            8
2     2 0.16720313           15
3     3 0.41046299            7
4     4 0.13928356           13
...
but
2012 Oct 14
6
transforming a .csv file column names as per a particular column rows using R code
Hello all,
I have a .csv file like below.
Tool,Step_Number,Data1,Data2... etc up to 100 columns.
A,1,0,1
A,2,3,1
A,3,2,1
.
.
B,1,3,2
B,2,1,2
B,3,3,2
.
.
...... and so on up to 50 rows
where the column "*Tool*" has distinct steps in the second column
"*Step_Number*", but both have the same entries in the Step_Number column.
I want the output like below.
2010 Jul 16
2
aggregate(...) with multiple functions
hi all - i'm just wondering what sort of code people write to
essentially perform an aggregate call, but with different functions
being applied to the various columns.
for example, if i have a data frame x and would like to marginalize by
a factor f for the rows, but apply mean() to col1 and median() to
col2.
if i wanted to apply mean() to both columns, i would call:
aggregate(x, list(f),
2009 Mar 07
2
ttest in R
Dear list,
i am a biologist who needs to do some ttest between disease and non disease,
sex, genotype and the serum levels of proteins on a large number of
individuals. i have been using excel for a long time but it is very tedious
and time consuming. i am posting the data below and ask your help in
generating a code to get this analysis done in R. thanks
gender disease genotype data
M N CC
2013 Mar 11
2
aggregate(), tapply(): Why is the order of the grouping variables not kept?
Dear expeRts,
The question is rather simple: Why does aggregate (or similarly tapply()) not keep the order of the grouping variable(s)?
Here is an example:
x <- data.frame(group = rep(LETTERS[1:2], each=10),
year = rep(rep(2001:2005, each=2), 2),
value = rep(1:10, each=2))
## => sorted according to group, then year
aggregate(value ~ group + year, data=x,
2010 Jan 18
2
column selection for aggregate()
Hi everybody!
I'm working on R today so I have a lot of questions (you may have
noticed that it's the 3rd email today). I'm new on R, so please excuse
the "spam"!
I have a dataset "ssfa" with many rows and the column names are:
> names(ssfa)
[1] "SPECSHOR" "BONE" "TO_POS" "MEASUREM" "FACETTE"
2010 Oct 07
3
aggregate text column by a few rows
Hi, R function aggregate can only take summary stats functions, can I
aggregate text columns? For example, for the dataframe below,
> a <- rbind(data.frame(id=1, name='Tom',
hobby='fishing'),data.frame(id=1, name='Tom',
hobby='reading'),data.frame(id=2, name='Mary',
hobby='reading'),data.frame(id=3, name='John',
2009 May 24
2
Assigning variable names from one object to another object
Hello
I have 2 datasets, say Data1 and Data2, which have different dimensions.
Data1:
120 rows and 6 columns (Varname, Vartype, Labels, Description, ....)
The column Varname has 120 rows, which hold variable names such as id, age,
gender, ..... and so on
Data2:
12528 rows and 120 columns
The column names in this case are V1, V2, ......... V120 (which are default
names in R when we say head=F in read.csv)