thr3ads.net - R help - [R] aggregate data.frame based on column class [Jan 2013]


Martin Batholdy

2013-Jan-11 15:07 UTC

[R] aggregate data.frame based on column class

Hi,

When using the aggregate function to aggregate a data.frame by one or more
grouping variables, I often run into the problem that I want the mean for some
numeric variables but the unique value for factor variables.

So for example in this data-frame:

data <- data.frame(x = rnorm(10, 1, 2), group = c(rep(1, 5), rep(2, 5)),
                   gender = c(rep('m', 5), rep('f', 5)))
aggregate(data, by = list(data$group), FUN = mean)


I would like to have 'm' and 'f' in the third column, not NA.


I see the problem that a group might not have a unique factor level,
but is there an alternative function that at least tries to do what I am
aiming for?

That is:

"aggregate the data.frame by a list of grouping variables,
for numeric variables compute the mean,
for factor variables return the unique factor value"


Thanks!
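
For context, the NA arises because aggregate() applies FUN to every column within each group, and mean() returns NA (with a warning) when it is handed a factor. The following is a minimal sketch on made-up data; the data frame `toy` and its values are assumptions chosen so the result is reproducible, not the poster's random data.

```r
# Hypothetical toy data with fixed values (not the poster's rnorm() draw)
toy <- data.frame(x = c(1, 2, 3, 4),
                  group = c(1, 1, 2, 2),
                  gender = factor(c("m", "m", "f", "f")))

# mean() is not defined for factors: it warns and returns NA
g_mean <- suppressWarnings(mean(toy$gender))
print(is.na(g_mean))

# aggregate() calls FUN on each column per group, so x is averaged
# as expected while the gender column comes back as NA
res <- suppressWarnings(aggregate(toy, by = list(toy$group), FUN = mean))
print(res)
```

The warnings are suppressed here only to keep the output short; in interactive use they are the first hint that a non-numeric column was passed to mean().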

arun

2013-Jan-11 15:24 UTC


Hi,
Hope this is what you meant.
# data1: the data frame from your example
aggregate(. ~ group + gender, data = data1, mean)
#   group gender         x
# 1     2      f  1.750686
# 2     1      m -1.074343
A.K.
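
Placing gender on the right-hand side of the formula makes it a second grouping variable, so its label is carried through instead of being averaged. The sketch below uses hypothetical fixed data (`toy` and `toy_mixed` are assumptions, not from the thread) to show this, and also what happens when a group contains more than one gender: aggregate() then returns one row per observed group/gender combination.

```r
# Hypothetical toy data: each group has exactly one gender
toy <- data.frame(x = c(1, 2, 3, 4),
                  group = c(1, 1, 2, 2),
                  gender = c("m", "m", "f", "f"))

# gender is a grouping variable here, so only x is averaged
res <- aggregate(. ~ group + gender, data = toy, FUN = mean)
print(res)

# Same data, but group 2 now contains both "f" and "m"
toy_mixed <- transform(toy, gender = c("m", "m", "f", "m"))
res_mixed <- aggregate(. ~ group + gender, data = toy_mixed, FUN = mean)
print(res_mixed)   # one row per group/gender combination
```

This works well when the extra columns are constant within each group; when they are not, the number of output rows grows rather than producing an NA.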




______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Ista Zahn

2013-Jan-11 15:27 UTC


Please see inline.

On Fri, Jan 11, 2013 at 10:07 AM, Martin Batholdy
<batholdy at googlemail.com> wrote:
> Hi,
>
> When using the aggregate function to aggregate a data.frame by one or more
> grouping variables I often have the problem, that I want the mean for some
> numeric variables but the unique value for factor variables.
>
> So for example in this data-frame:
>
> data <- data.frame(x = rnorm(10,1,2), group = c(rep(1,5), rep(2,5)),
>                    gender = c(rep('m',5), rep('f',5)))
> aggregate(data, by=list(data$group), FUN=mean)
>
>
> I would like to have 'm' and 'f' in the third column, not NA.
>
>
> I see the problem, that it could happen that there is no unique factor
> level in a group ?
> but is there an alternative function who at least tries what I am aiming at?
>
> That is;
>
> "aggregate the data.frame by a list of grouping variables,
> for numeric variables compute the mean,
> for factor variables return the unique factor value"
R is a language, so you just have to do the translation:

mt <- function(x) {
  if(is.numeric(x)) { # if x is numeric
    return(mean(x)) # compute the mean
  } else { # otherwise
    tab <- table(x) # tabulate x
    return(paste(paste(names(tab), # and format it for display
                       tab, sep=": "),
                 collapse=", "))
  }
}

aggregate(data, by=list(data$group), FUN=mt)

Best,
Ista
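
The function above reports a tabulation for non-numeric columns. If the goal is literally "the unique value, when there is one", a variant along the following lines might be closer to the original request. This is only a sketch: `agg_by_class` and the `toy` data are hypothetical names introduced here, and returning NA for groups with mixed values is an assumption about the desired behaviour.

```r
# Hypothetical variant of mt(): mean for numeric columns, the single
# distinct value otherwise, and NA when a group mixes several values
agg_by_class <- function(x) {
  if (is.numeric(x)) {
    return(mean(x))
  }
  vals <- unique(as.character(x))
  if (length(vals) == 1L) vals else NA_character_
}

# Hypothetical toy data: group 2 contains both "f" and "m"
toy <- data.frame(x = c(1, 2, 3, 4),
                  group = c(1, 1, 2, 2),
                  gender = c("m", "m", "f", "m"))

res <- aggregate(toy[c("x", "gender")],
                 by = list(group = toy$group),
                 FUN = agg_by_class)
print(res)
```

Because the function always returns a single value per group, aggregate() can still simplify each column to a plain vector.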

arun

2013-Jan-11 16:09 UTC


Hi,

Maybe I misunderstood your question.
You could do this:
# assumes gender is a factor (the data.frame() default before R 4.0.0)
res <- aggregate(. ~ group, data = data1, mean)
res$gender <- data1$gender[match(res$gender, as.numeric(data1$gender))]
res
#   group         x gender
# 1     1 -1.074343      m
# 2     2  1.750686      f
A.K.
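
The approach above relies on the factor being averaged as integer codes and then mapped back to labels. The sketch below makes that conversion explicit on hypothetical data; `mean_then_relabel`, `toy` and `toy_mixed` are names assumed for illustration. It also shows the limitation: when a group mixes levels, the averaged code is not a whole number, match() finds nothing, and the label becomes NA.

```r
# Hypothetical toy data with gender stored as a factor
toy <- data.frame(x = c(1, 2, 3, 4),
                  group = c(1, 1, 2, 2),
                  gender = factor(c("m", "m", "f", "f")))

# Hypothetical helper: average every column per group with the factor
# replaced by its integer codes, then map the averaged code to a label
mean_then_relabel <- function(df) {
  lev <- levels(df$gender)
  df$gender <- as.integer(df$gender)   # for this toy data: f = 1, m = 2
  res <- aggregate(. ~ group, data = df, FUN = mean)
  # match() only succeeds when the averaged code is a whole level index
  res$gender <- lev[match(res$gender, seq_along(lev))]
  res
}

res <- mean_then_relabel(toy)
print(res)

# Group 2 now mixes "f" and "m", so its averaged code is 1.5
toy_mixed <- transform(toy, gender = factor(c("m", "m", "f", "m")))
res_mixed <- mean_then_relabel(toy_mixed)
print(res_mixed)
```

Using match() rather than indexing the levels directly is deliberate: indexing with 1.5 would silently truncate to 1 and return a wrong label.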



