thr3ads.net - R help - [R] function in aggregate applied to specific columns only [Jan 2010]


david hilton shanabrook

2010-Jan-04 03:46 UTC

[R] function in aggregate applied to specific columns only

I want to use aggregate with the mean function on specific columns

gender <- factor(c("m", "m", "f", "f", "m"))
student <- c(0001, 0002, 0003, 0003, 0001)
score <- c(50, 60, 70, 65, 60)
basicSub <- data.frame(student, gender, score)
basicSubMean <- aggregate(basicSub, by=list(basicSub$student), FUN=mean, na.rm=TRUE)

This doesn't work: one cannot take the mean of a factor (gender). Is there
any way of specifying which columns to use for the mean? I want to aggregate by
student, obtaining mean scores, and assume that any other factors, i.e. gender,
are unchanging within a given student.

Thanks

David Winsemius

2010-Jan-04 03:58 UTC


On Jan 3, 2010, at 10:46 PM, david hilton shanabrook wrote:
> I want to use aggregate with the mean function on specific columns
>
> gender <- factor(c("m", "m", "f", "f", "m"))
> student <- c(0001, 0002, 0003, 0003, 0001)
> score <- c(50, 60, 70, 65, 60)
> basicSub <- data.frame(student, gender, score)
> basicSubMean <- aggregate(basicSub, by=list(basicSub$student),
> FUN=mean, na.rm=TRUE)

 > basicSubMean <- aggregate(basicSub$score, by=list(basicSub$student), FUN=mean, na.rm=TRUE)
 > basicSubMean
   Group.1    x
 1       1 55.0
 2       2 60.0
 3       3 67.5

> This doesn't work, one cannot take the mean of a factor (gender).
> Is there any way of specifying which columns to use for the mean?  I
> want to aggregate by student, obtaining mean scores, and assume any
> other factors are unchanging in a specific student, ie. gender.
>
> Thanks

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
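
Editor's note: the call in the reply above returns only the mean score, while the original question also wanted to carry gender along. One possible way to do that, sketched here under the assumption that gender really is constant within each student, is the formula interface of aggregate() with gender added as a second grouping variable. The data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the original post.
basicSub <- data.frame(
  student = c(1, 2, 3, 3, 1),
  gender  = factor(c("m", "m", "f", "f", "m")),
  score   = c(50, 60, 70, 65, 60)
)

# Group by student and gender together; only score is averaged.
# Assumption: gender never changes within a student, so adding it
# as a grouping variable does not split any student's rows.
basicSubMean <- aggregate(score ~ student + gender, data = basicSub, FUN = mean)
basicSubMean
```

Note that the formula method drops rows with missing values before grouping (its default na.action is na.omit), so na.rm=TRUE is not needed in this form.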

Dennis Murphy

2010-Jan-04 03:59 UTC


Hi:

Perhaps the plyr package would be useful. It contains functions colwise(),
numcolwise() and catcolwise() that will perform the same operation on the
stated type of object. In this case, numcolwise() is appropriate:

> str(basicSub)
'data.frame':   5 obs. of  3 variables:
 $ student: num  1 2 3 3 1
 $ gender : Factor w/ 2 levels "f","m": 2 2 1 1 2
 $ score  : num  50 60 70 65 60
> basicSub$student <- factor(basicSub$student)  # convert student to factor
> library(plyr)
# First argument is data frame, the next is the grouping variable, the
# third is the function to apply.
> ddply(basicSub, .(student), numcolwise(mean))
  student score
1       1  55.0
2       2  60.0
3       3  67.5

HTH,
Dennis
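
Editor's note: for readers without plyr, roughly the same "numeric columns only" idea can be sketched in base R by picking the numeric columns first and passing only those to aggregate(). The names is_num and value_cols are made up for this illustration, and this is not how numcolwise() is implemented.

```r
# Rebuild the example data from the original post.
basicSub <- data.frame(
  student = c(1, 2, 3, 3, 1),
  gender  = factor(c("m", "m", "f", "f", "m")),
  score   = c(50, 60, 70, 65, 60)
)

# Flag the numeric columns, then drop the grouping column from that set
# (student is numeric here, but it is the key, not a value to average).
is_num     <- vapply(basicSub, is.numeric, logical(1))
value_cols <- setdiff(names(basicSub)[is_num], "student")

# Average only the selected columns within each student.
res <- aggregate(basicSub[value_cols], by = basicSub["student"],
                 FUN = mean, na.rm = TRUE)
res
```

With more numeric columns in the data frame, each of them would be averaged without having to be named explicitly.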


milton ruser

2010-Jan-04 04:04 UTC


You want this?
> basicSubMean <- aggregate(basicSub[c("score")], by=list(basicSub$student), FUN=mean, na.rm=TRUE)
> basicSubMean
  Group.1 score
1       1  55.0
2       2  60.0
3       3  67.5

bests
milton
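
Editor's note: the grouping column in the result above is labelled Group.1 because the list passed to by has no names. A small variation, sketched here on the same data, names the list element so the output column is called student instead.

```r
# Rebuild the example data from the original post.
basicSub <- data.frame(
  student = c(1, 2, 3, 3, 1),
  gender  = factor(c("m", "m", "f", "f", "m")),
  score   = c(50, 60, 70, 65, 60)
)

# Naming the element of the 'by' list sets the name of the grouping
# column in the result; extra arguments such as na.rm go on to mean().
basicSubMean <- aggregate(basicSub["score"],
                          by = list(student = basicSub$student),
                          FUN = mean, na.rm = TRUE)
names(basicSubMean)
```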


Gabor Grothendieck

2010-Jan-04 04:14 UTC


Here are 6 ways:

1. aggregate
> aggregate(basicSub["score"], basicSub["student"], mean)
  student score
1       1  55.0
2       2  60.0
3       3  67.5

2. tapply
> with(basicSub, tapply(score, student, mean))
   1    2    3
55.0 60.0 67.5

3. summaryBy in doBy package
> library(doBy)
> summaryBy(. ~ student, basicSub)
  student score.mean
1       1       55.0
2       2       60.0
3       3       67.5

4. sqldf in sqldf package.  Uses SQL:
> library(sqldf)
> sqldf("select student, avg(score) from basicSub group by student")
  student avg(score)
1       1       55.0
2       2       60.0
3       3       67.5

5. summary.formula in Hmisc
> library(Hmisc)
> summary(score ~ student, basicSub)
score    N=5

+-------+-+-+-----+
|       | |N|score|
+-------+-+-+-----+
|student|1|2|55.0 |
|       |2|1|60.0 |
|       |3|2|67.5 |
+-------+-+-+-----+
|Overall| |5|61.0 |
+-------+-+-+-----+

6. plyr (see Dennis Murphy's solution in this thread)


