thr3ads.net - R help - [R] How to take the average of multiple rows [Jun 2008]

If this information is useful, please help other people find it:
Share via:

2008-Jun-15 18:18 UTC

[R] How to take the average of multiple rows

Dear all,


I have a matrix, called newdata1,
> dim(newdata1)[1] 34176    83

It looks like:

            EntrezID    Name      S1  S2   S3    S4 S5.....
1           4076       CAPRIN1  0.1  0.2   0.3...
2           139170    WDR40B  0.4  0.5   0.6...
3            5505    PPP1R2P1  0.3  0.3   0.7...
4            4076       CAPRIN1  0.7  0.3   0.2...
5           139170    WDR40B  null  0.8   0.4...
6           139170    WDR40B  null  null   0.75...


If there are rows whose "EntrezID" and "Name" are exactly
the same,
I want to take the average for these rows.

There might be some 'null's in the data set. For example, there are
three rows of the same "EntrezID" and "Name" as 139170 and
WDR40B.
For the sample called 'S1', there are three values, 0.4, null, null. For
this
scenario, I want to keep the final value as 0.4 For the sample 'S2', the
values
are 0.5, 0.8, null. For this, I want to ignore 'null', and take the
average
of
0.5 and 0.8, the final value is (0.5+0.8)/2=0.65. For the sample 'S3',
there
is no 'null', so just take the average as (0.6+0.4+0.75)/3=0.5833.

Can you show me how to do this?

I appreciate!

Best,
      Allen

	[[alternative HTML version deleted]]

Marc Schwartz

2008-Jun-15 18:38 UTC

head link

[R] How to take the average of multiple rows

on 06/15/2008 01:18 PM ss wrote:> Dear all,
> 
> 
> I have a matrix, called newdata1,
> 
>> dim(newdata1)
> [1] 34176    83
> 
> It looks like:
> 
>             EntrezID    Name      S1  S2   S3    S4 S5.....
> 1           4076       CAPRIN1  0.1  0.2   0.3...
> 2           139170    WDR40B  0.4  0.5   0.6...
> 3            5505    PPP1R2P1  0.3  0.3   0.7...
> 4            4076       CAPRIN1  0.7  0.3   0.2...
> 5           139170    WDR40B  null  0.8   0.4...
> 6           139170    WDR40B  null  null   0.75...
> 
> 
> If there are rows whose "EntrezID" and "Name" are
exactly the same,
> I want to take the average for these rows.
> 
> There might be some 'null's in the data set. For example, there are
> three rows of the same "EntrezID" and "Name" as 139170
and WDR40B.
> For the sample called 'S1', there are three values, 0.4, null,
null. For
> this
> scenario, I want to keep the final value as 0.4 For the sample
'S2', the
> values
> are 0.5, 0.8, null. For this, I want to ignore 'null', and take the
average
> of
> 0.5 and 0.8, the final value is (0.5+0.8)/2=0.65. For the sample
'S3', there
> is no 'null', so just take the average as (0.6+0.4+0.75)/3=0.5833.
> 
> Can you show me how to do this?
> 
> I appreciate!
If your data file is exactly the way you have it above, you first want 
to convert the 'null' entries to NA so that they are treated as missing 
values by R.

Thus:

DF <- read.table(YourFileName, header = TRUE, na.strings = "null")

 > DF
   EntrezID     Name  S1  S2   S3
1     4076  CAPRIN1 0.1 0.2 0.30
2   139170   WDR40B 0.4 0.5 0.60
3     5505 PPP1R2P1 0.3 0.3 0.70
4     4076  CAPRIN1 0.7 0.3 0.20
5   139170   WDR40B  NA 0.8 0.40
6   139170   WDR40B  NA  NA 0.75


Then use aggregate():

 > aggregate(DF[, -c(1:2)], by = list(DF$EntrezID, DF$Name),
             mean, na.rm = TRUE)
   Group.1  Group.2  S1   S2        S3
1    4076  CAPRIN1 0.4 0.25 0.2500000
2    5505 PPP1R2P1 0.3 0.30 0.7000000
3  139170   WDR40B 0.4 0.65 0.5833333


See ?read.table, ?aggregate and ?mean for more information. Take note of 
the 'na.rm' argument in ?mean.

HTH,

Marc Schwartz

Newbie19_02

2010-Apr-07 13:30 UTC

head link

[R] How to take the average of multiple rows

The aggregate function is great.  THanks for this post.
-- 
View this message in context:
http://n4.nabble.com/How-to-take-the-average-of-multiple-rows-tp848880p1754315.html
Sent from the R help mailing list archive at Nabble.com.

Seemingly Similar Threads

Search for more possibly parallel threads

R help - Jun 2008 - How to take the average of multiple rows

[R] How to take the average of multiple rows

[R] How to take the average of multiple rows

[R] How to take the average of multiple rows

Seemingly Similar Threads