thr3ads.net - R help - [R] summing a large, partitioned data frame [Jan 2010]

james.foadi at diamond.ac.uk

2010-Jan-25 16:07 UTC

[R] summing a large, partitioned data frame

Dear R community,
I'm trying to develop a fast way of summing specific rows of a large data
frame.
Here is an example of the kind of data frames I'm dealing with:
> refls
      H K L M/ISYM BATCH          I     SIGI
43247 1 0 5     21    79   61.44117  2.20553
1040  1 0 5    257     6   15.16316  0.54431
2324  1 0 5    257     5   46.76152  1.67858
31515 1 0 5    259    60   57.97305  2.08104
35158 1 0 5    259    61    3.15614  0.11329
51575 1 0 6    259    88  380.04477  8.08878
51846 1 0 6    259    89  624.90802 13.30038
28946 1 1 4      1    42 2517.79492 55.37144
23199 1 1 4      5    31 2525.67407 55.54472
23198 1 1 4     21    39 2519.44653 55.40777
............................................
............................................

I need to add up all I's with same H, K, L and M/ISYM.
The new data frame coming out of this partial summing should look, in this case,
like:

      H K L M/ISYM BATCH          I     SIGI
43247 1 0 5     21    79   61.44117  2.20553
1040  1 0 5    257     6   61.92468  0.54431
31515 1 0 5    259    60   61.12919  2.08104
51575 1 0 6    259    88 1004.95279  8.08878
28946 1 1 4      1    42 2517.79492 55.37144
23199 1 1 4      5    31 2525.67407 55.54472
23198 1 1 4     21    39 2519.44653 55.40777
............................................
............................................


Essentially, I only add those I's with the same H, K, L and M/ISYM, and place
the sum in a single row of the new data frame. In other words, there is first
a partition and then a sum.

I have tried with a for loop, but it really takes too long.

I was wondering whether anyone knows of a better and faster way of doing this
operation.


J



Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE

Email    :  james.foadi at diamond.ac.uk
Alt Email:  j.foadi at imperial.ac.uk

Benilton Carvalho

2010-Jan-25 17:18 UTC

Check aggregate() (the examples are quite helpful).

b
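
A minimal sketch of the aggregate() suggestion, using only the ten sample rows
shown in the original post. Two assumptions are made here: the M/ISYM column is
written as M.ISYM (the syntactic name data.frame() would produce by default),
and BATCH and SIGI are taken from the first row of each group, as in the
layout requested in the post. The helper group_id() is hypothetical and exists
only for this illustration.

```r
# Sample rows copied from the post; M/ISYM is renamed M.ISYM (assumption).
refls <- data.frame(
  H      = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  K      = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  L      = c(5, 5, 5, 5, 5, 6, 6, 4, 4, 4),
  M.ISYM = c(21, 257, 257, 259, 259, 259, 259, 1, 5, 21),
  BATCH  = c(79, 6, 5, 60, 61, 88, 89, 42, 31, 39),
  I      = c(61.44117, 15.16316, 46.76152, 57.97305, 3.15614,
             380.04477, 624.90802, 2517.79492, 2525.67407, 2519.44653),
  SIGI   = c(2.20553, 0.54431, 1.67858, 2.08104, 0.11329,
             8.08878, 13.30038, 55.37144, 55.54472, 55.40777),
  row.names = c("43247", "1040", "2324", "31515", "35158",
                "51575", "51846", "28946", "23199", "23198")
)

keys <- c("H", "K", "L", "M.ISYM")

# Partition by the four key columns and sum I within each group.
sums <- aggregate(refls["I"], by = refls[keys], FUN = sum)

# Keep the first row of every group (its BATCH, SIGI and row name) ...
first <- refls[!duplicated(refls[keys]), ]

# ... and overwrite I with the matching group sum.
# group_id() is a hypothetical helper that builds one string key per row.
group_id <- function(d) do.call(paste, c(unname(as.list(d[keys])), sep = "_"))
first$I <- sums$I[match(group_id(first), group_id(sums))]

print(first)
```

Here sums holds one row per (H, K, L, M.ISYM) combination, while first keeps
the original row order and row names with I replaced by the group total. How
this scales to the full data set has not been measured here.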

