thr3ads.net - R help - [R] Summing identical IDs [Oct 2009]

If this information is useful, please help other people find it:
Share via:

PDXRugger

2009-Oct-29 19:03 UTC

[R] Summing identical IDs

Hello All, 
   I would like to select records with identical IDs then sum an attribute
then and return them to the data frame as a single record.  Please consider


Acres<-c(100,101,100,130,156,.5,293,300,.09)
Bldgid<-c(1,2,3,4,5,5,6,7,7)

DF=cbind(Acres,Bldgid)
DF<-as.data.frame(DF)

So that:

  Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.00      5
6   0.50      5
7 293.00      6
8 300.00      7
9   0.09      7

Becomes

  Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.50      5
7 293.00      6
8 300.09      7

dup<-unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild<-DF[DF$Bldgid %in% dup,]
dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])

This sums the unique Ids of the duplicated records, not whati want.  Thanks
ahead of time

JR


-- 
View this message in context:
http://www.nabble.com/Summing-identical-IDs-tp26118922p26118922.html
Sent from the R help mailing list archive at Nabble.com.

Dimitris Rizopoulos

2009-Oct-29 19:23 UTC

head link

[R] Summing identical IDs

one option is the following:

Acres <- c(100,101,100,130,156,.5,293,300,.09)
Bldgid <- c(1,2,3,4,5,5,6,7,7)
DF <- data.frame(Acres, Bldgid)

aggregate(DF, list(Bldgid), sum)


I hope it helps.

Best,
Dimitris


PDXRugger wrote:> Hello All, 
>    I would like to select records with identical IDs then sum an attribute
> then and return them to the data frame as a single record.  Please consider
> 
> 
> Acres<-c(100,101,100,130,156,.5,293,300,.09)
> Bldgid<-c(1,2,3,4,5,5,6,7,7)
> 
> DF=cbind(Acres,Bldgid)
> DF<-as.data.frame(DF)
> 
> So that:
> 
>   Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.00      5
> 6   0.50      5
> 7 293.00      6
> 8 300.00      7
> 9   0.09      7
> 
> Becomes
> 
>   Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.50      5
> 7 293.00      6
> 8 300.09      7
> 
> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
> dupbuild<-DF[DF$Bldgid %in% dup,]
> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])
> 
> This sums the unique Ids of the duplicated records, not whati want.  Thanks
> ahead of time
> 
> JR
> 
> 
-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

PDXRugger

2009-Oct-29 21:23 UTC

head link

[R] Summing identical IDs

Terrific help thank you.  
dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
This line worked best.

Now im going to challenge everyone (i think?)

Consider the following:


Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
DF=cbind(Acres,Bldgid,Year,ImpValue)
DF<-as.data.frame(DF)

I would like to do the same, except there are some rules i want to follow. 
I only want to aggregate the Acres if :
a) The Years are not identical
b) The ImpValues are not identical
c) The Years are identical and the ImpValue are not 
d)The ImpValues are identical and the Years are not

but if the Acres and ImpValues are identical i would still like to add the
Acres together and form one case.
If the cases are put together i would also like to add the ImpValues
together.  So the below 

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.00      5 1955     5000
6    0.50      5 1955     1200
7  293.00      6 1999      500
8  300.00      7 1990     1000
9    0.09      7 1991      300
10 100.00      8 2000     1000
11  12.50      8 2000     1000

would become 

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.50     5 1955     6200
7  293.00      6 1999      500
8  300.09     7 1990     1300
10 112.50      8 2000     1000

Thanks, i gave it a bunch of shots but nothing worth posting.





PDXRugger wrote:> 
> Hello All, 
>    I would like to select records with identical IDs then sum an attribute
> then and return them to the data frame as a single record.  Please
> consider
> 
> 
> Acres<-c(100,101,100,130,156,.5,293,300,.09)
> Bldgid<-c(1,2,3,4,5,5,6,7,7)
> 
> DF=cbind(Acres,Bldgid)
> DF<-as.data.frame(DF)
> 
> So that:
> 
>   Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.00      5
> 6   0.50      5
> 7 293.00      6
> 8 300.00      7
> 9   0.09      7
> 
> Becomes
> 
>   Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.50      5
> 7 293.00      6
> 8 300.09      7
> 
> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
> dupbuild<-DF[DF$Bldgid %in% dup,]
> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])
> 
> This sums the unique Ids of the duplicated records, not whati want. 
> Thanks ahead of time
> 
> JR
> 
> 
> 
-- 
View this message in context:
http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html
Sent from the R help mailing list archive at Nabble.com.

Jorge Ivan Velez

2009-Oct-30 02:29 UTC

head link

[R] Summing identical IDs

Hi JR,

Try also

with(DF, tapply(Acres, Bldgid, sum) )

HTH,
Jorge


On Thu, Oct 29, 2009 at 3:03 PM, PDXRugger <> wrote:
>
> Hello All,
>   I would like to select records with identical IDs then sum an attribute
> then and return them to the data frame as a single record.  Please consider
>
>
> Acres<-c(100,101,100,130,156,.5,293,300,.09)
> Bldgid<-c(1,2,3,4,5,5,6,7,7)
>
> DF=cbind(Acres,Bldgid)
> DF<-as.data.frame(DF)
>
> So that:
>
>  Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.00      5
> 6   0.50      5
> 7 293.00      6
> 8 300.00      7
> 9   0.09      7
>
> Becomes
>
>  Acres Bldgid
> 1 100.00      1
> 2 101.00      2
> 3 100.00      3
> 4 130.00      4
> 5 156.50      5
> 7 293.00      6
> 8 300.09      7
>
> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
> dupbuild<-DF[DF$Bldgid %in% dup,]
> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])
>
> This sums the unique Ids of the duplicated records, not whati want.  Thanks
> ahead of time
>
> JR
>
>
> --
> View this message in context:
> http://www.nabble.com/Summing-identical-IDs-tp26118922p26118922.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

David Winsemius

2009-Oct-30 13:43 UTC

head link

[R] Summing identical IDs

On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:
>
> Terrific help thank you.
> dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
> This line worked best.
>
> Now im going to challenge everyone (i think?)
>
> Consider the following:
>
>
> Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
> Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
> Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
> ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
> DF=cbind(Acres,Bldgid,Year,ImpValue)
> DF<-as.data.frame(DF)
>
> I would like to do the same, except there are some rules i want to  
> follow.
> I only want to aggregate the Acres if :
> a) The Years are not identical
> b) The ImpValues are not identical
> c) The Years are identical and the ImpValue are not
> d)The ImpValues are identical and the Years are not
As I review your Boolean logic, I run into serious problems.

c) and d) cannot be true if a and b) are true.

So no cases satisfy all 4 specs. In particular both of the pairs you  
say you want aggregated (5+6) and 10+11) violate rule a) and the  
second pair also violates b).

-- 
David>
> but if the Acres and ImpValues are identical i would still like to  
> add the
> Acres together and form one case.
> If the cases are put together i would also like to add the ImpValues
> together.  So the below
>
>    Acres Bldgid Year ImpValue
> 1  100.00      1 1946     1000
> 2  101.00      2 1952     1400
> 3  100.00      3 1922     1300
> 4  130.00      4 1910      900
> 5  156.00      5 1955     5000
> 6    0.50      5 1955     1200
> 7  293.00      6 1999      500
> 8  300.00      7 1990     1000
> 9    0.09      7 1991      300
> 10 100.00      8 2000     1000
> 11  12.50      8 2000     1000
>
> would become
>
>    Acres Bldgid Year ImpValue
> 1  100.00      1 1946     1000
> 2  101.00      2 1952     1400
> 3  100.00      3 1922     1300
> 4  130.00      4 1910      900
> 5  156.50     5 1955     6200
> 7  293.00      6 1999      500
> 8  300.09     7 1990     1300
> 10 112.50      8 2000     1000
>
> Thanks, i gave it a bunch of shots but nothing worth posting.
>
>
>
>
>
> PDXRugger wrote:
>>
>> Hello All,
>>   I would like to select records with identical IDs then sum an  
>> attribute
>> then and return them to the data frame as a single record.  Please
>> consider
>>
>>
>> Acres<-c(100,101,100,130,156,.5,293,300,.09)
>> Bldgid<-c(1,2,3,4,5,5,6,7,7)
>>
>> DF=cbind(Acres,Bldgid)
>> DF<-as.data.frame(DF)
>>
>> So that:
>>
>>  Acres Bldgid
>> 1 100.00      1
>> 2 101.00      2
>> 3 100.00      3
>> 4 130.00      4
>> 5 156.00      5
>> 6   0.50      5
>> 7 293.00      6
>> 8 300.00      7
>> 9   0.09      7
>>
>> Becomes
>>
>>  Acres Bldgid
>> 1 100.00      1
>> 2 101.00      2
>> 3 100.00      3
>> 4 130.00      4
>> 5 156.50      5
>> 7 293.00      6
>> 8 300.09      7
>>
>> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
>> dupbuild<-DF[DF$Bldgid %in% dup,]
>> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild 
>> $Bldgid)])
>>
>> This sums the unique Ids of the duplicated records, not whati want.
>> Thanks ahead of time
>>
>> JR
>>
>>
>>
>
> -- 
> View this message in context:
http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

PDXRugger

2009-Oct-30 19:29 UTC

head link

[R] Summing identical IDs

David,
  You are correct.  I think the frist two assumptions can be thrown out and
only the latter two (c,d) can be considered.  So how would i combine Acres
for matching Bldgids based on assumptions c,d?



David Winsemius wrote:> 
> 
> On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:
> 
>>
>> Terrific help thank you.
>> dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
>> This line worked best.
>>
>> Now im going to challenge everyone (i think?)
>>
>> Consider the following:
>>
>>
>> Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
>> Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
>> Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
>> ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
>> DF=cbind(Acres,Bldgid,Year,ImpValue)
>> DF<-as.data.frame(DF)
>>
>> I would like to do the same, except there are some rules i want to  
>> follow.
>> I only want to aggregate the Acres if :
>> a) The Years are not identical
>> b) The ImpValues are not identical
>> c) The Years are identical and the ImpValue are not
>> d)The ImpValues are identical and the Years are not
> 
> As I review your Boolean logic, I run into serious problems.
> 
> c) and d) cannot be true if a and b) are true.
> 
> So no cases satisfy all 4 specs. In particular both of the pairs you  
> say you want aggregated (5+6) and 10+11) violate rule a) and the  
> second pair also violates b).
> 
> -- 
> David
>>
>> but if the Acres and ImpValues are identical i would still like to  
>> add the
>> Acres together and form one case.
>> If the cases are put together i would also like to add the ImpValues
>> together.  So the below
>>
>>    Acres Bldgid Year ImpValue
>> 1  100.00      1 1946     1000
>> 2  101.00      2 1952     1400
>> 3  100.00      3 1922     1300
>> 4  130.00      4 1910      900
>> 5  156.00      5 1955     5000
>> 6    0.50      5 1955     1200
>> 7  293.00      6 1999      500
>> 8  300.00      7 1990     1000
>> 9    0.09      7 1991      300
>> 10 100.00      8 2000     1000
>> 11  12.50      8 2000     1000
>>
>> would become
>>
>>    Acres Bldgid Year ImpValue
>> 1  100.00      1 1946     1000
>> 2  101.00      2 1952     1400
>> 3  100.00      3 1922     1300
>> 4  130.00      4 1910      900
>> 5  156.50     5 1955     6200
>> 7  293.00      6 1999      500
>> 8  300.09     7 1990     1300
>> 10 112.50      8 2000     1000
>>
>> Thanks, i gave it a bunch of shots but nothing worth posting.
>>
>>
>>
>>
>>
>> PDXRugger wrote:
>>>
>>> Hello All,
>>>   I would like to select records with identical IDs then sum an  
>>> attribute
>>> then and return them to the data frame as a single record.  Please
>>> consider
>>>
>>>
>>> Acres<-c(100,101,100,130,156,.5,293,300,.09)
>>> Bldgid<-c(1,2,3,4,5,5,6,7,7)
>>>
>>> DF=cbind(Acres,Bldgid)
>>> DF<-as.data.frame(DF)
>>>
>>> So that:
>>>
>>>  Acres Bldgid
>>> 1 100.00      1
>>> 2 101.00      2
>>> 3 100.00      3
>>> 4 130.00      4
>>> 5 156.00      5
>>> 6   0.50      5
>>> 7 293.00      6
>>> 8 300.00      7
>>> 9   0.09      7
>>>
>>> Becomes
>>>
>>>  Acres Bldgid
>>> 1 100.00      1
>>> 2 101.00      2
>>> 3 100.00      3
>>> 4 130.00      4
>>> 5 156.50      5
>>> 7 293.00      6
>>> 8 300.09      7
>>>
>>> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
>>> dupbuild<-DF[DF$Bldgid %in% dup,]
>>> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild 
>>> $Bldgid)])
>>>
>>> This sums the unique Ids of the duplicated records, not whati want.
>>> Thanks ahead of time
>>>
>>> JR
>>>
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
-- 
View this message in context:
http://old.nabble.com/Summing-identical-IDs-tp26118922p26135732.html
Sent from the R help mailing list archive at Nabble.com.

Reasonably Related Threads

Search for more apparently analagous threads

R help - Oct 2009 - Summing identical IDs

[R] Summing identical IDs

[R] Summing identical IDs

[R] Summing identical IDs

[R] Summing identical IDs

[R] Summing identical IDs

[R] Summing identical IDs

Reasonably Related Threads