Hello All,

I would like to select records with identical IDs, sum an attribute across them, and return them to the data frame as a single record. Please consider:

Acres <- c(100,101,100,130,156,.5,293,300,.09)
Bldgid <- c(1,2,3,4,5,5,6,7,7)

DF = cbind(Acres,Bldgid)
DF <- as.data.frame(DF)

So that:

   Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.00      5
6   0.50      5
7 293.00      6
8 300.00      7
9   0.09      7

Becomes

   Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.50      5
7 293.00      6
8 300.09      7

dup <- unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild <- DF[DF$Bldgid %in% dup,]
dupbuild..dupareasum <- sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])

This sums the unique IDs of the duplicated records, which is not what I want. Thanks ahead of time.

JR

--
View this message in context: http://www.nabble.com/Summing-identical-IDs-tp26118922p26118922.html
Sent from the R help mailing list archive at Nabble.com.
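As a side note on why the attempt above does not give per-ID totals: duplicated() is FALSE for the first occurrence of each value and TRUE only for later ones. The sketch below rebuilds the same data and inspects that flag; the names dup_flag and second_only are made up for illustration.

```r
Acres  <- c(100, 101, 100, 130, 156, .5, 293, 300, .09)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7)
DF <- data.frame(Acres, Bldgid)

# duplicated() marks only the second and later occurrences of an ID,
# so the first row of each repeated ID is not flagged.
dup_flag <- duplicated(DF$Bldgid)
which(dup_flag)

# Summing Acres over the flagged rows pools the later occurrences of
# all repeated IDs into one number instead of giving one total per ID.
second_only <- sum(DF$Acres[dup_flag])
second_only
```

So a single sum() over a duplicated() mask cannot produce one row per building; some form of grouping by Bldgid is needed.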
One option is the following:

Acres <- c(100,101,100,130,156,.5,293,300,.09)
Bldgid <- c(1,2,3,4,5,5,6,7,7)
DF <- data.frame(Acres, Bldgid)

aggregate(DF, list(Bldgid), sum)

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
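One thing to watch with aggregate(DF, list(Bldgid), sum) is that sum is applied to every column of DF, including Bldgid itself, so the real ID ends up in the Group.1 column while the Bldgid column holds summed IDs. A possible variant, offered only as a sketch, is the formula interface of aggregate(), which sums Acres alone and keeps Bldgid as the grouping column. The name res is just for illustration.

```r
Acres  <- c(100, 101, 100, 130, 156, .5, 293, 300, .09)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7)
DF <- data.frame(Acres, Bldgid)

# Sum Acres within each Bldgid; the result has one row per distinct ID,
# with the grouping column first and the summed column second.
res <- aggregate(Acres ~ Bldgid, data = DF, FUN = sum)
res
```

The result here has seven rows, with 156.5 acres for building 5 and 300.09 for building 7.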
Terrific help, thank you.

dupbuild <- aggregate(DF$Acres, list(Bldgid), sum)

This line worked best.

Now I'm going to challenge everyone (I think?). Consider the following:

Acres <- c(100,101,100,130,156,.5,293,300,.09,100,12.5)
Bldgid <- c(1,2,3,4,5,5,6,7,7,8,8)
Year <- c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
ImpValue <- c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
DF = cbind(Acres,Bldgid,Year,ImpValue)
DF <- as.data.frame(DF)

I would like to do the same, except there are some rules I want to follow. I only want to aggregate the Acres if:

a) The Years are not identical
b) The ImpValues are not identical
c) The Years are identical and the ImpValues are not
d) The ImpValues are identical and the Years are not

But if the Acres and ImpValues are identical, I would still like to add the Acres together and form one case. If the cases are put together, I would also like to add the ImpValues together. So the below

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.00      5 1955     5000
6    0.50      5 1955     1200
7  293.00      6 1999      500
8  300.00      7 1990     1000
9    0.09      7 1991      300
10 100.00      8 2000     1000
11  12.50      8 2000     1000

would become

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.50      5 1955     6200
7  293.00      6 1999      500
8  300.09      7 1990     1300
10 112.50      8 2000     1000

Thanks, I gave it a bunch of shots but nothing worth posting.
Hi JR,

Try also

with(DF, tapply(Acres, Bldgid, sum))

HTH,
Jorge

On Thu, Oct 29, 2009 at 3:03 PM, PDXRugger wrote:
> Hello All,
> I would like to select records with identical IDs then sum an attribute
> then and return them to the data frame as a single record.
> [...]
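Note that tapply() returns a one-dimensional array named by the IDs rather than a data frame. If a data frame with one row per building is wanted, as in the original question, something along these lines may work; the names sums and res are only illustrative.

```r
Acres  <- c(100, 101, 100, 130, 156, .5, 293, 300, .09)
Bldgid <- c(1, 2, 3, 4, 5, 5, 6, 7, 7)
DF <- data.frame(Acres, Bldgid)

# Per-ID totals; the names of the result are the Bldgid values as text
sums <- with(DF, tapply(Acres, Bldgid, sum))

# Rebuild a data frame: IDs come back from the names, totals from the values
res <- data.frame(Bldgid = as.numeric(names(sums)),
                  Acres  = as.vector(sums))
res
```

Converting the names back with as.numeric() assumes the IDs are numeric, as they are in this example.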
On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:

> I would like to do the same, except there are some rules i want to
> follow.
> I only want to aggregate the Acres if :
> a) The Years are not identical
> b) The ImpValues are not identical
> c) The Years are identical and the ImpValue are not
> d) The ImpValues are identical and the Years are not

As I review your Boolean logic, I run into serious problems.

c) and d) cannot be true if a) and b) are true.

So no cases satisfy all 4 specs. In particular, both of the pairs you say you want aggregated (5+6 and 10+11) violate rule a), and the second pair also violates b).

--
David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
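The point about the four rules can be checked directly on the example data. The sketch below is only an illustration: it evaluates each rule for every building that appears more than once, and the column names in checks (same_year, rule_a, and so on) are made up for this purpose.

```r
Acres    <- c(100, 101, 100, 130, 156, .5, 293, 300, .09, 100, 12.5)
Bldgid   <- c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8)
Year     <- c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000)
ImpValue <- c(1000, 1400, 1300, 900, 5000, 1200, 500, 1000, 300, 1000, 1000)
DF <- data.frame(Acres, Bldgid, Year, ImpValue)

# Keep only the rows whose Bldgid occurs more than once
dups <- DF[DF$Bldgid %in% DF$Bldgid[duplicated(DF$Bldgid)], ]

# For each repeated ID: are all Years equal? are all ImpValues equal?
all_same   <- function(x) length(unique(x)) == 1
same_year  <- tapply(dups$Year, dups$Bldgid, all_same)
same_value <- tapply(dups$ImpValue, dups$Bldgid, all_same)

checks <- data.frame(Bldgid     = as.numeric(names(same_year)),
                     same_year  = as.vector(same_year),
                     same_value = as.vector(same_value))

# Rules a) to d) as stated in the quoted message
checks$rule_a <- !checks$same_year
checks$rule_b <- !checks$same_value
checks$rule_c <- checks$same_year & !checks$same_value
checks$rule_d <- checks$same_value & !checks$same_year

# a) and c) contradict each other, as do b) and d), so this is never TRUE
checks$all_four <- with(checks, rule_a & rule_b & rule_c & rule_d)
checks
```

On this data, building 5 (rows 5 and 6) meets rule c) only, building 7 meets a) and b) only, and building 8 (rows 10 and 11) meets none of the four.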
David,

You are correct. I think the first two assumptions can be thrown out and only the latter two (c, d) can be considered. So how would I combine Acres for matching Bldgids based on assumptions c and d?
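The thread as archived here does not include an answer to this last question, so the following is only a sketch of one possible reading: rows sharing a Bldgid are merged when exactly one of Year and ImpValue is identical across them (rule c or rule d), and are left alone otherwise. The helpers all_same and merge_group are hypothetical names, and keeping the first Year of a merged group is an assumption, not something the poster specified.

```r
Acres    <- c(100, 101, 100, 130, 156, .5, 293, 300, .09, 100, 12.5)
Bldgid   <- c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8)
Year     <- c(1946, 1952, 1922, 1910, 1955, 1955, 1999, 1990, 1991, 2000, 2000)
ImpValue <- c(1000, 1400, 1300, 900, 5000, 1200, 500, 1000, 300, 1000, 1000)
DF <- data.frame(Acres, Bldgid, Year, ImpValue)

# TRUE when every element of x has the same value
all_same <- function(x) length(unique(x)) == 1

# Hypothetical helper: collapse the rows of one building only when
# rule c) or rule d) holds, i.e. exactly one of Year / ImpValue is identical.
merge_group <- function(g) {
  if (nrow(g) < 2) return(g)
  if (xor(all_same(g$Year), all_same(g$ImpValue))) {
    data.frame(Acres    = sum(g$Acres),
               Bldgid   = g$Bldgid[1],
               Year     = g$Year[1],        # assumption: keep the first Year
               ImpValue = sum(g$ImpValue))  # ImpValues are added when merging
  } else {
    g  # neither c) nor d) applies: leave the rows as they are
  }
}

# Split by building, apply the rule to each piece, and stack the results
res <- do.call(rbind, lapply(split(DF, DF$Bldgid), merge_group))
rownames(res) <- NULL
res
```

Under this reading only building 5 is merged (156.5 acres, ImpValue 6200), while buildings 7 and 8 keep two rows each. That differs from the expected table posted earlier in the thread, which also merges 7 and 8; getting that table would mean dropping the condition and aggregating on Bldgid alone.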