After I sent this, a colleague referred me to the GPT-4 interface on Bing. I entered the exact email query below and it provided the following solution, which worked for the toy example and was successfully adapted to my application:

# Define the cutoffs
cutoffs <- seq(0, 0.15, by = 0.01)

# Create a new column with cutoffs
dummydata$Cutoff <- cut(dummydata$Pct, breaks = cutoffs, labels = cutoffs[-1])

# Aggregate Totpop by Cutoff
result <- aggregate(Totpop ~ Cutoff, data = dummydata, sum)

# Rename the columns
names(result) <- c("Cutoff", "Pop")

# Print the result
result

So thanks to all for considering this query. We're in a brave new world of AI-generated coding.

Message: 3
Date: Fri, 13 Oct 2023 20:13:56 +0000
From: "Jason Stout, M.D." <jason.stout at duke.edu>
To: "r-help at r-project.org" <r-help at r-project.org>
Subject: [R] Create new data frame with conditional sums
Message-ID: <BN8PR05MB60663B9D27B40E8C240A387981D2A at BN8PR05MB6066.namprd05.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"

This seems like it should be simple but I can't get it to work properly. I'm starting with a data frame like this:

Tract   Pct  Totpop
    1  0.05    4000
    2  0.03    3500
    3  0.01    4500
    4  0.12    4100
    5  0.21    3900
    6  0.04    4250
    7  0.07    5100
    8  0.09    4700
    9  0.06    4950
   10  0.03    4800

And I want to end up with a data frame with two columns, a "Cutoff" column that is a simple sequence of equally spaced cutoffs (let's say in this case from 0-0.15 by 0.01) and a "Pop" column which equals the sum of "Totpop" in the prior data frame in which "Pct" is greater than or equal to "cutoff." So in this toy example, this is what I want for a result:

   Cutoff   Pop
1    0.00 43800
2    0.01 43800
3    0.02 39300
4    0.03 39300
5    0.04 31000
6    0.05 26750
7    0.06 22750
8    0.07 17800
9    0.08 12700
10   0.09 12700
11   0.10  8000
12   0.11  8000
13   0.12  8000
14   0.13  3900
15   0.14  3900
16   0.15  3900

I can do this with a for loop but it seems there should be an easier, vectorized way that would be more efficient.

Here is a reproducible example:

dummydata <- data.frame(Tract = seq(1, 10, by = 1),
                        Pct = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
                        Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800))

dfrm <- data.frame(matrix(ncol = 2, nrow = 0, dimnames = list(NULL, c("Cutoff", "Pop"))))

for (i in seq(0, 0.15, by = 0.01)) {
  temp <- sum(dummydata[dummydata$Pct >= i, "Totpop"])
  dfrm[nrow(dfrm) + 1, ] <- c(i, temp)
}

Jason Stout, MD, MHS
Box 102359-DUMC
Durham, NC 27710
FAX 919-681-7494

[[alternative HTML version deleted]]
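
For the loop-free version the post asks for, one possible sketch follows. It is not taken from the thread. It applies the "Pct greater than or equal to cutoff" test once per cutoff with vapply(). The names pct100 and cut100 are illustrative, and comparing on a rounded scale of hundredths is an assumption made here so that the result does not depend on floating-point differences between Pct and the values produced by seq().

```r
# Toy data from the original post
dummydata <- data.frame(
  Tract  = 1:10,
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)

cutoffs <- seq(0, 0.15, by = 0.01)

# Assumption: Pct and the cutoffs are both meant to be exact hundredths,
# so compare them as whole numbers rather than as doubles.
pct100 <- round(dummydata$Pct * 100)
cut100 <- round(cutoffs * 100)

# For each cutoff, sum Totpop over the rows with Pct >= cutoff
pop <- vapply(
  cut100,
  function(k) sum(dummydata$Totpop[pct100 >= k]),
  numeric(1)
)

result <- data.frame(Cutoff = cutoffs, Pop = pop)
result
```

This should give one row per cutoff, including cutoffs that no tract falls on exactly. By contrast, cut() assigns each row to a single interval, so aggregating over its output yields per-interval totals; it may be worth checking any adapted solution against the expected table above.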