Hey Folks,

Could somebody help me rewrite the following code?

I am looping through all records across 5 fields to calculate the cumulative percentage of each record (relative to each individual field).

Is there a way to rewrite it so I don't have to loop through each individual record?

##### tdat is my data frame
##### j is my field index
##### k is my record index
##### tsum is the sum of all values in field j
##### tmp is a vector containing the values in field j
##### tdat[k,paste("cpct",j,sep="")] creates new fields "cpct1",...,"cpct5"

for(j in 1:5) {
  tsum <- sum(tdat[,j]);
  for(k in 1:nrow(tdat)) {
    td <- tdat[k,j];
    tmp <- tdat[,j];
    ##### sum values <= the current value and divide by the total sum
    tdat[k,paste("cpct",j,sep="")] <- sum(tmp[tmp <= td]) / tsum;
  }
}

Thanks,
TLowe

--
View this message in context: http://www.nabble.com/Help-rewriting-looping-structure--tf4957267.html#a14196412
Sent from the R help mailing list archive at Nabble.com.
How about this example?

## sample data frame with two columns
df <- data.frame(x = abs(rnorm(20)), y = abs(rnorm(20, 2)))

## create new variables in df with an lapply call
df[c("cpctx", "cpcty")] <- lapply(df, function(x) cumsum(x) / sum(x))

A possible improvement would be to construct the new column names in the data frame automatically.

Best,
Erik
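The improvement Erik mentions could be sketched as follows. This is only an illustration: the sample data, the seed, and the helper name new_names are assumptions, and the "cpct" prefix is borrowed from the original question.

```r
# Hypothetical sample data, as in the example above (values are illustrative)
set.seed(1)
df <- data.frame(x = abs(rnorm(20)), y = abs(rnorm(20, 2)))

# Build the new column names from the existing ones: "cpctx", "cpcty"
new_names <- paste0("cpct", names(df))

# Running share of each column total, taken in row order
df[new_names] <- lapply(df, function(col) cumsum(col) / sum(col))

head(df)
```

Because the names are derived from names(df), the same two lines work unchanged for a data frame with five (or any number of) numeric columns.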
Is this basically what you want to do? (Please include commented, minimal, self-contained, reproducible code so we don't have to guess at what you want.)

> x <- data.frame(a=runif(10), b=runif(10))
> # do for one column
> cumsum(x$a)/sum(x$a)
 [1] 0.05892073 0.08129611 0.11067218 0.28640268 0.28969826 0.44477544 0.55195101 0.76500220 0.85234025
[10] 1.00000000

If this is the case, you can extend it.

On Dec 6, 2007 9:02 AM, TLowe <rcl7820 at warnell.uga.edu> wrote:
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Thank you all. That's exactly what I was looking for.
This will give you the percents in the same order as your original data (as this is what your original code did):

apply(tdat, 2, function(x) {
  o <- order(x)
  oldo <- order(o)
  prc <- cumsum(x[o]) / sum(x)
  prc[oldo]
})

Jason Law
Statistician
City of Portland, Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
jlaw at bes.ci.portland.or.us

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of TLowe
Sent: Thursday, December 06, 2007 10:28 AM
To: r-help at r-project.org
Subject: Re: [R] Help rewriting looping structure?

Thank you all. That's exactly what I was looking for.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
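As a rough check of the order-based reply above, the sketch below compares it with the per-record calculation from the question on a tiny made-up data frame. The data and the function names by_value, by_order and by_order_ties are assumptions for illustration; the tie-aware variant in particular is not from the thread.

```r
# Hypothetical stand-in for tdat; values are illustrative only
tdat <- data.frame(a = c(3, 1, 2), b = c(10, 40, 50))

# Per-record calculation from the question, written with sapply
by_value <- function(x) sapply(x, function(v) sum(x[x <= v]) / sum(x))

# Order-based calculation from the reply above
by_order <- function(x) {
  o <- order(x)
  (cumsum(x[o]) / sum(x))[order(o)]
}

# Tie-aware variant (assumption, not from the thread): every member of a
# group of equal values gets the cumulative sum at the end of that group
by_order_ties <- function(x) {
  o <- order(x)
  cs <- ave(cumsum(x[o]), x[o], FUN = function(z) z[length(z)])
  (cs / sum(x))[order(o)]
}

loop_result  <- sapply(tdat, by_value)
order_result <- apply(tdat, 2, by_order)
all.equal(loop_result, order_result, check.attributes = FALSE)  # TRUE

tied <- c(2, 2, 1)
by_value(tied)       # 1.0 1.0 0.2
by_order(tied)       # 0.6 1.0 0.2
by_order_ties(tied)  # 1.0 1.0 0.2
```

Without ties the two approaches agree. With tied values the question's `<=` comparison gives every tied record the same percentage, while the plain order-based version gives them different ones, so which variant is appropriate depends on how ties should be treated.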
Hi TLowe,

I'm not quite sure I understand what you are trying to do. If you are trying to get the cumulative sum of your data frame along each column, as a fraction of the column total, you can simply do

rcumsum <- function(x) { cumsum(x) / sum(x) }
apply(tdat, 2, rcumsum)

Yet that is not what your code is doing: for each record it sums the values less than or equal to the current value, not the values in the rows above it. With a bit of clarification I may be able to help you some more.

Julian
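To make the difference Julian points out concrete, here is a small sketch on a made-up, deliberately unsorted vector. The vector and the variable names are assumptions for illustration only.

```r
x <- c(3, 1, 2)  # hypothetical, deliberately unsorted

# Cumulative share in row order (what cumsum(x) / sum(x) computes)
in_row_order <- cumsum(x) / sum(x)

# Share of the total held by values <= the current value (the question's loop)
by_value <- sapply(x, function(v) sum(x[x <= v]) / sum(x))

in_row_order  # 0.5000000 0.6666667 1.0000000
by_value      # 1.0000000 0.1666667 0.5000000

# Once the data are sorted ascending (and have no ties), the two agree
s <- sort(x)
all.equal(cumsum(s) / sum(s),
          sapply(s, function(v) sum(s[s <= v]) / sum(s)))  # TRUE
```

So the plain cumsum answers match the loop only when each column is already sorted in increasing order.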