Hey Folks,
Could somebody help me rewrite the following code?
I am looping through all records across 5 fields to calculate the cumulative
percentage of each record (relative to each individual field).
Is there a way to rewrite it so I don't have to loop through each individual
record?
##### tdat is my data frame
##### j is my field index
##### k is my record index
##### tsum is the sum of all values in field j
##### tmp is a vector containing the values in field j
##### tdat[k,paste("cpct",j,sep="")] creates new fields "cpct1",...,"cpct5"
for(j in 1:5) {
  tsum <- sum(tdat[,j]);
  for(k in 1:nrow(tdat)) {
    td <- tdat[k,j];
    tmp <- tdat[,j];
    ##### sum values <= to current value and divide by the total sum
    tdat[k,paste("cpct",j,sep="")] <- sum(tmp[tmp <= td]) / tsum;
  }
}
Thanks,
TLowe
--
View this message in context:
http://www.nabble.com/Help-rewriting-looping-structure--tf4957267.html#a14196412
Sent from the R help mailing list archive at Nabble.com.
How about this example?
## sample data frame with two columns
df <- data.frame(x = abs(rnorm(20)), y=abs(rnorm(20,2)))
## create new variables in df with an lapply call
df[c("cpctx","cpcty")] <- lapply(df, function(x) cumsum(x)/sum(x))
A possible improvement would be to construct the new column names in
the data frame automatically.
Best,
Erik
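The improvement Erik mentions, building the new column names automatically, might look something like the sketch below. The names `src` and `new_names` and the seed are illustrative assumptions, not part of the original reply; the "cpct" prefix follows the naming in the original post.

```r
## sample data frame with two columns, as in the example above
set.seed(1)  # arbitrary seed, only so the example is reproducible
df <- data.frame(x = abs(rnorm(20)), y = abs(rnorm(20, 2)))

## derive "cpctx", "cpcty" from the existing column names
src <- names(df)
new_names <- paste("cpct", src, sep = "")

## compute the running proportion per column and add all new columns at once
df[new_names] <- lapply(df[src], function(x) cumsum(x) / sum(x))

names(df)
```

Capturing `src` before the assignment keeps the lapply call restricted to the original columns, so the code can be re-run without feeding the new columns back in.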
Is this basically what you want to do? (Please include commented, minimal,
self-contained, reproducible code so we don't have to guess at what you want.)
> x <- data.frame(a=runif(10), b=runif(10))
> # do for one column
> cumsum(x$a)/sum(x$a)
 [1] 0.05892073 0.08129611 0.11067218 0.28640268 0.28969826 0.44477544 0.55195101 0.76500220 0.85234025
[10] 1.00000000
If this is the case, you can extend it.
--
Jim Holtman
Cincinnati, OH
Thank you all. That's exactly what I was looking for.
This will give you the percents in the same order as your original data (as
this is what your original code did):
apply(tdat, 2,
function(x) {
o <- order(x)
oldo <- order(o)
prc <- cumsum(x[o]) / sum(x)
prc[oldo]
})
Jason Law
Statistician
City of Portland, Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203 -5452
jlaw at bes.ci.portland.or.us
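As a rough check, the order-based version can be compared against the original definition (sum of all values <= the current value, divided by the column total). The data below are made up for illustration, and `loop_pct` and `tie_safe_pct` are hypothetical helper names. One caveat worth noting: when a column contains tied values, the original loop gives every tied record the same percentage, while the cumsum-over-order approach does not. The `findInterval` variant is one possible way to keep the loop's behaviour for ties.

```r
set.seed(42)  # arbitrary seed for reproducibility
tdat <- data.frame(matrix(runif(50), ncol = 5))  # hypothetical 10 x 5 data, no ties

## order-based version from the reply above
vec <- apply(tdat, 2, function(x) {
  o <- order(x)
  (cumsum(x[o]) / sum(x))[order(o)]
})

## reference: the definition used by the original loop, one column at a time
loop_pct <- function(x) sapply(x, function(v) sum(x[x <= v]) / sum(x))
ref <- sapply(tdat, loop_pct)

all.equal(unname(vec), unname(ref))

## tie-safe variant: findInterval() returns, for each value, the position of
## the last sorted element <= that value, so tied records share one result
tie_safe_pct <- function(x) {
  s <- sort(x)
  cumsum(s)[findInterval(x, s)] / sum(x)
}

ties <- c(2, 1, 2)
tie_safe_pct(ties)  # same as loop_pct(ties)
```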
Hi TLowe,
I'm not quite sure I understand what you are trying to do. If you
are trying to get the cumulative sum of each column of your data frame,
as a proportion of the column total, you can simply do
rcumsum=function(x){cumsum(x)/sum(x)}
apply(tdat,2,rcumsum)
Yet that is not what your code is doing. With a bit of clarification I
may be able to help you some more.
Julian
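A small made-up vector shows the difference Julian is pointing at. The running total in row order and the quantity computed by the original loop only agree when the column is already sorted in increasing order. The variable names here are illustrative assumptions.

```r
x <- c(3, 1, 2)

## running total in row order, as in rcumsum above
row_pct <- cumsum(x) / sum(x)
row_pct    # 3/6, 4/6, 6/6

## what the original loop computes: total of all values <= the current one
loop_pct <- sapply(x, function(v) sum(x[x <= v]) / sum(x))
loop_pct   # 6/6, 1/6, 3/6

## the two agree once the data are sorted in increasing order
xs <- sort(x)
all.equal(cumsum(xs) / sum(xs),
          sapply(xs, function(v) sum(xs[xs <= v]) / sum(xs)))
```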