Hello,

I am a newbie to R coming from SAS background. I am trying to program the
following:
I have a monthly data frame with 2 variables:

client   pct_total
A          15%
B          10%
C          10%
D          9%
E          8%
F          6%
G          4%

I need to come up w/ a monthly list of clients that make 50% or just above
it every month so I can pass them to the rest of the program.  In this case
the list would contain the first 4 rows.
top <- client[c(1,4),]
toptot <- sum(top$PCTTOT)
How can I make this automatic?  In SAS I would use macro w/ a do while.
Thanks for your help.

--
View this message in context: http://r.789695.n4.nabble.com/Conditional-operations-in-R-tp4643497.html
Sent from the R help mailing list archive at Nabble.com.
On Tue, Sep 18, 2012 at 3:41 PM, ramoss <ramine.mossadegh at finra.org> wrote:
> [...]
> I need to come up w/ a monthly list of clients that make 50% or just above
> it every month so I can pass them to the rest of the program. In this case
> the list would contain the first 4 rows.
> top <- client[c(1,4),]
> toptot <- sum(top$PCTTOT)
> How can I make this automatic? In SAS I would use macro w/ a do while.

If I understand the algorithm correctly, you take a cumulative sum of the pct_total column and want the index of the first place that reaches or passes 50%. Try

with(DATA, which.max(cumsum(pct_total) >= 0.5))

which is admittedly rather opaque: cumsum() gives the running total, the comparison turns it into a logical vector, and which.max() returns the position of the first TRUE. This assumes pct_total is numeric and stored as a proportion.

Also, in

top <- client[c(1,4),]

that's not rows 1 to 4 but rows 1 and 4. You need seq(1,4) (or 1:4) instead to make c(1,2,3,4).

Cheers,
Michael
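To make the one-liner less opaque, here is a minimal sketch that breaks it into steps on the sample data from the original post. The data frame name `dat` and the intermediate variable names are assumptions for illustration, and the shares are assumed to be already converted to proportions.

```r
# Sample shares from the original post, entered as proportions.
# `dat` and the variable names below are illustrative assumptions.
dat <- data.frame(
  client    = c("A", "B", "C", "D", "E", "F", "G"),
  pct_total = c(0.15, 0.10, 0.10, 0.09, 0.08, 0.06, 0.04)
)

running <- cumsum(dat$pct_total)   # running total of the shares
reached <- running >= 0.5          # TRUE from the row where the total reaches 50%
cutoff  <- which.max(reached)      # position of the first TRUE
top     <- dat[seq_len(cutoff), ]  # rows 1 through cutoff

cutoff
sum(top$pct_total)
```

With these numbers the first four rows add up to 44%, so the cutoff lands on row 5 (52%). One caveat: if the total never reaches the threshold, `reached` is all FALSE and which.max() still returns 1, so a check such as any(reached) is worth adding before relying on the result.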
Hello,
In R you would use vectorized instructions, not a do while loop.
dat <- read.table(text="
client   pct_total
A          15%
B          10%
C          10%
D          9%
E           8%
F          6%
G          4%
", header = TRUE)
# Make it numeric
dat$pct_total <- with(dat, as.numeric(sub("%", "",
pct_total))/100)
str(dat)  # see its STRucture
top <- which(dat$pct_total >= median(dat$pct_total))  # make index vector
sum(dat$pct_total[top])
Hope this helps,
Rui Barradas
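For comparison, the following sketch shows what the median-based index selects next to the cumulative-sum rule suggested elsewhere in this thread. It is only an illustration on the sample data; the names `by_median` and `by_cumsum` are assumptions, and the shares are typed in directly as proportions.

```r
# Sample data from the original post, already numeric.
dat <- data.frame(
  client    = LETTERS[1:7],
  pct_total = c(0.15, 0.10, 0.10, 0.09, 0.08, 0.06, 0.04)
)

# Rule 1: clients whose share is at or above the median share
by_median <- which(dat$pct_total >= median(dat$pct_total))

# Rule 2: leading rows until the running total reaches 50%
by_cumsum <- seq_len(which.max(cumsum(dat$pct_total) >= 0.5))

sum(dat$pct_total[by_median])  # combined share under rule 1
sum(dat$pct_total[by_cumsum])  # combined share under rule 2
```

On this data the median rule picks rows 1 to 4 (44% combined), while the cumulative rule picks rows 1 to 5 (52% combined). The two rules answer different questions, so which one is right depends on what "make 50% or just above" is meant to cover.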
Have you read An Introduction to R? If not, do so before posting further. There are also many "R for SAS users" tutorials on the web, I'm sure. Google or check CRAN.

In particular, you need to understand how indexing works. See ?"[" and ?subset.

You will certainly have to define what you mean by "just over". Once you do so, ?cumsum will do what you want (once you learn about indexing in R).

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
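As a rough illustration of why the definition matters, the sketch below applies two possible readings of "just over" to the sample data, using `[` and subset() together with cumsum(). The data frame `dat` and the result names are assumptions, and the shares are assumed to be numeric proportions.

```r
# Sample data from the original post, as proportions.
dat <- data.frame(
  client    = LETTERS[1:7],
  pct_total = c(0.15, 0.10, 0.10, 0.09, 0.08, 0.06, 0.04)
)
running <- cumsum(dat$pct_total)

# Reading (a): the shortest leading run of rows whose total reaches 50%.
# The row that crosses the threshold is included.
at_least <- dat[seq_len(which.max(running >= 0.5)), ]

# Reading (b): the longest leading run whose total stays at or below 50%.
# A logical index works because the running total never decreases.
at_most <- dat[running <= 0.5, ]

# subset() expresses (b) without repeating the data frame name
at_most2 <- subset(dat, cumsum(pct_total) <= 0.5)
```

On this data reading (a) keeps five rows (52%) and reading (b) keeps four rows (44%), so the choice changes the client list that is passed on.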
Thanks to all who responded, particularly to Michael. Your solution was the easiest to understand & to implement. This worked beautifully:

cmtot <- arrange(cmtot, -PCTTOT)  # sort by descending
top <- with(cmtot, which.max(cumsum(PCTTOT) >= 50))
topcm <- cmtot[seq(1,top),]
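Since the list is needed every month, the same steps could be wrapped in a function and applied per month. The following is only a sketch in base R, using order() in place of arrange(): the helper name top_clients, the `month` column, and the two-month data are made up for illustration, and PCTTOT is assumed to be on a 0-100 scale as in the code above.

```r
# Hypothetical helper: returns the largest clients whose combined share
# first reaches `threshold`. PCTTOT is assumed to be on a 0-100 scale.
top_clients <- function(df, threshold = 50) {
  df <- df[order(-df$PCTTOT), ]              # sort by descending share
  reached <- cumsum(df$PCTTOT) >= threshold  # TRUE once the total is reached
  if (!any(reached)) return(df)              # never reached: keep all rows
  df[seq_len(which.max(reached)), ]
}

# Made-up two-month example; the `month` column is an assumption.
cmtot <- data.frame(
  month  = rep(c("2012-08", "2012-09"), each = 4),
  client = rep(c("A", "B", "C", "D"), times = 2),
  PCTTOT = c(40, 30, 20, 10, 10, 15, 50, 25)
)

# One data frame of top clients per month
topcm <- lapply(split(cmtot, cmtot$month), top_clients)
```

The any(reached) check covers the case where a month's shares never add up to the threshold, where which.max() alone would return 1 and silently keep only the first row.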