Conditional operations in R

5 messages · ramoss, R. Michael Weylandt, Rui Barradas +1 more

#
Hello,

I am new to R, coming from a SAS background. I am trying to program the
following:
I have a monthly data frame with 2 variables:

client   pct_total
A        15%
B        10%
C        10%
D        9%
E        8%
F        6%
G        4%

I need to come up with a monthly list of the top clients whose combined
pct_total reaches 50% or just above it, so I can pass them to the rest of
the program.  In this case the list would contain the first 4 rows.
top <- client[c(1,4),]
toptot <- sum(top$PCTTOT)
How can I make this automatic?  In SAS I would use a macro with a do-while loop.
Thanks for your help.



--
View this message in context: http://r.789695.n4.nabble.com/Conditional-operations-in-R-tp4643497.html
Sent from the R help mailing list archive at Nabble.com.
#
On Tue, Sep 18, 2012 at 3:41 PM, ramoss <ramine.mossadegh at finra.org> wrote:
If I understand the algorithm correctly, you take a cumulative sum of
the pct_total column and want the index of the first place that passes
50%:

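try

with(DATA, which.max(cumsum(pct_total) >= 0.5))

which is admittedly rather opaque: which.max() on a logical vector returns
the position of the first TRUE. It assumes pct_total is numeric (0.15, not
"15%") and that the rows are already sorted in descending order.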
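
To make the one-liner less opaque, here is a minimal sketch that unpacks it step by step on the sample data from the question. The names `running`, `reached`, and `cutoff` are illustrative only, and the percentages are assumed to be already converted to fractions. On this particular sample the running total is 44% after four rows and 52% after five, so the cutoff lands on row 5.

```r
# Sample data from the question, with pct_total as fractions (an assumption;
# the original column holds strings like "15%").
DATA <- data.frame(
  client    = c("A", "B", "C", "D", "E", "F", "G"),
  pct_total = c(0.15, 0.10, 0.10, 0.09, 0.08, 0.06, 0.04)
)

running <- cumsum(DATA$pct_total)   # running total down the rows
reached <- running >= 0.5           # FALSE until the total reaches 50%
cutoff  <- which.max(reached)       # position of the first TRUE

DATA[seq_len(cutoff), ]             # rows 1 through cutoff
```

One caveat: which.max() returns 1 when no element is TRUE, so it is worth checking any(reached) first if the total might never reach the threshold.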

Also, in: top <- client[c(1,4),]

That selects rows 1 and 4, not rows 1 through 4. Use seq(1,4) (or 1:4)
instead, which gives c(1,2,3,4).
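
A small illustration of the difference, using a made-up data frame named `client` to mirror the question:

```r
# Hypothetical stand-in for the poster's data frame.
client <- data.frame(
  client    = c("A", "B", "C", "D", "E"),
  pct_total = c(15, 10, 10, 9, 8)
)

rows_1_and_4  <- client[c(1, 4), ]    # two rows: the 1st and the 4th
rows_1_to_4   <- client[seq(1, 4), ]  # four rows: 1, 2, 3, 4
rows_1_to_4_b <- client[1:4, ]        # same selection using the colon operator
```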

Cheers,
Michael
#
Hello,

In R you would use vectorized instructions, not a do while loop.


dat <- read.table(text="
client   pct_total
A          15%
B          10%
C          10%
D          9%
E           8%
F          6%
G          4%
", header = TRUE)

# Make it numeric
dat$pct_total <- with(dat, as.numeric(sub("%", "", pct_total))/100)
str(dat)  # see its STRucture

top <- which(dat$pct_total >= median(dat$pct_total))  # make index vector
sum(dat$pct_total[top])
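
As a quick check of what the median-based index selects, the sketch below compares it with a cumulative-sum selection on the same sample. The data frame is rebuilt directly as numbers so the block runs on its own; `by_median` and `by_cumsum` are illustrative names. On this sample the median rule picks the four largest shares, while the cumulative rule keeps adding rows until the running total reaches 50%.

```r
dat <- data.frame(
  client    = c("A", "B", "C", "D", "E", "F", "G"),
  pct_total = c(15, 10, 10, 9, 8, 6, 4) / 100
)

# Median-based selection: rows whose share is at or above the median share.
by_median <- which(dat$pct_total >= median(dat$pct_total))
sum(dat$pct_total[by_median])

# Cumulative selection: leading rows until the running total reaches 50%.
by_cumsum <- seq_len(which.max(cumsum(dat$pct_total) >= 0.5))
sum(dat$pct_total[by_cumsum])
```

The two rules answer different questions, so which one fits depends on how the 50% requirement is defined.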

Hope this helps,

Rui Barradas
On 18-09-2012 15:41, ramoss wrote:
#
Have you read An Introduction to R? If not, do so before posting
further. I'm sure there are also many "R for SAS users" tutorials on the
web; Google or check CRAN. In particular, you need to understand
how indexing works. See ?"[" and ?subset

You will certainly have to define what you mean by "just over". Once
you do so, ?cumsum will do what you want (once you learn about
indexing in R).
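
To illustrate why the definition matters, here is a sketch of two possible readings of the 50% rule on the sample percentages. Both readings are assumptions about what the poster might mean, and the variable names are illustrative.

```r
pct <- c(15, 10, 10, 9, 8, 6, 4)   # whole percentages, sorted descending
running <- cumsum(pct)             # 15 25 35 44 52 58 62

# Reading 1: fewest leading rows whose total is at least 50.
n_at_least <- which.max(running >= 50)

# Reading 2: most leading rows whose total does not exceed 50.
n_at_most <- sum(running <= 50)
```

On this sample the first reading keeps five rows (52%) and the second keeps four (44%), which matches the "first 4 rows" mentioned in the question.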

-- Bert
On Tue, Sep 18, 2012 at 7:41 AM, ramoss <ramine.mossadegh at finra.org> wrote:
#
Thanks to all who responded, particularly to Michael. Your solution was the
easiest to understand & to implement.
This worked beautifully:

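library(plyr)  # arrange() is not base R; plyr is assumed here (dplyr's arrange() also works)
cmtot <- arrange(cmtot, -PCTTOT)  # sort by descending PCTTOT
top <- with(cmtot, which.max(cumsum(PCTTOT) >= 50))  # first row where the running total reaches 50
topcm <- cmtot[seq(1, top), ]  # keep rows 1 through top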
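
Since the original question asked for something that runs every month, the same steps could be wrapped in a function and applied per month. This is only a sketch in base R: the helper `top_clients`, the `month` column, and the two-month sample values are all made up for illustration.

```r
# Hypothetical helper: sorts `df` by `value` (descending) and returns the
# leading rows up to the first one where the running total reaches `threshold`.
top_clients <- function(df, value = "PCTTOT", threshold = 50) {
  df <- df[order(df[[value]], decreasing = TRUE), , drop = FALSE]
  reached <- cumsum(df[[value]]) >= threshold
  if (!any(reached)) return(df)   # threshold never reached: keep everything
  df[seq_len(which.max(reached)), , drop = FALSE]
}

# Made-up two-month example; the `month` column is an assumption.
cmtot <- data.frame(
  month  = rep(c("2012-08", "2012-09"), each = 4),
  client = rep(c("A", "B", "C", "D"), times = 2),
  PCTTOT = c(40, 30, 20, 10, 20, 55, 15, 10)
)

# Apply the helper to each month separately; the result is a named list.
top_by_month <- lapply(split(cmtot, cmtot$month), top_clients)
```

The any() check guards against the case where the total never reaches the threshold, in which which.max() would otherwise return 1.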






