
Inefficiency of SAS Programming

22 messages · Wensui Liu, Peter Dalgaard, JRG +8 more

#
If anyone wants to see a prime example of how inefficient it is to 
program in SAS, take a look at the SAS programs provided by the US 
Agency for Healthcare Research and Quality for risk adjusting and 
reporting for hospital outcomes at 
http://www.qualityindicators.ahrq.gov/software.htm .  The PSSASP3.SAS 
program is a prime example.  Look at how you do a vector product in the 
SAS macro language to evaluate predictions from a logistic regression 
model.  I estimate that using R would easily cut the programming time of 
this set of programs by a factor of 4.

Frank
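
For readers without the AHRQ programs to hand, the vector product mentioned above can be sketched in R roughly as follows. This is only a sketch: the coefficient values and covariate names (beta, age, female) are made up for illustration and are not taken from PSSASP3.SAS.

```r
# Hypothetical coefficients from a fitted logistic model (illustrative only)
beta <- c(intercept = -2.0, age = 0.03, female = -0.4)

# Hypothetical design matrix: one row per patient, first column is the intercept
X <- cbind(intercept = 1, age = c(50, 70), female = c(1, 0))

lp <- drop(X %*% beta)  # linear predictor: one vector product per row
p  <- plogis(lp)        # inverse logit gives the predicted probabilities
```

The whole prediction step is the single matrix product X %*% beta followed by plogis().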
#
Frank,
I couldn't locate the program you mentioned. Do you mind being more
specific? Could you please point me to the file? I am just curious.
Thanks.

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr
<f.harrell at vanderbilt.edu> wrote:

  
    
#
2009/2/26 Frank E Harrell Jr <f.harrell at vanderbilt.edu>:
Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:

    IF &N. =  1 THEN SUB_N = 1;
    IF &N. =  3 THEN SUB_N = 2;
    IF &N. =  4 THEN SUB_N = 3;
    IF &N. =  6 THEN SUB_N = 4;
    IF &N. =  7 THEN SUB_N = 5;
    IF &N. =  8 THEN SUB_N = 6;
    IF &N. =  9 THEN SUB_N = 7;
    IF &N. = 10 THEN SUB_N = 8;
    IF &N. = 11 THEN SUB_N = 9;
    IF &N. = 12 THEN SUB_N = 10;
    IF &N. = 13 THEN SUB_N = 11;
    IF &N. = 14 THEN SUB_N = 12;
    IF &N. = 15 THEN SUB_N = 13;
    IF &N. = 17 THEN SUB_N = 14;
    IF &N. = 18 THEN SUB_N = 15;
    IF &N. = 19 THEN SUB_N = 16;

Of course it's possible to write code like that in any language, it
just looks worse when it's in ALL CAPS and written in a style that
looks like the 1980s and onward never happened. The question is
whether it's possible to write this better in SAS. Most of us on this
list could write it in R in a better way.

 Barry
#
Barry Rowlingson wrote:
Presumably, something like

      IF &N. =  1 THEN SUB_N = 1;
      ELSE IF &N. < 5 THEN SUB_N = &N.-1;
      ELSE IF &N. < 16 THEN SUB_N = &N.-2;
      ELSE SUB_N = &N.-3;

would work, provided that 2, 5, 16 are impossible values. Problem is 
that it actually makes the code harder to grasp, so experienced SAS 
programmers go for the dumb but readable code like the above.

In R, the cleanest I can think of is

subn <- match(n, setdiff(1:19, c(2,5,16)))

or maybe just

subn <- match(n, c(1, 3:4, 6:15, 17:19))

although

subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))

might be what is really wanted
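
As a quick check that the match() version reproduces the IF chain quoted earlier, here is a small sketch. The helper name sub_n and the test values are illustrative assumptions, not part of the AHRQ code.

```r
# Valid codes are 1..19 with 2, 5 and 16 absent, as in the quoted IF chain
valid <- setdiff(1:19, c(2, 5, 16))

# Hypothetical helper: position of n among the valid codes, NA if n is not valid
sub_n <- function(n) match(n, valid)

sub_n(c(1, 3, 4, 6, 15, 17, 19))  # positions 1, 2, 3, 4, 13, 14, 16
sub_n(2)                          # NA, because 2 is not a valid code
```

Unlike the arithmetic ELSE IF version, an impossible value gives NA rather than a silently wrong index.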
JRG
#
On 26 Feb 2009 at 23:47, Barry Rowlingson wrote:

            
Oh, it's definitely possible to write better SAS code than that.  This should do the trick:

   Sub_n = input(scan("1 . 2 3 . 4 5 6 7  8  9 10 11 12 13  . 14 15 16", &N, " "), 2.);

among various other ways.

But it remains true that certain operations in SAS will be quite inefficient.

---JRG
John R. Gleason
Associate Professor

Syracuse University
430 Huntington Hall                      Voice:   315-443-3107
Syracuse, NY 13244-2340  USA             FAX:     315-443-4085

PGP public key at keyservers
#
Thanks for pointing me to the SAS code, Dr Harrell.
After reading the code, I have to say that the inefficiency is not
related to the SAS language itself but to the SAS programmer. An experienced
SAS programmer won't use so much hard-coding, which is very ad hoc and
difficult to maintain.
I agree with you that the SAS code does far too much work to
evaluate predictions. Such a complex data step can actually be replaced
by simpler IML code.

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr
<f.harrell at vanderbilt.edu> wrote:

  
    
#
Frank,

I can't see the code you mention - Web marshall at work - but I don't think
you should be too quick to run down SAS - it's a powerful and flexible
language but unfortunately very expensive.

Your example mentions doing a vector product in the macro language - this
only suggests to me that the people writing the code need a crash course
in SAS/IML (the matrix language). SAS is designed to work on records and so
is inappropriate for matrices - macros are only an efficient code-copying
device. Doing matrix computations in this way is pretty mad: the code would
be next to impossible, never mind the memory problems.
SAS recognise that, but a lot of SAS users remain unfamiliar with IML.

In IML by contrast there are inner, cross and outer products and a raft of
other useful methods for matrix work that R users would be familiar with.
OLS, for example, takes two lines:

b = solve(X`*X, X`*y) ;
rss = sqrt(ssq(y - X*b)) ;   /* square root of the residual sum of squares */
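
For comparison, the same normal-equations calculation can be sketched in R. The data are simulated and the names n, X, y are assumptions made purely for illustration.

```r
set.seed(1)
n <- 100
X <- cbind(1, rnorm(n), rnorm(n))        # simulated design matrix with intercept
y <- drop(X %*% c(1, 2, -1)) + rnorm(n)  # simulated response

b   <- solve(crossprod(X), crossprod(X, y))  # solves (X'X) b = X'y
rss <- sum((y - X %*% b)^2)                  # residual sum of squares
```

In practice lm() or qr.solve() would be preferred numerically. The explicit form is shown only because it mirrors the IML lines.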

And to give you a flavour of IML's capabilities I implemented a SAS version
of the MARS program in it about 6 or 7 years ago.
BTW SPSS also has a matrix language.

Gerard



                                                                           
On 26/02/2009 22:57, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote to R list <r-help at stat.math.ethz.ch> (sent by r-help-bounces at r-project.org; subject: [R] Inefficiency of SAS Programming):
                                                                           
                                                                           
                                                                           
                                                                           




--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



**********************************************************************************
The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.  It is the policy of the Department of Justice, Equality and Law Reform and the Agencies and Offices using its IT services to disallow the sending of offensive material.
Should you consider that the material contained in this message is offensive you should contact the sender immediately and also mailminder[at]justice.ie.

***********************************************************************************
#
2009/2/27 Peter Dalgaard <p.dalgaard at biostat.ku.dk>:
I'm not sure which is easier to grasp. When I first saw the original
version I thought it was an odd way of doing "SUB_N = &N.". Only then
did I have a closer look and spot the missing 2, 5, and 16. A comment
would have been very enlightening. But there was nothing relevant.
I think the important thing with any programming is to make sure what
you want is expressed in words somewhere. If not in the code, then in
the comments. And operations like this should be abstracted into
functions.

  All the examples of SAS code I've seen seem to fall into the old
practices of writing great long 'scripts', with minimal code-reuse and
encapsulation of useful functionality. If these SAS scripts are then
given to new SAS programmers then the chances are they will follow
these bad practices. Show them well-written R code (or C, or Python)
and maybe they can implement those good practices into their SAS work.
Assuming SAS can do that. I'm not sure.


Barry
#
Wensui Liu wrote:
Agreed that the SAS code could have been much better.  I programmed in 
SAS for 23 years and would have done it much differently.  But you will 
find that the most elegant SAS program re-write will still be a far cry 
from the elegance of R.

Frank

  
    
#
Ajay ohri wrote:
A system that requires Excel for its success is not a complete system.
Really?  Try this in SAS:

library(Design)
f <- lrm(death ~ rcs(age,5)*sex)
anova(f)     # get test of nonlinearity of interactions among other things
nomogram(f)  # depict model graphically

The restricted cubic spline in age, i.e., assuming the age relationship 
is smooth but not much else, is very easy to code in R.  There are many 
other automatic transformations available.  The lack of generality of 
the SAS language makes many SAS users assume linearity far more often 
than R users do.

Also note that PROC LOGISTIC, without invocation of a special option, 
would make the user believe that older subjects have lower chances of 
dying, as SAS by default takes the event being predicted to be death=0.

Frank
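
To see why the choice of event level matters, here is a small simulated sketch in R. The data and the names age, death, fit1 and fit0 are made up for illustration. Modelling death=0 instead of death=1 flips the sign of every coefficient.

```r
set.seed(2)
age   <- runif(500, 40, 90)
death <- rbinom(500, 1, plogis(-6 + 0.08 * age))  # simulated: risk rises with age

fit1 <- glm(death ~ age, family = binomial)         # models P(death = 1)
fit0 <- glm(I(1 - death) ~ age, family = binomial)  # models P(death = 0)

coef(fit1)  # age coefficient for the event death = 1
coef(fit0)  # same magnitudes with opposite signs
```

A reader who does not notice which level is being modelled would read the second fit as saying that risk falls with age.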

  
    
#
Gerard M. Keogh wrote:
But try this:

PROC IML;
... some custom user code ...
... loop over j=1 to 10 ...
...   PROC GENMOD, output results back to IML
...

IML is only a partial solution since it is not integrated with the PROC 
step.

Frank
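
By contrast, in R the custom code and the model fit live in the same language. A minimal sketch, assuming simulated Poisson data and a bootstrap resample as the "custom user code" (both are illustrative choices, not anything from the thread):

```r
set.seed(3)
d   <- data.frame(x = rnorm(200))
d$y <- rpois(200, exp(0.5 + 0.3 * d$x))  # simulated count outcome

# Loop over j = 1 to 10: custom step, then a GLM fit, results kept as R objects
fits <- lapply(1:10, function(j) {
  idx <- sample(nrow(d), replace = TRUE)  # custom step: bootstrap resample
  coef(glm(y ~ x, family = poisson, data = d[idx, ]))
})

boot <- do.call(rbind, fits)  # 10 x 2 matrix of coefficients
colMeans(boot)                # summarise across the loop
```

The fitted coefficients come straight back as ordinary vectors, so no output data set has to be read back in.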
#
Ajay ohri wrote:
Ajay,

This will generate major confusion among users of all types and be hard 
to maintain.  A better approach is to get Bob Muenchen's excellent book 
and keep it nearby.

Frank

  
    
#
Yes Frank, I accept your point, but nevertheless IML is the proper place for
matrix work in SAS - mixing macro-level logic and computation is another
question - R is certainly more seamless in this respect.

Gerard


                                                                           
On 27/02/2009 13:55, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote to "Gerard M. Keogh" <GMKeogh at justice.ie> (cc: R list <r-help at stat.math.ethz.ch>, r-help-bounces at r-project.org; subject: Re: [R] Inefficiency of SAS Programming):
#
on 02/27/2009 07:57 AM Frank E Harrell Jr wrote:
I wholeheartedly agree with Frank here. It may be one thing to have a
"translation" process in place based upon some form of logical mapping
between the two languages (as Bob's book provides). But it is another thing
entirely to actually start writing functions that provide wrappers
modeled on SAS-based PROCs.

If you do this, then you only serve to obfuscate the fundamental
philosophical and functional differences between the two languages and
doom a new useR to missing all of R's benefits. They will continue to
try to figure out how to use R based upon their "SAS intuition" rather
than developing a new set of coding and even statistical paradigms.

Having been through the SAS to S/R transition myself, having used SAS
for much of the 90's and now having used R for over 7 years, I can speak
from personal experience and state that the only way to achieve the
requisite proficiency with R is immersion therapy.

Regards,

Marc Schwartz
#
I had enrolled in a statistics course this semester, but after the
first class, I dropped it because it uses SAS. This thread makes me
quite glad.

Tom!

On Fri, Feb 27, 2009 at 8:48 AM, Frank E Harrell Jr
<f.harrell at vanderbilt.edu> wrote:
#
Ajay ohri wrote:
This is futile and will make it more difficult for other R users to help 
you in the future.  As Marc said this is really a bad idea and will 
backfire.

Frank

  
    
#
But SAS/IML is not part of base SAS, it costs extra, so there is a good chance that a user that has SAS will not be able to run code that uses SAS/IML.

I have known of SAS programmers who know IML well that still write matrix/vector tools using macros or proc transpose so that a user without IML can still use the code (the fact that the code that started this thread was found on a website, suggests that it was meant for general use rather than something only used internally where you know what add-ons will be available).

Just another way that R makes life easier for both programmer and user.
#
On Fri, Feb 27, 2009 at 8:53 AM, Frank E Harrell Jr
<f.harrell at vanderbilt.edu> wrote:
To be fair, R depends on Perl (although this dependence seems to be
decreasing lately and possibly will be eliminated), LaTeX and a bunch of
Unix tools. Developing GUIs depends on Tcl/Tk or another external system,
and developing fast code can require that some of it be written in C.