Skip to content

SAS or R software

2 messages · Michael Friendly, Frank E Harrell Jr

#
I can't resist dipping my oar in here:

For me, some significant advantages of SAS are

- Ability to input data in almost *any* conceivable form using the 
combination of features available through
input/infile statements, SAS informats and formats, data step 
programming, etc.  Dataset manipulation
(merge, join, stack, subset, summarize, etc.) also favors SAS in more 
complex cases, IMHO.
- Output delivery system (ODS):  *Every* piece of SAS output is an 
output object that can be captured as
a dataset, rendered in RTF, LaTeX, HTML, PDF, etc. with a relatively 
simple mechanism (including graphs)
    ods pdf file='mystuff.pdf'';
      << any sas stuff>>
    ods pdf close;
- If you think the output tables are ugly, it is not difficult to define 
a template for *any* output to display
it the way you like.
- ODS Graphics (new in SAS 9.1) will extend much of this so that 
statistical procedures will produce many
graphics themselves with ease.

One significant disadvantage for me is the difficulty of composing 
multipanel graphic displays
(trellis graphics, linked micromaps, etc.) due to the lack of an 
overall, top-down graphics environment.
As well, there are a variety of kinds of graphics I've found 
extraordinarily frustrating to try to do
in SAS because of lack of coherence or generality in the output 
available from procedures --- an
example would be effect displays, such as implemented in R in the 
effects package.  

I can't agree, however, with Frank Harrell that SAS produces 'the worst 
graphics in the statistical software world.'
One can get ugly graphs in R almost as easily in SAS just by accepting 
the 80-20 rule:  You can typically get 80% of
what you want with 20% of the effort.  To get what you really want takes 
the remaining 80% of effort.
On the other hand, the active hard work of many R developers has led to 
R graphics for which the *default*
results for many graphs avoid many of the egregious design errors 
introduced in SAS in the days of pen-plotters
(+ signs for points, cross-hatching for area fill).

-Michael
#
Michael Friendly wrote:
I respectfully disagree Michael, in cases where the file sizes are not 
enormous.  And in many cases repetitive SAS data manipulation can be 
done much easier using vectors (e.g., vectors of recodes) and matrices 
in R, as shown in the examples in Alzola & Harrell 
(http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf).

We are working on a project in which a client sends us 50 SAS datasets. 
  Not only can we do all the complex data manipulations needed in R, but 
we can treat the 50 datasets as a array (actually a list( )) to handle 
repetitive operations over many datasets.
I would like to see SAS ODS duplicate Table 10 in 
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/StatReport/summary.pdf

Cheers,

Frank