[R-gui] The hidden costs of GPL software?

22 messages · Philippe GROSJEAN, Jan P. Smit, (Ted Harding) +12 more

#
Hello,

In the latest 'Scientific Computing World' magazine (issue 78, p. 22), there
is a review on free statistical software by Felix Grant ("doesn't have to
pay good money to obtain good statistics software"). As far as I know, this
is the first time that R is even mentioned in this magazine, given that it
usually discusses commercial products.

In this article, the analysis of R is interesting. It is admitted that R is
a great piece of software with a lot of potential, but: "All in all, R was a
good lesson in the price that may have to be paid for free software: I spent
many hours relearning some quite basic things taken for granted in the
commercial package." Those basic things relate to data import, producing
basic plots, etc., with a call for a more intuitive GUI in order to smooth
the learning curve a little.

There are several R GUI projects ongoing, but these are progressing very
slowly. The main reason is, I believe, that relatively few of the
programmers working on R are interested in this field. Most people wanting
such a GUI are basic users who do not (cannot) contribute... And if they
eventually become more knowledgeable, they tend to have other interests.

So, is this analysis correct: are there hidden costs for free software like
R in the time required to learn it? At least currently, for the people I
know (biologists, ecologists, oceanographers, ...), this is perfectly true.
This is even an insurmountable barrier for many of them I know, and they
have given up (they come back to Statistica, Systat, or S-PLUS using
exclusively functions they can reach through menus/dialog boxes).

Of course, the solution is to have a decent GUI for R, but this is a lot of
work, and I wonder if the intrinsic mechanism of GPL is not working against
such a development (leading to a very low pool of programmers actively
involved in the elaboration of such a GUI, in comparison to the very large
pool of competent developers working on R itself).

Do not misunderstand me: I am not giving up on my GUI project. I am just
wondering if there is a general, ineluctable mechanism that leads to the
current R / R GUI situation as it stands, and consequently to a "general
rule" that there are indeed, most of the time, "hidden costs" in free
software, due to the longer time required to learn it. I am sure there are
counter-examples; however, my feeling is that, for Linux, Apache, etc., the
GUI (if there is one) often lags well behind the potential of the software,
leading to a steep learning curve in order to use all these features.

I would be interested in your impressions and ideas on this topic.

Best regards,

Philippe Grosjean  

..............................................<?}))><........
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons-Hainaut University, Pentagone
( ( ( ( (    Academie Universitaire Wallonie-Bruxelles
 ) ) ) ) )   6, av du Champ de Mars, 7000 Mons, Belgium  
( ( ( ( (       
 ) ) ) ) )   phone: + 32.65.37.34.97, fax: + 32.65.37.33.12
( ( ( ( (    email: Philippe.Grosjean@umh.ac.be
 ) ) ) ) )      
( ( ( ( (    web:   http://www.umh.ac.be/~econum
 ) ) ) ) )
..............................................................
#
Dear Philippe,

Very interesting. The URL of the article is 
http://www.scientific-computing.com/scwsepoct04free_statistics.html.

Best regards,

Jan Smit
#
On 17-Nov-04 Philippe Grosjean wrote:
Hi Philippe,
Thanks for a most interesting post on this question. Further
comments below. Felix Grant's article is excellent, and well
balanced.
It would better represent the balanced view of the article
to further quote:

  "In fact, the whole file menu in R looks either elegantly
   uncluttered or frighteningly obscure, depending on your point
   of view."

  "It [the effort of learning] is the price paid, just as the
   dollars or euros for a commercial package would be. For
   that price, I've learned a great deal -- and not only
   about R. And I shall remember it when I next have to find
   a heavyweight solution for a big problem presented by a
   small charitable client with an invisible budget. It's a
   huge, awe-inspiring package -- easier to perceive as such
   because the power is not hidden beneath a cosmetic veneer."

This last remark is, in my view, particularly significant.
See below.
Non-GUI vs GUI is not intrinsically linked to Free Software
as such. There are well-known FS programs which are essentially
GUI-based -- as an easy example, consider all the FS Web
Browsers such as Netscape, Mozilla, ... . If you want the
graphics experiences offered by the Web, you're in a graphics
screen anyway, and so it may as well be programmed around
a GUI. Others, such as OpenOffice, have deliberately built
on a GUI approach in order to emulate The Other Thing.

There are a lot of FS programs which offer a GUI, usually
somewhat on the basic side, which nonetheless encapsulates
the entire functionality of the program and saves the user
the task of composing a possibly complex command-line or
even a script.

The comment "hidden beneath a cosmetic veneer" is, in my
view, somewhat directly linked to commercial software.
If you sell software, you want a big market. So you want
to include the people who will never learn how to work
software from a command line; and the sweeter the taste of
the eye candy, the more such people will feel enjoyment
in using the software. The fact that their usage is limited
to what has been pre-programmed into the menus is not going
to affect many such people, since typically their usage
is limited to a very small subset of what is in fact possible.
This in turn leads, of course, to the phenomenon of
"software-driven analysis", where people only do what the
GUI allows (or, more precisely, easily allows); and this
leads on in turn to a culture in which people tend to believe
that Statistics is what they can do with a particular
software package.

S-Plus does its best to compromise: as well as GUI access
to a pretty wide range of functions, there is the Command
Line Window where the user can explicitly type in commands.
(I dare say many R users, in S-Plus, may tend to work in
the latter since they are already used to it.) But, as always
in a GUI, one can tend to get lost in the ramifications.
Also, things like the big arrays of tiny icons you get when
you click on the "2D Plots" or "3D Plots" buttons in the
S-Plus toolbar can be trying on the eyes and time-consuming
to pick through.
Often, I think, in the Free Software world, people get involved
because they want to produce something which achieves a task.
Once they have a program which does that, then their aim is
satisfied. The GUI, in many cases, would be additional work
which would add nothing to what the software can do in terms
of tasks to be achieved. So in such cases, yes, I would tend
to agree that there is an intrinsic mechanism that discourages
work on a GUI for its own sake. You can add to that the fact
that once a developer has got to the point of creating such
software, successful in the tasks, they may have got beyond
the point at which they can readily sympathise with users who
have not acquired such skills: they no longer perceive, from
their own experience, that there is a problem.

However, this leaves people like you, having colleagues who
"come back to Statistica, Systat, or S-PLUS using exclusively
functions they can reach through menus/dialog boxes." By this
experience, you are aware of the problem, and rightly feel
that they would be helped by having access to the sort of
GUI/Menu interface that they are used to using.

One genuine benefit that the GUI offers, especially to
beginners with a particular software package, is that the
resources of the software can perhaps more easily and rapidly
be explored through the GUI, rather than searching laboriously
through the documentation of functions, extra packages, and
so on. This means that they more readily come to perceive
what is available, though of course this is limited to what
the GUI will show them. But a good "Help" window can break
that barrier.

Perhaps R itself is less helpful than it might be in this
respect. The R-help list bristles with queries of the form
"How can I do X?", which I think is evidence of a problem.
While some of these queries clearly originate from people
who have taken no trouble to explore readily accessible
information, many others can not be so easily dismissed.

If you know something about what you're after, once you
realise that a judiciously formulated "help.search" can
throw up a lot of possibilities you are well on your way.
So, for instance (as in a recent query about 2-D Fourier
transform for spatial data) 'help.search("fourier")' gives
relevant information.
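
A minimal sketch of such a search follows. Restricting it to the "stats" package is an assumption made here only to keep the output short; leaving out the `package` argument searches every installed package.

```r
# Search the help pages of installed packages for a keyword.
# The `package = "stats"` restriction is an assumption for brevity.
hits <- help.search("fourier", package = "stats")

# The result is an object of class "hsearch"; its `matches` component
# lists the help topics found and the package each one belongs to.
print(unique(hits$matches[, c("Topic", "Package")]))
```

Only packages that are actually installed are searched, which is the limitation discussed next.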

This, though, still fails for information in packages which
you have not installed. Perhaps I'm about to reveal my own
culpable ignorance here, but I'm not aware of a "full R info"
package which would be installed as part of R-base, being
a database of info about R-base itself and also every current
additional package, such that a "help.search" would show
all resources -- including those not installed -- which
match a query (and flag the non-installed ones as such so
that the user knows what to install for a particular purpose).

Whether this needs to be supplemented by a GUI is a point
that could be discussed from several points of view.
Philippe's biological/oceanographic users no doubt would
be considerably helped, provided they can in due course
come to the point where they can start to work "beyond
the GUI" (if indeed they need to).

Personally, however, I find that GUI work is slower and
more error-prone than command-line work. Swanning the
mouse around the screen, visually identifying icons and
buttons, clicking on this and that in order to see whether
it's what you want, and so on, is much more time-consuming
than typing in a command.
And God help you if you accidentally click on something
destructive!

I'll close with an immortal quotation (from Charles Curran,
of the UK Unix Users Group):

  "I can touch-type, but I can't touch-mouse"

Best wishes to all,
Ted.
                                                   /\
                                                 /   |
  .............................<?}))><........  :)    >=---
                                                 \   |
                                                   \/



--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 17-Nov-04                                       Time: 12:34:31
------------------------------ XFMail ------------------------------
#
On 11/17/04 12:34, Ted Harding wrote:
This is one of the purposes of my R search page.  I have all
packages installed.  You can also search the help list, etc., in
the same search.  Some people have bookmarks for it.  Of course
you need to be connected to the internet.

I think that any attempt to replicate this for a single user, or
even the packages, would be difficult.

BUT, it might help to install just the help pages for all
packages, without the packages themselves.  Then help.search()
would find things.  (I have no interest in figuring out how to do
this, but maybe someone else does.)

Jon
#
I'm a big advocate -- perhaps even fanatic -- of making R easier for
novices in order to spread its use, but I'm not convinced that a GUI
(at least in the traditional form) is the most valuable approach.

Perhaps an overly harsh summary of some of Ted Harding's statements
is: You can make a truck easier to get into by taking off the wheels, but
that doesn't make it more useful.

In terms of GUIs, I think what R should focus on is the ability for users
to make their own specialized GUI.  So that a knowledgeable programmer
at an installation can create a system that is easy for unsophisticated
users for the limited number of tasks that are to be done.  The ultimate
users may not even need to know that R exists.

I think Ted Harding was on the mark when he said that it is the help
system that needs enhancement.  I can imagine a system that gets the
user to the right function and then helps fill in the arguments; all of the
time pointing them towards the command line rather than away from
it.

The author of the referenced article highlighted some hidden costs of R,
but did not highlight the hidden benefits (because they were hidden from
him).  A big benefit of R is all of the bugs that aren't in it (which may or
may not be due to its free status).

Patrick Burns

Burns Statistics
patrick@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
#
On Wed, 17 Nov 2004 14:27:49 +0000, Patrick Burns
<pburns@pburns.seanet.com> wrote :
I think there is (slow) movement towards that.  Certainly it's
possible now (you can add menus to Rgui in Windows, you can do nice
things like Rcmdr using TCL/TK on any platform).   However, designing
a nice GUI is very hard work.
That would be helpful, and the only really difficult part would be the
first part:  getting the user to the right function.  help.search()
sometimes works, but often people ask for the wrong thing.

After that, R knows a lot about the structure of its help files, so it
could display all of the arguments with their defaults and the help
text that corresponds to each argument, as well as the help text for
the rest of the help file.
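
Part of this information can already be read from the function object itself. As a small sketch (the choice of `read.csv` is purely illustrative), the argument names and their defaults are available through `formals()`, and `args()` prints the signature:

```r
# Argument names of read.csv(), in the order they appear in the signature.
arg_names <- names(formals(read.csv))
print(arg_names)

# Default values are stored alongside the names.
default_sep    <- formals(read.csv)$sep      # the field separator default
default_header <- formals(read.csv)$header   # whether a header row is assumed
print(default_sep)
print(default_header)

# args() prints the whole signature as a user would type it.
args(read.csv)
```

Matching each argument to its descriptive help text would still need access to the help file itself, which this sketch does not attempt.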

Probably the main obstacle to getting this is finding someone with the
time and interest to do it.

Duncan Murdoch
#
This has been an interesting discussion. I make the following comment with 
hesitation, since I have neither the time nor the ability to implement it 
myself.

Using CLI software, an infrequent user has trouble remembering the known 
functions needed and trouble finding new ones (especially as that user gets 
older).  What might help is an added help facility more oriented towards 
tasks, rather than structured around functions or packages.

Such a help facility might have a tree structure.

Want help?  Are you looking for information on (1) data manipulation or (2) 
analysis?  If (1), do you want to (3) import or export data, (4) 
transform data, (5) reshape data, or (6) select data?  If (2), do you want 
to (7) fit a model or (8) make a graph?  And so on....

Once appropriate function(s) are located, the user would be directed (by 
hyperlinks) to the existing help framework.
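
A rough sketch of such a tree, written as a nested list, might look like the following. The function names attached to each task are illustrative assumptions, not a vetted catalogue, and `find_functions()` is a hypothetical helper.

```r
# A task-oriented index: branches are questions, leaves are function names.
# The leaves listed here are illustrative assumptions only.
task_tree <- list(
  "data manipulation" = list(
    "import or export data" = c("read.table", "write.table"),
    "transform data"        = c("transform"),
    "reshape data"          = c("reshape"),
    "select data"           = c("subset")
  ),
  "analysis" = list(
    "fit a model"  = c("lm", "glm"),
    "make a graph" = c("plot", "hist")
  )
)

# Hypothetical helper: walk down the tree along a sequence of choices.
find_functions <- function(tree, path) {
  node <- tree
  for (step in path) {
    node <- node[[step]]
  }
  node
}

# List the choices at the top level, then follow one branch to its leaves.
print(names(task_tree))
candidates <- find_functions(task_tree, c("analysis", "fit a model"))
print(candidates)
```

From there, `help("lm")` would hand the user over to the existing help page for the chosen function.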

That could help the problem of knowing what you want to do, but not what it 
is called.  I think that "Introductory Statistics with R" is a step in that 
direction for the basics, as MASS is for more complex matters.  The 
question is whether such material can be incorporated into a help system 
that will allow users to find, more easily, what they need.  That largely 
depends, it seems to me, on a great deal of work by volunteers.

I agree also with the suggestion that a dedicated editor (or add-in) that 
could supply arguments for functions might be considerable help.

MHP
#
Patrick Burns wrote:
I really agree with you Patrick.  To me the keys are having better help 
search capabilities, linking help files to case studies or at least 
detailed examples, having a navigator by keywords (a rudimentary one is 
at http://biostat.mc.vanderbilt.edu/s/finder/finder.html), having a 
great library of examples keyed by statistical goals (a la BUGS examples 
guides), and having a menu-driven skeleton code generator that gives 
beginners a starting script to edit to use their variable names, etc. 
Also I think we need a discussion board that has a better "memory" for 
new users, like some of the user forums currently on the web, or using a 
wiki.

Frank
#
Hi Everyone,

I've been lurking on the list for a while, but found this thread very
interesting.  Basically, I agree with Felix's article except for its
assumption that these problems are open source only.  I've used plenty
of crap commercial software as a professional programmer.

I used to work on a GUI for R, but found that continuing was impossible
for me.  Part of this was personal, but part was also related to how the
R community thinks of GUIs.  In general I found that the R community was
not that interested, and many people were violently opposed to a usable
GUI.  In the end I felt that my time would be better spent on something
that would actually be appreciated, rather than ridiculed as useless.

Having said that, I really would like to give this advice to people
trying to solve the "R needs a GUI" problem:

*  One half of R is already a GUI.  If it's not, then why are there so
many plotting functions?
*  R is also a programming language.  All this talk about CLI vs GUI
completely ignores this fact.
*  There are incredibly fantastic GUIs for many other programming
languages available.  Take a look at Eclipse for Java as a great example
of how to create a platform for a language and not get in the way of the
language.
*  Rather than focusing on this false "CLI vs. GUI" dichotomy, maybe
someone should sit down with users and analyze how they actually use the
system.  I assume that the folks on the R list would be pretty good at
analyzing user behavior.  These same people should also be good at
researching language usability.
*  I believe that you actually can have both CLI and GUI living happily,
but only after the "programming language", "data management", and
"plotting" parts are componentized and separated.
*  I found that R would need to go in one of two directions before a GUI
is feasible:  more like a compiler, or more like a service.
	* More like a compiler means to make R more like Python, Ruby, Perl,
and other languages:  have a compiler, make byte code, use a VM, and
have the UI (no matter what type) use this system.
	* More like a service means to turn R into a separate service that is a
"black box".  I tried this route with limited success by exposing the R
interpreter with a CORBA wrapper.

That's my .02 USD from having tried this once before.  I still use R
professionally, but I just don't bother with improving it.

I won't be responding to this message, but feel free to reply anyway.

Zed
On Wed, 2004-11-17 at 10:53 +0100, Philippe Grosjean wrote:

  
    
#
On 17 Nov 2004, at 2:27 pm, Patrick Burns wrote:

            
I think this is spot on.  My situation is that I am a scientist turned 
system administrator, and R is a package which I am increasingly being 
asked to install for the use of scientists at this Institute.  I am by 
no means a statistician;  the statistics I learned in A-level maths 
almost 20 years ago were as far as I got, and most of that I have 
forgotten.  But I like to have some understanding of the software 
packages I am asked to support, so I've been looking at R with a view 
to learning some of its more basic functions.  It looks potentially 
very useful to me anyway for summarising activity on the supercomputing 
cluster that I run.

So I'm a newbie to R, armed with only a very basic knowledge of 
statistics (I know the difference between a Normal and a Poisson 
distribution at least, and with a bit of prodding could probably 
remember a binomial distribution too).  I'm an experienced programmer 
in several languages, and a PhD-level scientist.

And yet I have still found R really quite hard to learn, and this is 
principally because the on-line help is a reference manual.  I'm sure 
it's a fabulous resource if you're a statistician who uses R every day, 
but for me it's not very helpful.

The R Intro PDF is good, but it would be nice if it were integrated 
better, with hyperlinks to the reference documentation, or to other 
parts of the introduction, for those platforms that support such things 
(it looks like this was intended for MacOS X, which is the version I am 
playing with for my own use, although the version I maintain for users 
is on Linux [ and would be on Alpha/Tru64 too if I could get it to pass 
its tests ]) but the on-line help link to the Intro on the Aqua R 
version brings up a blank page, so I'm using the generic PDF document 
instead.

I think the GUI question has nothing to do with the hidden costs of the 
GPL, or otherwise.  This is the age-old ease-of-use versus power and 
capability argument.

I don't think a fancy GUI is necessary - the GUI aspects that have been 
added to R on Mac OS X are sufficient.  I get the impression that the 
real power of R is the fact that really it's a programming language, 
and should probably be treated and learned as such.  Quite apart from 
the fact that a GUI will necessarily be a somewhat restricted subset of 
the total functionality, and a lot slower to use once you've taken the 
effort to learn the software, I think there is another danger, which I 
have already seen in other pieces of software in the bioinformatics 
community.  Users frequently run completely pointless analyses through 
the GUI wrappers we provide.  The users using the command line 
interfaces typically do much more sensible things.

If you make a piece of software trivial for a user to use without 
thinking about what they're doing, then the users won't think.  I may 
not know much about statistics, but what little I do know is that 
understanding exactly what form of analysis or significance test is 
required to be meaningful is a real skill that takes a lot of 
experience to master.   Having to perform that analysis with written 
commands means that your method is recorded, and could be published, 
and more importantly be checked and reproduced by other researchers.  
It also gives you ample time to think about what you're doing, rather 
than just bashing out a pretty graph which actually has no real meaning 
whatsoever.

Any GUI to R could (and should) be able to store the command line 
equivalent to what it has just done, to satisfy the reproducible 
criterion above, but I suspect it could still lead to some pretty 
shoddy work being done by careless and lazy scientists, and we get 
enough of that already.
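
One way a GUI could keep such a record is sketched below. The names `command_log` and `run_and_record()` are hypothetical, and this is only an illustration of the idea, not how any existing R GUI works: the dialog handler builds the call, stores its text, and only then evaluates it.

```r
# Log of the command-line equivalents of everything the GUI has run.
command_log <- character(0)

# Hypothetical helper a dialog handler could call instead of running
# the function directly: build the call, record its text, evaluate it.
run_and_record <- function(fun_name, ...) {
  cl <- as.call(c(as.name(fun_name), list(...)))
  command_log <<- c(command_log, paste(deparse(cl), collapse = " "))
  eval(cl, envir = globalenv())
}

# What a "compute the mean" button might do behind the scenes.
result <- run_and_record("mean", x = c(1, 2, 3, 10))

# The recorded text can be shown to the user, saved, or replayed later.
print(command_log)
replayed <- eval(parse(text = command_log[1]))
```

The log then serves as a script that another researcher can rerun and check.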

Tim
#
On 18 Nov 2004, at 10:27 am, Tim Cutts wrote:

            
I should correct myself here, and note that there are some 
cross-references within the PDF document, it's not completely devoid of 
them.

Tim
#
Hmmmm, interesting thread and minds will not be
changed but regarding GUIs...I thought S (aka R) was a
PROGRAMMING LANGUAGE with a statistical and numerical
slant, and not a statistics application. ;O)  

Certainly there is an important place for GUIs but I
believe that it is very much overemphasized in modern
computer culture. My experience and bias--and I
started in the 1960's-- is that except for 'trivial'
uses, GUIs are a detriment to any reasonably complex
CREATIVE computational task. They are adequate for the
simple, common task. But even then, typing a command
or two is not overly taxing--- particularly when
compared to navigating layer upon layer of submenus as
is sometimes needed. If I need to, I will add a
little syntactical sugaring when coding and move on. 

GUIs encourage a passive approach to using computers
when solving problems. In addition, it is regrettable
that a lot of people in the 'workplace' will carry out
incomplete and/or incorrect quantitative work because
of the real or perceived limitations of the particular
(GUI) apps they are using. There is no inclination to
go beyond the menu and even then many menu items
gather 'electronic dust'.

Finally, there are times for many of us when work
'goes home' at the end of the day. That just comes
with the territory. I (and most others) can not afford
the luxury of S-plus, Statistica, SPSS, etc. at home.
So in a sense there is a very real 'loss of
productivity' cost associated with using commercial
software. Now that does bring us around to supporting
R doesn't it? (Mea culpa. And I resolve to do better!)
What value does one put on the vitality of the R
community?

Best regards,
Michael Grant, Ph.D. 

* The requirements for creating packages are on
target,  and have the desired impact on both the
quality and breadth of R.
--- Philippe Grosjean <phgrosjean@sciviews.org> wrote:

            
#
I have found this discussion interesting, and Michael's comments seem to me to 
be right on the money :)

It seems to me that one thing we can learn from the commercial software world 
is the benefit of defining the "market segments" for R, and of identifying 
the needs and drivers of each segment. 

A rough starting point for that definition might include the following 
segments:

1) A Statistical or machine learning researcher needing a rapid development 
framework to implement and prototype her research;

2) Consultant statisticians / data analysts needing a flexible modelling tool 
in which to perform state-of-the-art analyses. A typical worker of this type 
will use R daily; she will have a sophisticated knowledge of statistics, and 
broad experience of programming.

3) Application builders or system integrators needing to use R to deliver 
packaged, and specialist functionality. A typical user of this type will have 
sophisticated knowledge of information systems and software engineering, and 
some knowledge of statistics.

4) Scientists, or other non statistical and non IT professionals needing a 
good solution for general purpose statistical analysis. Such a person will be 
familiar with spreadsheets and other office productivity tools, will have 
limited knowledge of statistics and limited knowledge of IT.


The needs of these segments are very different. Segments 1,2 and 3 will 
probably have little need for a GUI - but may benefit enormously from an IDE.

Segment 3 will benefit from having a rapid development tool for building GUIs.

Segment 4 will be looking for a GUI which insulates them from the technical 
details of the statistical models, and from the need to program analyses.

It seems to me that R addresses segments 1 and 2 brilliantly, that it has made 
real progress in segment 3 (though much more could be done) and that in its 
current form it makes little provision for segment 4.  

If R were a commercial operation, the decision would be simple - what is the 
size of each segment?, how much is each segment prepared to pay? what are the 
emerging requirements of each segment? How easily can each segment be 
supported with our existing architecture? what is the cost of providing for 
each segment? 

For a community it is more difficult - we all have a different profile in the 
"market" and therefore have different needs and differences of opinion about 
the best direction for the R project. I personally operate mostly in segment 
2 - and I do not need and would not use a GUI. I don't want anything to get 
in the way of the flexibility of a statistical language. I can't afford Splus 
- so I just have to put up with the superior performance, stability and focus 
of R :)  Whilst I may have sympathy for the needs of segment 4, I can't say 
that I feel especially motivated to do anything about them.


Kindest regards
Mervyn Thomas
#
On Wed, 17 Nov 2004 21:40:35 -0500, Zed Shaw <zedshaw@zedshaw.com>
wrote :
Building a nice GUI is definitely a lot of work, and not everyone will
appreciate it.  But the same can be said about just about any other
aspect of R.  You should go ahead with it if you like doing that sort
of thing, or have a need for it:  and expect that some people will
like your work, some won't, some will claim that their system is
better, etc.
Yes, there is definitely graphical output, and some limited graphical
input.
I've never used Eclipse.  The example I'd cite would be Delphi, and I
know there are others.
I'm not sure of that.  Being able to assess usability and user
behaviour is a pretty specialized task.  There are probably some folks
who can do that, but not very many.
The plotting part is already separated:  there is a defined binary
interface to the graphics system.

Separating the language from the data may not be a good idea:  one of
the goals of R is to allow language to be considered as data.
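
As a small illustration of language treated as data (the variable names are arbitrary), an unevaluated call can be captured, inspected, modified like a list, and then evaluated:

```r
# Capture a call without evaluating it.
expr <- quote(mean(x))
print(class(expr))

# The call can be taken apart and modified like a list:
# here the function being called is swapped for another one.
expr[[1]] <- as.name("median")
print(deparse(expr))

# Only now is the (modified) call evaluated.
x <- c(1, 2, 3, 10)
value <- eval(expr)
print(value)
```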
I think both of these directions are possible.  There has already been
progress in making R look more like a compiler:  the idea of packages
with namespaces is very much along that line.

But R is also a service.  Here R is still lacking, in that the main
interface to the service is through a teletype style interface.  ESS
shows that it's possible to use this interface to put a new GUI on top
of the service, but it would be a lot easier if the interface were
done at a lower level, and if there were a clearer separation between
user interface and service.  Designing this and getting it to work is
very hard work.

Duncan Murdoch
#
Hello,

I appreciate many comments and the various points of view, especially
because there are a couple of clear explanations why several people do not
need (or even do not want) a GUI for R!

Another part of the discussion seems to switch to the never-ending question
of "what kind of GUI"... which will never be answered, because there is not
one best GUI, and it also depends on the use (both the application and the
user). For a long time I have hesitated to propose, in R-SIG-GUI and on the
R GUI projects web site, a description of one or several "prototype"
GUI(s) we would like for R, with the intention of including all the good
ideas everybody on this list has.

I never did that, because I am pretty sure it is useless! Now, I feel that
one guy, with a clear view of what he wants, a lot of free time, a lot of
energy, and some decent skills in programming, is actually required to make
real what he has in his head! Indeed, it is such a huge amount of work that several
people are required! Here are the topics currently developed (sorry if I
don't cite Bioconductor stuff: I don't know it):

- Most of the "low-level" work is done, I think, like interface with
graphical toolkits: tcltk by Peter Dalgaard, of course, but many others
(Gtk, wxPython, ...), a better control of Rgui under Windows (ongoing,
Duncan Murdoch), ESS, ... All this is already available, even if one could
always argue that it is not optimal in some respects.

- A better console (multiple-lines editing, syntax coloring, code tip
presenting the syntax of a function when you type it, contextual completion
list, ...). This is an ongoing project in both JGR and SciViews-R.

- A better table editor: RKward team.

- A classical menus/dialog box approach: John Fox's R commander,

- An object explorer: JGR, RKward, SciViews-R, experimental functions in R,

- A "plug-in" approach, that is, a piece of code that brings a GUI for a
targeted analysis and builds R code for you: RKward team, but also some
functions in svDialogs (part of the SciViews bundle, R GUI API),

- Interactive documents mixing formatted text, graphs, etc... with R
input/output: Rpad, Sweave (not interactive), and some other,

- Rich-formatted output of R objects (in/out, views, reporting,...): Eric
Lecoutre's R2HTML + SciViews-R,

- Code editor with interaction with R: Tinn-R, WinEdt, Emacs, and many
others, 

- IDE (humm, some code editors are not so far away from an IDE, but there is
still some lack here),

- An R GUI API: SciViews.

I hope all these projects will continue, will mature, and their developers
will ultimately realize that they provide complementary pieces of a giant
puzzle and start to work together. This is when it will become most
exciting! I hope also that it will result in an original GUI that keeps most
of the spirit of R, that is, not a simplified point&click UI, leading to
meaningless analyses by lazy people, but a real tool whose goal is to make R
easier and faster to learn for beginners, and pretty usable for occasional
users.

Maybe I am just a dreamer, but all I read in this discussion reinforces my
conviction that an **innovative** GUI would be a good addition to R: most
criticisms clearly relate to the kind of inflexible GUI, with a forest of
menus and submenus, and other bad things one could find. I have never
advocated, and never will advocate, such a GUI!

For sure, the alternate GUI will only support you in writing R code, and
will deliver plenty of help to achieve this goal. I think it is possible...
with enough people collaborating in a common project! I think the latter
point is really the problem: not enough people, too many projects! Is it a
consequence of the way R is developed (GPL)? Well, I think so, but only
partly. It is also the consequence of ego (everybody wants to be the leader
of his own project), and a lack of communication (R-SIG-GUI is not what one
would call an active list!) Or, maybe, a "good GUI" for R is a fuzzy target
and it is not possible to crystallize enough power around a common goal to
reach it!

Anyway, although R GUI projects are progressing very slowly, I think that
only when we have a "good GUI" available for R will we be able to
evaluate whether there are really "hidden costs" in R, as Felix Grant
suggests in his paper.

Best regards and thank you all for your comments and suggestions.

Philippe Grosjean
#
On Thu, 2004-11-18 at 03:24 -0800, Michael Grant wrote:
"R is a language and environment for statistical computing and
graphics."


I think that this is a critical point and that there is, to my mind, a
false predicate at play here.

That predicate is that somehow one should be able to rapidly learn R (or
any programming language for that matter) solely via the available
online reference help or via the freely provided documentation (whether
via R Core or via Contributors).

How many people here have learned to use C, FORTRAN, SAS, VBA, Perl or
any other language strictly by using built-in reference help systems? If
any, it will be a very small proportion.

Sure, SAS comes with documentation that can be measured in hernia
inducing tonnage, but at a substantial annual cost, which I have
referenced here and elsewhere previously. R is free.

Is there anyone who has learned to code in C that does not have a copy
of K&R someplace on their shelf, probably along with copies of other
both general and application specific C references published by
Prentice-Hall, Addison-Wesley, McGraw-Hill or Hayden?

It has been years since I actively coded in C, but I have almost 3
shelves filled with C reference books. I have books dating back to the
early 80's for 80x86 Assembly, MS-DOS/BIOS interrupts and Windows API
technical references and other such books that I used to use on a daily
basis in a former life.

For Linux, I have two shelves filled with various O'Reilly and other
references running the gamut from general Linux stuff to Perl,
Procmail, Postfix, Bash, Regex, Emacs, Admin, Firewalls and others.

For R, I have most of a shelf filled with multiple references, including
three of the four editions of MASS (somehow I missed the 2nd edition). I
have a copy of Peter's ISwR (because on occasion I have an acute attack
of cerebral flatulence and have to go back to basics) along with copies
of Pinheiro & Bates, Fox, Maindonald & Braun, Krause & Olson, Everitt &
Rabe-Hesketh and V&R's S Programming. I have copies of the "White Book"
and the "Green Book" and I have copies of Harrell and Therneau &
Grambsch for specific applications of R.

There are a fair number of already published books on R/S with more
coming by Faraway, Heiberger & Holland, Verzani and others including a
new series from Springer.

My point being that the old philosophy of "No Pain, No Gain" is a
component of the learning curve with R. R is not going to be for
everybody. That's why there are other "point and click" statistical
_applications_ like JMP (albeit not cheap). They are relatively easy,
but at the same time, they are self-limiting. No single math/statistical
"product" is going to meet the needs of the entire spectrum of the
potential user space.

As I have mentioned previously, I am a firm believer in Pareto's 80/20
Rule. In this case, you develop a "product" to meet the needs of 80% of
your target user space, because you will go "bankrupt" meeting the needs
of the other 20%. Said differently, meeting the needs of the other 20%
will consume 80% of your development resources, restricting your ability
to meet the needs of the larger audience.

Having spent 12 years previously with a commercial medical software
company, I will also suggest that typically 20% of your user base will
consume 80% of your support resources.

I will also note that having been on both sides of that equation, the
support provided here within this community is superb and has no peer in
the commercial arena.

In R's case, the 80% of the user space has perhaps been extended by the
kind offerings of those who have made specialty packages available via
CRAN, BioC and others.

It takes a certain level of commitment and time with R to become
effective with it.

That commitment includes, in my mind, supplementing the available _free_
documentation that has kindly been provided by R Core and others, with
other available resources. That does not mean that everyone needs to get
on Amazon.com and spend hundreds of $YOUR_MONETARY_UNIT on books. Many
are available via libraries and/or other resources, especially for those
here in academic environments.

This is a community effort folks and not everything is going to be
provided to you free of charge, with that notion being either in actual
financial cost or time.

It appears that, since this is not the first time this subject has come
up, there is strong interest in building a c("new", "different",
"better", ...) documentation/help system for R. That's fine. For those
that have interest in pursuing this, perhaps the time has come for a
group to form a new r-sig-doc list and move forward with the development
of a framework for a new system that can be developed and implemented by
that same group and then provided back to the community. 

Writing technical and user documentation is a specialty skill set unto
itself and perhaps those with the requisite skill sets will contribute
them for the benefit of all.

For those that do not have the skills and/or the time to contribute, I
would urge you to financially contribute to the R Foundation in whatever
way you can afford. Through that mechanism you will support the
community at large and the future development and enhancement of R.

There is no "hidden cost" here and certainly not one that is unique to
GPL software. The cost is self-evident and it is measured in time and 
$YOUR_MONETARY_UNITs. "Time is money" as they say and that is the same
whether you are using GPL software or a commercial proprietary product. 

A key difference here, if any, is that none of us have paid anything for
R, where a portion of that "revenue" would go to support a dedicated
documentation team. In this case, it is "If you want it, you will need
to design and build it."

Best regards,

Marc Schwartz
#
On Wed, 17 Nov 2004, Mike Prager wrote:
...
...

Another good (non-GUI) tool for the CLI is keyword completion.  R in ESS
does this, giving you lists of possible functions, variables and objects,
or feedback if there isn't any.  R's CLI completes, but only with
filenames in the current directory.
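
As a rough illustration of what this kind of completion involves, here is a minimal sketch in R. It is not how ESS implements it; the helper name `complete_prefix` is hypothetical and simply collects every object name visible on the search path that begins with the text typed so far.

```r
# Hypothetical helper (not part of R or ESS): return the names of all
# objects visible on the search path that begin with `prefix`.
complete_prefix <- function(prefix) {
  # ls(pos = i) lists the objects in the i-th entry of search()
  candidates <- unlist(lapply(seq_along(search()), function(i) ls(pos = i)))
  # Literal prefix match, so the "." in names such as read.csv is not
  # treated as a regular-expression wildcard
  hits <- candidates[substr(candidates, 1, nchar(prefix)) == prefix]
  sort(unique(hits))
}

# Candidates a completing console might offer after typing "read.cs"
print(complete_prefix("read.cs"))
```

A real completion tool would also need to handle function arguments, list components and file names, which this sketch ignores.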

Dave
#
On Thu, 18 Nov 2004 03:24:01 -0800 (PST), Michael Grant
<mwgrant2001@yahoo.com> wrote:

            
I have to disagree with you.  What you say might be true about *bad*
GUIs, but I find nothing more frustrating than the lack of programming
support in R.

What's a nice GUI for programming?

You should be able to edit code, and have R parse the code that you
are editing.  The current disconnect between the source file and what
is in R makes debugging really painful.  I'd like to single step
through a function, and when I spot the error, *edit it right there*.
I'd like to be able to use the mouse to find the contents of a
variable as I'm single stepping.  I'd like code-completion to be
possible in the editor, and help hints based on what I'm typing.

All of these things have existed for years in IDEs (i.e. programming
GUIs), but most are not in R's GUIs.

That's one sort of GUI that R could have, but it's not the only one,
and it's not the one that I'd use.  However, I might start out
students on it.  There's a big benefit to a list of suggestions as
opposed to a big blank space.

A lot of people do incomplete or incorrect work because they don't
know any better.  It doesn't matter if they're using a GUI or not,
they'll do what they think they know, and get it wrong.

Duncan Murdoch
#
--- Duncan Murdoch <murdoch@stats.uwo.ca> wrote:

            
...
...
[snip] [snip] [snip]
...
I guess we'll just agree to disagree. :O)
1.) The LACK of programming support? Isn't that a bit
of an overstatement? There are materials available, as
of course you are aware. At one time or another many
of us may find it difficult to determine some 'key'
programming information at the moment. But you know
something, I've  had that happen using the packages
like you describe--this includes wired IDE help,
original documentation, and 3rd party books. I accept
that as a condition for using both free and commercial
software. And if the particular burden is too great,
then I don't use the product. Such is life :O)

2.) As you indicate below, R does not have a VB or
VC++ style IDE. R doesn't have the development
environment of Smalltalk or the commercial LISPs
(sigh...) But, really, an IDE is a bit more than a
GUI, wouldn't you agree? A GUI is just one component
of an IDE.

Perhaps part of our difference is how we view
programming. I view it more as a form of expression
using a LANGUAGE. Like any language, e.g., English,
French, Chinese, you have to develop a degree of
fluency to express yourself. Some people are
comfortable working with a phrase book and others put
more effort in to learn to converse sans book. Both
approaches are quite legitimate in that either can get
the job done. (And both can fail miserably!)

A real GUI would be nice at times even for a grump like
myself. And not having such is a cost. But in my case
that cost is not the deciding factor. The fact is, I
by preference do a lot of coding--both at the
quick/dirty scale and the project scale--in R that I
could do in C/C++, FORTRAN, BASIC. I have those tools
in commercial form with IDEs.

Why R? The turn around is so fast by comparison. R/S
is a language in which I can much more easily and
quickly express myself.  The development team has done
a lot of work developing my high-level language for me
:O). (Note--my second hacking language is  lisp-stat,
also an interpreted, higher functionality language.) I
don't use most of R's capabilities, and 'not knowing
that which I do not know' is not an issue. When I need
something new I am able to learn it incrementally on
top of what I already know.
...
Did I suggest banning GUIs? I don't think so. Your
world is one where there are benefits for your
clients--the students. My world is turn around and
documentation. Coding is easier to document than a
complex sequence of menu actions. Indeed I would get
laughed out of Dodge City if I documented a set of
calculations: " next I clicked ...". It's just that
different requirements lead to different needs.
...
[snip]
...
Of course that is the case, but the limitations in a
given GUI are one more thing that puts such people in
a rationalized comfort zone with their actions.
(Typically I see this with EXCEL apps--99.9% of the
people in my trade run away from statistical
software.)
More than once I have seen this occur in a senior
scientist review capacity after management has seen
the product and 'accepted' its results. Doom, doom,
doom...shoot the messenger! Oh woe, oh woe!:O(.

Best regards, Duncan
Michael
#
Note that there are bits and pieces of IDE components within ESS,
provided that you use some of the other tools available (in
particular, ECB).  I've not finished integration, but it provides
tools comparable to JDEE (the Emacs Java IDE), which isn't far off
from Eclipse in many ways.

Currently, it does provide limited source code navigation within the
file, for example to functions and "data assignments".  Next would be
some of the code generation tools, but that becomes tricky.

Applications programming is NOT statistical programming, and this
point needs to be hammered in, sometimes.  There is a duality with R
(and similar interactive (not necessarily interpreted) programming
languages used for data analysis such as Lisp, Perl, and Python).

An IDE for statistical analysis is different than what one wants from
a GUI, or from an IDE for applications programming.

best,
-tony

On Fri, 19 Nov 2004 04:37:04 -0800 (PST), Michael Grant
<mwgrant2001@yahoo.com> wrote:

  
    
#
On Fri, 19 Nov 2004 13:50:59 +0100, "A.J. Rossini"
<blindglobe@gmail.com> wrote:

            
Yes, definitely, and if R itself had more of the infrastructure to
support a full IDE, I imagine you'd expose it in ESS as soon as anyone
did.

I can see the need for differences between IDEs for interactive vs
compiled operation, but what sorts of differences do you think are
specific to statistical programming vs application programming?

By the way, I think we're using "GUI" differently.  For reference,
when I use it I'm distinguishing it from a teletype style command line
interface.  In my usage, vi and Emacs are both GUIs (though vi is a
pretty limited one).  Command line R is not.  Windows Rgui is mixed,
in that the console acts like a teletype (you can only add input at
the bottom), but there are also GUI elements.

Duncan Murdoch
2 days later
#
I don't think we are using GUI differently.  I'll reference some
experiments that I disliked that used Emacs/ESS pull-downs in much the
same style as SPSS/Minitab/etc.

More amusing, I actually got them to "write and evaluate" SAS
equivalents for regression model fitting as well as the same for R.

I didn't like them.  I'm not about to publish them.  They slowly
become an RSI hazard.  The only nice use was that they were templates
(and I should say that forming a general notion of statistical
templates is on my "to-do" list, probably more like a "never-do"
list).

With respect to applications programming and "statistical
programming", I could write a book.  Here is a subset of issues, and
it's not complete:

applications programs have to be mildly optimized (i.e. can't be too
slow), and are generally left alone except for maintenance.
Generally, there is a specification to which the program has to be
written, and metrics for evaluation are based on code quality, not
just the output.


statistical programs (and I'm thinking "data analysis", not
infrastructure coding or refactoring analytics to be more general)
usually have to be "finished", rather than "almost done".   Usually no
optimization is done, and extensions happen in a less specified way. 
More importantly, in ideal cases they follow but don't religiously obey
an analysis plan (usually due to holes in the plan, not the program)
and metrics are based on comprehensiveness of results rather than code
metrics or the "path taken".

Sometimes application programs have to be written to get the job done
(thinking of applications programming as a subset) and other times,
it's just a pile of code (or the GUI generated hodgepodge mix of MS
Excel).

best,
-tony
On Fri, 19 Nov 2004 22:55:08 -0500, Duncan Murdoch <murdoch@stats.uwo.ca> wrote: