Hello,
In the latest 'Scientific Computing World' magazine (issue 78, p. 22), there
is a review of free statistical software by Felix Grant ("doesn't have to
pay good money to obtain good statistics software"). As far as I know, this
is the first time that R is even mentioned in this magazine, given that it
usually discusses commercial products.
In this article, the analysis of R is interesting. It is admitted that R is
great software with a lot of potential, but: "All in all, R was a good
lesson in the price that may have to be paid for free software: I spent many
hours relearning some quite basic things taken for granted in the commercial
package." Those basic things are related to data import, producing
basic plots, etc., with a call for the missing, more intuitive GUI that
would smooth the learning curve a little.
There are several R GUI projects ongoing, but these are progressing very
slowly. The main reason is, I believe, that relatively few of the
programmers working on R are interested in this field. Most people wanting
such a GUI are basic users who do not (cannot) contribute... And if they
eventually become more knowledgeable, they tend to have other interests.
So, is this analysis correct: are there hidden costs for free software like
R in the time required to learn it? At least currently, for the people I
know (biologists, ecologists, oceanographers, ...), this is perfectly true.
This is even an insurmountable barrier for many of them, and they
have given up (they come back to Statistica, Systat, or S-PLUS using
exclusively functions they can reach through menus/dialog boxes).
Of course, the solution is to have a decent GUI for R, but this is a lot of
work, and I wonder if the intrinsic mechanism of the GPL is not working against
such a development (leading to a very small pool of programmers actively
involved in building such a GUI, in comparison to the very large
pool of competent developers working on R itself).
Do not misunderstand me: I am not giving up on my GUI project, I am just
wondering if there is a general, ineluctable mechanism that leads to the
current R / R GUI situation as it stands... and consequently to a "general
rule" that there are indeed, most of the time, "hidden costs" in free
software, due to the longer time required to learn it. I am sure there are
counter-examples; however, my feeling is that, for Linux, Apache, etc., the
GUI (if there is one) often lags way behind the potential of
the software, leading to a steep learning curve in order to use all these
features.
I would be interested in your impressions and ideas on this topic.
Best regards,
Philippe Grosjean
..............................................<??}))><........
) ) ) ) )
( ( ( ( ( Prof. Philippe Grosjean
) ) ) ) )
( ( ( ( ( Numerical Ecology of Aquatic Systems
) ) ) ) ) Mons-Hainaut University, Pentagone
( ( ( ( ( Academie Universitaire Wallonie-Bruxelles
) ) ) ) ) 6, av du Champ de Mars, 7000 Mons, Belgium
( ( ( ( (
) ) ) ) ) phone: + 32.65.37.34.97, fax: + 32.65.37.33.12
( ( ( ( ( email: Philippe.Grosjean at umh.ac.be
) ) ) ) )
( ( ( ( ( web: http://www.umh.ac.be/~econum
) ) ) ) )
..............................................................
The hidden costs of GPL software?
Dear Philippe,
Very interesting. The URL of the article is http://www.scientific-computing.com/scwsepoct04free_statistics.html.
Best regards,
Jan Smit
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On 17-Nov-04 Philippe Grosjean wrote:
Hello,
In the latest 'Scientific Computing World' magazine
(issue 78, p. 22), there is a review of free statistical
software by Felix Grant ("doesn't have to pay good money
to obtain good statistics software"). As far as I know,
this is the first time that R is even mentioned in this
magazine, given that it usually discusses commercial products.
Hi Philippe, Thanks for a most interesting post on this question. Further comments below. Felix Grant's article is excellent, and well balanced.
In this article, the analysis of R is interesting. It is admitted that R is great software with a lot of potential, but: "All in all, R was a good lesson in the price that may have to be paid for free software: I spent many hours relearning some quite basic things taken for granted in the commercial package." Those basic things are related to data import, producing basic plots, etc., with a call for the missing, more intuitive GUI that would smooth the learning curve a little.
It would better represent the balanced view of the article to further quote:
"In fact, the whole file menu in R looks either elegantly uncluttered or frighteningly obscure, depending on your point of view."
"It [the effort of learning] is the price paid, just as the dollars or euros for a commercial package would be. For that price, I've learned a great deal -- and not only about R. And I shall remember it when I next have to find a heavyweight solution for a big problem presented by a small charitable client with an invisible budget. It's a huge, awe-inspiring package -- easier to perceive as such because the power is not hidden beneath a cosmetic veneer."
This last remark is, in my view, particularly significant. See below.
There are several R GUI projects ongoing, but these are progressing very slowly. The main reason is, I believe, that relatively few of the programmers working on R are interested in this field. Most people wanting such a GUI are basic users who do not (cannot) contribute... And if they eventually become more knowledgeable, they tend to have other interests. So, is this analysis correct: are there hidden costs for free software like R in the time required to learn it? At least currently, for the people I know (biologists, ecologists, oceanographers, ...), this is perfectly true. This is even an insurmountable barrier for many of them, and they have given up (they come back to Statistica, Systat, or S-PLUS using exclusively functions they can reach through menus/dialog boxes).
Non-GUI vs GUI is not intrinsically linked to Free Software as such. There are well-known FS programs which are essentially GUI-based -- as an easy example, consider all the FS Web Browsers such as Netscape, Mozilla, ... . If you want the graphics experiences offered by the Web, you're in a graphics screen anyway, and so it may as well be programmed around a GUI. Others, such as OpenOffice, have deliberately built on a GUI approach in order to emulate The Other Thing. There are a lot of FS programs which offer a GUI, usually somewhat on the basic side, which nonetheless encapsulates the entire functionality of the program and saves the user the task of composing a possibly complex command-line or even a script.
The comment "hidden beneath a cosmetic veneer" is, in my view, somewhat directly linked to commercial software. If you sell software, you want a big market. So you want to include the people who will never learn how to work software from a command line; and the sweeter the taste of the eye candy, the more such people will feel enjoyment in using the software. The fact that their usage is limited to what has been pre-programmed into the menus is not going to affect many such people, since typically their usage is limited to a very small subset of what is in fact possible.
This in turn leads, of course, to the phenomenon of "software-driven analysis", where people only do what the GUI allows (or, more precisely, easily allows); and this leads on in turn to a culture in which people tend to believe that Statistics is what they can do with a particular software package.
S-Plus does its best to compromise: as well as GUI access to a pretty wide range of functions, there is the Command Line Window where the user can explicitly type in commands. (I dare say many R users, in S-Plus, may tend to work in the latter since they are already used to it.) But, as always in a GUI, one can tend to get lost in the ramifications. Also, things like the big arrays of tiny icons you get when you click on the "2D Plots" or "3D Plots" buttons in the S-Plus toolbar can be trying on the eyes and time-consuming to pick through.
Of course, the solution is to have a decent GUI for R, but this is a lot of work, and I wonder if the intrinsic mechanism of the GPL is not working against such a development (leading to a very small pool of programmers actively involved in building such a GUI, in comparison to the very large pool of competent developers working on R itself). Do not misunderstand me: I am not giving up on my GUI project, I am just wondering if there is a general, ineluctable mechanism that leads to the current R / R GUI situation as it stands... and consequently to a "general rule" that there are indeed, most of the time, "hidden costs" in free software, due to the longer time required to learn it. I am sure there are counter-examples; however, my feeling is that, for Linux, Apache, etc., the GUI (if there is one) often lags way behind the potential of the software, leading to a steep learning curve in order to use all these features.
Often, I think, in the Free Software world, people get involved
because they want to produce something which achieves a task.
Once they have a program which does that, then their aim is
satisfied. The GUI, in many cases, would be additional work
which would add nothing to what the software can do in terms
of tasks to be achieved. So in such cases, yes, I would tend
to agree that there is an intrinsic mechanism that discourages
work on a GUI for its own sake. You can add to that the fact
that once a developer has got to the point of creating such
software, successful in the tasks, they may have got beyond
the point at which they can readily sympathise with users who
have not acquired such skills: they no longer perceive, from
their own experience, that there is a problem.
However, this leaves people like you, having colleagues who
"come back to Statistica, Systat, or S-PLUS using exclusively
functions they can reach through menus/dialog boxes." By this
experience, you are aware of the problem, and rightly feel
that they would be helped by having access to the sort of
GUI/Menu interface that they are used to using.
One genuine benefit that the GUI offers, especially to
beginners with a particular software package, is that the
resources of the software can perhaps more easily and rapidly
be explored through the GUI, rather than searching laboriously
through the documentation of functions, extra packages, and
so on. This means that they more readily come to perceive
what is available, though of course this is limited to what
the GUI will show them. But a good "Help" window can break
that barrier.
Perhaps R itself is less helpful than it might be in this
respect. The R-help list bristles with queries of the form
"How can I do X?", which I think is evidence of a problem.
While some of these queries clearly originate from people
who have taken no trouble to explore readily accessible
information, many others cannot be so easily dismissed.
If you know something about what you're after, once you
realise that a judiciously formulated "help.search" can
throw up a lot of possibilities you are well on your way.
So, for instance (as in a recent query about 2-D Fourier
transform for spatial data) 'help.search("fourier")' gives
relevant information.
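A minimal sketch of that kind of search, assuming a standard R installation with its help pages installed. The restriction to the stats package is only there to keep the example quick, and the variable names are illustrative:

```r
# Search the installed help pages for a keyword. By default help.search()
# looks at aliases, concepts and titles of the help files.
res <- help.search("fourier", package = "stats")

# The result is an object of class "hsearch"; its 'matches' component
# holds one row per hit.
n_hits <- nrow(res$matches)
print(head(res$matches))
```

Dropping the package argument searches every installed package, which is the behaviour discussed here; it still cannot see packages that are not installed.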
This, though, still fails for information in packages which
you have not installed. Perhaps I'm about to reveal my own
culpable ignorance here, but I'm not aware of a "full R info"
package which would be installed as part of R-base, being
a database of info about R-base itself and also every current
additional package, such that a "help.search" would show
all resources -- including those not installed -- which
match a query (and flag the non-installed ones as such so
that the user knows what to install for a particular purpose).
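The "flag the non-installed ones" part of this idea can be sketched in a few lines, under the assumption that some wider search has already produced a list of candidate package names. The helper name flag_installed and the second package name are hypothetical, used only for illustration:

```r
# Hypothetical helper: given package names turned up by some wider search,
# mark which ones are already installed locally.
flag_installed <- function(pkgs) {
  have <- rownames(installed.packages())
  data.frame(package = pkgs,
             installed = pkgs %in% have,
             stringsAsFactors = FALSE)
}

# "stats" ships with R; the second name is a made-up placeholder.
status <- flag_installed(c("stats", "notARealPackage123"))
print(status)
```

This says nothing about how the candidate names would be found in the first place, which is the harder half of the problem raised here.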
Whether this needs to be supplemented by a GUI is a point
that could be discussed from several points of view.
Philippe's biological/oceanographic users no doubt would
be considerably helped, provided they can in due course
come to the point where they can start to work "beyond
the GUI" (if indeed they need to).
Personally, however, I find that GUI work is slower and
more error-prone than command-line work. Swanning the
mouse around the screen, visually identifying icons and
buttons, clicking on this and that in order to see whether
it's what you want, and so on, is much more time-consuming
than typing in a command.
And God help you if you accidentally click on something
destructive!
I'll close with an immortal quotation (from Charles Curran,
of the UK Unix Users Group):
"I can touch-type, but I can't touch-mouse"
Best wishes to all,
Ted.
I would be interested by your impressions and ideas on this topic. Best regards, Philippe Grosjean ..............................................<??}))><........
/\
/ |
.............................<??}))><........ :) >=---
\ |
\/
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 17-Nov-04 Time: 12:34:31
------------------------------ XFMail ------------------------------
On 11/17/04 12:34, Ted Harding wrote:
This, though, still fails for information in packages which you have not installed. Perhaps I'm about to reveal my own culpable ignorance here, but I'm not aware of a "full R info" package which would be installed as part of R-base, being a database of info about R-base itself and also every current additional package, such that a "help.search" would show all resources -- including those not installed -- which match a query (and flag the non-installed ones as such so that the user knows what to install for a particular purpose).
This is one of the purposes of my R search page. I have all packages installed. You can also search the help list, etc., in the same search. Some people have bookmarks for it. Of course you need to be connected to the internet.
I think that any attempt to replicate this for a single user, or even the packages, would be difficult. BUT, it might help to install just the help pages for all packages, without the packages themselves. Then help.search() would find things. (I have no interest in figuring out how to do this, but maybe someone else does.)
Jon
Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron R search page: http://finzi.psych.upenn.edu/
I'm a big advocate -- perhaps even fanatic -- of making R easier for novices in order to spread its use, but I'm not convinced that a GUI (at least in the traditional form) is the most valuable approach.
Perhaps an overly harsh summary of some of Ted Harding's statements is: You can make a truck easier to get into by taking off the wheels, but that doesn't make it more useful.
In terms of GUIs, I think what R should focus on is the ability for users to make their own specialized GUI, so that a knowledgeable programmer at an installation can create a system that is easy for unsophisticated users for the limited number of tasks that are to be done. The ultimate users may not even need to know that R exists.
I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it.
The author of the referenced article highlighted some hidden costs of R, but did not highlight the hidden benefits (because they were hidden from him). A big benefit of R is all of the bugs that aren't in it (which may or may not be due to its free status).
Patrick Burns
Burns Statistics
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
On Wed, 17 Nov 2004 14:27:49 +0000, Patrick Burns <pburns at pburns.seanet.com> wrote :
In terms of GUIs, I think what R should focus on is the ability for users to make their own specialized GUI, so that a knowledgeable programmer at an installation can create a system that is easy for unsophisticated users for the limited number of tasks that are to be done. The ultimate users may not even need to know that R exists.
I think there is (slow) movement towards that. Certainly it's possible now (you can add menus to Rgui in Windows, you can do nice things like Rcmdr using TCL/TK on any platform). However, designing a nice GUI is very hard work.
I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it.
That would be helpful, and the only really difficult part would be the first part: getting the user to the right function. help.search() sometimes works, but often people ask for the wrong thing. After that, R knows a lot about the structure of its help files, so it could display all of the arguments with their defaults and the help text that corresponds to each argument, as well as the help text for the rest of the help file. Probably the main obstacle to getting this is finding someone with the time and interest to do it. Duncan Murdoch
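The second half of this idea, showing a function's arguments with their defaults, can already be approximated with base R. The following is only a sketch of that one step, not of a full help front end; the helper name show_signature is made up for illustration:

```r
# args() returns a skeleton function carrying the same formal arguments,
# so deparsing it yields the signature, defaults included, as text.
show_signature <- function(fname) {
  f <- match.fun(fname)
  sig <- deparse(args(f))
  # The last deparsed line is the skeleton's NULL body; drop it.
  paste(trimws(sig[-length(sig)]), collapse = " ")
}

sig <- show_signature("read.csv")
arg_names <- names(formals(read.csv))
cat(sig, "\n")
```

Attaching the per-argument help text to each entry would require parsing the Rd help files themselves, which this sketch does not attempt.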
All:
I have much enjoyed the discussion. Thanks to all who have contributed. Two quick comments:
1. The problem of designing a GUI to make R's functionality more accessible is, I believe, just one component of the larger issue of making statistical/data analysis functionality available to those who need to use it but do not have sufficient understanding and background to do so properly. I certainly include myself in this category in many circumstances. A willingness and commitment to learning ( = hard work!) is the only rational solution here, and saying that one doesn't have the time really doesn't cut it for me. Ditto for R language functionality?
2. However, R has many attractive features for data manipulation and graphics that make it attractive for common tasks that are now done most frequently with (ugh!) Excel (NOT Statistica, Systat, et al.). For this subset of R's functionality a GUI would be attractive. However, writing a good GUI for graphing that even begins to take advantage of R's flexibility and power in this arena is an enormous -- perhaps an impossible -- task. Witness the S-Plus graphics GUI, which I think is truly awful (and appears to thwart more than it helps, at least from many of the queries one sees on that news list). So I'm not sanguine.
Again, thanks to all for a thoughtful and enjoyable discussion.
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box
I agree with Bert. Thanks to all who contributed. I'd like to
add one comment I didn't see in the thread so far:
The corporate legal department where I work is deathly afraid of the GNU
General Public License (GPL), because if we touch GPL software
inappropriately with our commercial software, our copyrights are
replaced by the GPL. This in turn means we can't charge royalties,
which means we can't repay the investors who covered our initial
development costs, and we file for bankruptcy. The rabid capitalists
meet the rabid socialists and walk away, shaking their heads. (Sec. 2.b
of the GPL: "You must cause any work that you distribute or publish,
that in whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License." We can get around this by
packaging accesses to GPL software as separately installed add-on(s),
because then only the add-on(s) would be covered by the GPL.) Our
corporate legal department is more concerned about a possible lawsuit from a
possible competitor than from the R Foundation, but the threat is still
real and still being adjudicated in other cases.
If the GPL were not so tight on this point, someone could
commercialize a GUI for R without having to offer their source code
under the GPL.
However, even without this change, R seems to be the platform of
choice for new statistical algorithm development by a growing portion of
the international scientific community. Moreover, from my experience
with this listserve, the technical support here is far superior to
anything I've experienced with any other software in the 40+ years since
I wrote my first Fortran code.
Best Wishes,
spencer graves
Spencer Graves, PhD, Senior Development Engineer O: (408)938-4420; mobile: (408)655-4567
This has been an interesting discussion. I make the following comment with hesitation, since I have neither the time nor the ability to implement it myself.
Using CLI software, an infrequent user has trouble remembering the known functions needed and trouble finding new ones (especially as that user gets older). What might help is an added help facility more oriented towards tasks, rather than structured around functions or packages.
Such a help facility might have a tree structure. Want help? Are you looking for information on (1) data manipulation or (2) analysis? If (1), do you want to (3) import or export data, (4) transform data, (5) reshape data, or (6) select data? If (2), do you want to (7) fit a model or (8) make a graph? And so on.... Once appropriate function(s) are located, the user would be directed (by hyperlinks) to the existing help framework.
That could help the problem of knowing what you want to do, but not what it is called.
I think that "Introductory Statistics with R" is a step in that direction for the basics, as MASS is for more complex matters. The question is whether such material can be incorporated into a help system that will allow users to find, more easily, what they need. That largely depends, it seems to me, on a great deal of work by volunteers.
I agree also with the suggestion that a dedicated editor (or add-in) that could supply arguments for functions might be considerable help.
MHP
Michael Prager, Ph.D. Population Dynamics Team, NMFS SE Fisheries Science Center NOAA Center for Coastal Fisheries and Habitat Research Beaufort, North Carolina 28516 http://shrimp.ccfhrb.noaa.gov/~mprager/
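The task tree proposed in the message above can be illustrated with a small nested list. This is only a minimal sketch: the names `task_tree` and `suggest_functions` are hypothetical, the branches follow the questions in the message, and the functions at the leaves are existing base R functions chosen here as plausible examples, not an agreed classification.

```r
# Hypothetical task tree: branch names follow the questions in the message;
# the leaves name existing R functions (an illustrative selection only).
task_tree <- list(
  "data manipulation" = list(
    "import or export data" = c("read.table", "read.csv", "write.table"),
    "transform data"        = c("transform", "within"),
    "reshape data"          = c("reshape", "stack"),
    "select data"           = c("subset")
  ),
  "analysis" = list(
    "fit a model"  = c("lm", "glm"),
    "make a graph" = c("plot", "hist")
  )
)

# Walk down the tree along a path of choices and return whatever sits at
# the end of that path (a vector of function names at a leaf).
suggest_functions <- function(tree, path) {
  node <- tree
  for (choice in path) {
    if (!is.list(node) || is.null(node[[choice]])) {
      stop("unknown choice: ", choice)
    }
    node <- node[[choice]]
  }
  node
}

picks <- suggest_functions(task_tree, c("data manipulation", "reshape data"))
print(picks)
```

Each suggested name can then be looked up in the existing help system, for example with help("reshape"), which is the hand-over to the current reference pages that the message describes.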
Thank you all (+ a couple of offline comments) on this topic. To summarize your comments:
- "Hidden" costs, maybe better called "indirect" costs, are not so easy to calculate. The cited paper (http://www.scientific-computing.com/scwsepoct04free_statistics.html) offers interesting advice from someone used to testing and writing about commercial software. The whole context around the use of a (statistical) software package should be taken into account, which would also reveal indirect costs for commercial packages. It is the Total Cost of Ownership (TCO) that should be considered in this context.
- This discussion is connected with the many discussions for and against an R GUI, or any other tool that would facilitate the use of R at the price of losing one big advantage: currently, you have to know what you are doing to get a result with R... What kind of nonsense would we get from naive people if they could obtain results with little or no knowledge?
- R is viewed by some as a statistical development platform, mainly for the scientific community. It excels there, but is it even desirable to have it also used "by the masses"?
- ***Many of you ask for a better help system to find a function more easily, rather than for a GUI***. I think this point is very important and should be placed somewhere high in the "to do" list in order to make R more accessible to beginners/occasional users!
- There is no possibility to make a commercial GUI for R (thanks to the GPL), and volunteer R developers tend to work on a problem until they get the solution they need... This rarely leads to the development of a GUI on top of it for statistical analyses. In this way, yes, there is an intrinsic mechanism that makes R a program by experts, for experts.
- A GUI could cover only the bare essentials, is rather inflexible, etc... For all these reasons, how would it help to learn such a feature-rich environment as R? This is not the solution to the problem.
- It is more a question of education: it takes as much time to find a function in a menu/dialog box as to consult help pages to find the right function. However, some categories of people are more accustomed to clicking and dragging than to reading help pages!
- GUIs, by providing access to a limited number of analyses in an inflexible way, lead to the phenomenon of "software-driven analysis", where the way data are analyzed depends on the software used.
- Only commercial software cares about eye candy to make clients happier to use it (and thus to sell more); "hidden beneath a cosmetic veneer" in the original paper. R does not care, because there is nothing to sell. So, as a consequence, you face the bare power, but sorry, no eye candy!
- GUI work is slower and more error-prone... So, this should be considered in the hidden costs AFTER the learning stage... in favor of R!
- "User-friendly" software tends to make a lot of assumptions (to present the analysis in an easier way) and does not tell you about them. These could lead to nonsense in some cases, and the user does not even know, precisely because these assumptions are not explained!
- The author of the paper talks about hidden costs, but he does not talk about hidden benefits, because he does not even notice them: ***all the bugs that aren't in it*** (I add: transparency of the code + the possibility for everyone to propose a patch = a big part of the success of Open Source software, especially for data analysis software)!
That's all, I think, for the summary! Otherwise:
Patrick Burns <pburns at pburns.seanet.com> wrote :
[...] I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it.
Duncan Murdoch [murdoch at stats.uwo.ca] answered:
That would be helpful, and the only really difficult part would be the first part: getting the user to the right function. help.search() sometimes works, but often people ask for the wrong thing.
After that, R knows a lot about the structure of its help files, so it could display all of the arguments with their defaults and the help text that corresponds to each argument, as well as the help text for the rest of the help file.
Probably the main obstacle to getting this is finding someone with the time and interest to do it.
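The two steps described in this reply (getting to the right function, then showing its arguments and defaults) can already be approximated at the command line with functions that ship with R. The following is a minimal sketch, not the integrated help facility being discussed; the search pattern and the choice of read.table are only examples.

```r
# Step 1: find candidate functions when only part of the name is known.
# apropos() matches a regular expression against the names of visible objects.
candidates <- apropos("^read\\.", mode = "function")
print(candidates)

# help.search() runs a keyword search over the installed help pages.
# It opens a results viewer, so it is left commented out here.
# help.search("import")

# Step 2: once a function is chosen, list its arguments and their defaults.
arg_names <- names(formals(utils::read.table))
print(arg_names)
print(formals(utils::read.table)$header)  # the default value of 'header'
print(args(utils::read.table))            # the full signature
```

As the reply notes, the first step is the hard one: apropos() only helps when part of the function name is already known, and help.search() only when the right keyword is guessed.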
Hmm, excuse me, but I think that SciViews and JGR *already* do that... So it appears that at least two people already spend their time on this topic and take an interest in it. Also, functions for such purposes will be added to the R GUI API, meaning they will be available for wider use. And I am close to a solution under Windows where hitting a combination of keys in ANY program will display a function tip with arguments, or a contextual completion list for R code.
Finally: it seems that a GUI for R is not just lacking, it is purposely lacking... And there are many arguments in favor of this lack. That is OK for most R users. But could you please consider these examples:
1) I teach basic biostat with R/SciViews-R/R Commander. It is a clear success, and almost all my students install it on their computers and start using it... So, the next year, I teach them an advanced biostat course with R. I decide to give up the GUI and to present analyses like PCA, MDS, LDA, clustering, etc. directly in R. For each analysis, I write a small script (10 lines or so), I explain it, and I show them how it works and how they can edit it to analyze other data. It is a fiasco! It seems that a psychological barrier induced by this unfamiliar object (the script) tends to obscure everything in the minds of my students. The feedback I got was this: most of the students who started to use R seem disgusted after this second course, and they switch back to other software with a GUI! When I ask them, they say: "SciViews-R/R Commander is nice but limited to simple analyses. For other analyses, the R scripts are just too complex for me, so I prefer to use different software."
2) Second case: I write an original analysis and I want to make it widely available to oceanographers. Most of them do not want to learn the S language, and never will. They obviously need a simple and easy GUI on top of my R function, because they want to run the analysis without knowing all the details...
Obviously, these are concrete examples where a GUI would be a benefit... unless one considers that R should be restricted to experts only!
Best regards,
Philippe Grosjean
..............................................<°}))><........
Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone
Academie Universitaire Wallonie-Bruxelles
6, av du Champ de Mars, 7000 Mons, Belgium
phone: + 32.65.37.34.97, fax: + 32.65.37.33.12
email: Philippe.Grosjean at umh.ac.be
web: http://www.umh.ac.be/~econum
..............................................................
Patrick Burns wrote:
I'm a big advocate -- perhaps even fanatic -- of making R easier for novices in order to spread its use, but I'm not convinced that a GUI (at least in the traditional form) is the most valuable approach.
Perhaps an overly harsh summary of some of Ted Harding's statements is: You can make a truck easier to get into by taking off the wheels, but that doesn't make it more useful.
In terms of GUIs, I think what R should focus on is the ability for users to make their own specialized GUI, so that a knowledgeable programmer at an installation can create a system that is easy for unsophisticated users for the limited number of tasks that are to be done. The ultimate users may not even need to know that R exists.
I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it.
The author of the referenced article highlighted some hidden costs of R, but did not highlight the hidden benefits (because they were hidden from him). A big benefit of R is all of the bugs that aren't in it (which may or may not be due to its free status).
Patrick Burns
Burns Statistics
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User")
Jan P. Smit wrote:
Dear Philippe, Very interesting. The URL of the article is http://www.scientific-computing.com/scwsepoct04free_statistics.html. Best regards, Jan Smit
Philippe Grosjean wrote:
Hello,
In the latest 'Scientific Computing World' magazine (issue 78, p. 22), there
is a review on free statistical software by Felix Grant ("doesn't have to
pay good money to obtain good statistics software"). As far as I know, this
is the first time that R is even mentioned in this magazine, given that it
usually discuss commercial products.
[ ...]
I really agree with you Patrick. To me the keys are having better help search capabilities, linking help files to case studies or at least detailed examples, having a navigator by keywords (a rudimentary one is at http://biostat.mc.vanderbilt.edu/s/finder/finder.html), having a great library of examples keyed by statistical goals (a la BUGS examples guides), and having a menu-driven skeleton code generator that gives beginners a starting script to edit to use their variable names, etc. Also I think we need a discussion board that has a better "memory" for new users, like some of the user forums currently on the web, or using a wiki. Frank
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
Hopefully my experience with R may add something to this discussion. I majored in computer science in 1983, with minors in mathematics and statistics. As this was in the days when computers were largely big centralised boxes with remote terminals, I didn't get to use computers for stats while I was at uni.
Fast forward to a couple of years ago, and I've got to start "doing statistics" on the computer for the type of work I now do. A friend pointed me to R, so off I went. Between 1983 and then, I did a lot of development, testing, documentation, management, troubleshooting, etc. work, so I think it's fair to say that, while my statistics knowledge needed a top-up, my computing background was very strong.
As of today, after approx 2 years of using R for relatively ad-hoc tasks every few weeks, here are my thoughts about it:
- it's extremely powerful and well-maintained; kudos to everyone involved
- it's extremely concise; you can do a huge amount of work in very few lines of code
- provided a particular task is close to one I've already done before, using R I can extract info from a set of data at an amazing rate. Tasks that would take me an hour or so with another programming language or toolset may take me under a minute using R (obviously depending on the size of the dataset)
Problems arise whenever I need to step outside my existing R knowledge base, and use a feature or function that I haven't used before:
- the help documentation in general desperately needs work, particularly the examples. My thinking is that examples should pretty much lead you through a trivial exercise using the tool being discussed. This is very rarely the case with R, and the examples seem to assume you fully understand how e.g. a library works and just need a simple reminder of the syntax. For the purposes of comparison, consider the documentation that comes with the Perl language; even if you don't know what a function or keyword does, you can pretty much read through the given examples and work it out without difficulty
- the GUI is pretty much just a working area on the screen; it's just not "helpful". It would probably be reasonably simple to add menu or toolbar options to help a user identify how they can actually achieve a particular task in R (e.g. select a function from a drop-down list, and get one-liner documentation about what it does), but that hasn't been done.
Many of the questions asked on this list (which are often answered with "RTFM") are of the nature "I've got this conceptually simple task to do, but I can't find out how to do it using R. Please help"; this is gratifying to me personally, since I frequently encounter the same problem. These issues are extremely frustrating, as you often know the answer will be a one-liner but you may struggle for hours or days trying to find it.
As I said above, once you understand how to do a particular task in R, you can leverage that knowledge to do similar tasks amazingly quickly; the productivity that comes with using R in this context is incredible. However, that productivity tends to disappear when you need to take even a small step outside your existing R knowledge base.
Now maybe I'm the only occasional R user out here, and everyone else is using it 8 hours a day and acquired my 2 years' worth of knowledge in their first week of use. I doubt that is actually the case, and the rest of us could really do with some help from the GUI.
Finally, please don't think I don't appreciate the mass of effort required to get R to its current state. I do, and it's made my life a lot easier than it would otherwise have been.
Regards
Dave Mitchell
Hi All, GRETL, a Gnu Regression, Econometrics and Time-series Library is open-source, cross-platform, multi-language and fully GUI based. The website is http://gretl.sourceforge.net/ This is NOT a personal plug, simply posted to show what can be done. Andrew
On 17 Nov 2004, at 2:27 pm, Patrick Burns wrote:
I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it.
I think this is spot on. My situation is that I am a scientist turned system administrator, and R is a package which I am increasingly being asked to install for the use of scientists at this Institute. I am by no means a statistician; the statistics I learned in A-level maths almost 20 years ago were as far as I got, and most of that I have forgotten. But I like to have some understanding of the software packages I am asked to support, so I've been looking at R with a view to learning some of its more basic functions. It looks potentially very useful to me anyway for summarising activity on the supercomputing cluster that I run. So I'm a newbie to R, armed with only a very basic knowledge of statistics (I know the difference between a Normal and a Poisson distribution at least, and with a bit of prodding could probably remember a binomial distribution too). I'm an experienced programmer in several languages, and a PhD-level scientist. And yet I have still found R really quite hard to learn, and this is principally because the on-line help is a reference manual. I'm sure it's a fabulous resource if you're a statistician who uses R every day, but for me it's not very helpful. The R Intro PDF is good, but it would be nice if it were integrated better, with hyperlinks to the reference documentation, or to other parts of the introduction, for those platforms that support such things (it looks like this was intended for MacOS X, which is the version I am playing with for my own use, although the version I maintain for users is on Linux [ and would be on Alpha/Tru64 too if I could get it to pass its tests ]) but the on-line help link to the Intro on the Aqua R version brings up a blank page, so I'm using the generic PDF document instead. I think the GUI question has nothing to do with the hidden costs of the GPL, or otherwise. This is the age-old ease-of-use versus power and capability argument. 
I don't think a fancy GUI is necessary - the GUI aspects that have been added to R on Mac OS X are sufficient. I get the impression that the real power of R is the fact that really it's a programming language, and should probably be treated and learned as such. Quite apart from the fact that a GUI will necessarily be a somewhat restricted subset of the total functionality, and a lot slower to use once you've taken the effort to learn the software, I think there is another danger, which I have already seen in other pieces of software in the bioinformatics community. Users frequently run completely pointless analyses through the GUI wrappers we provide. The users using the command line interfaces typically do much more sensible things. If you make a piece of software trivial for a user to use without thinking about what they're doing, then the users won't think. I may not know much about statistics, but what little I do know is that understanding exactly what form of analysis or significance test is required to be meaningful is a real skill that takes a lot of experience to master. Having to perform that analysis with written commands means that your method is recorded, and could be published, and more importantly be checked and reproduced by other researchers. It also gives you ample time to think about what you're doing, rather than just bashing out a pretty graph which actually has no real meaning whatsoever. Any GUI to R could (and should) be able to store the command line equivalent to what it has just done, to satisfy the reproducible criterion above, but I suspect it could still lead to some pretty shoddy work being done by careless and lazy scientists, and we get enough of that already. Tim
Dr Tim Cutts Informatics Systems Group, Wellcome Trust Sanger Institute GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233
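The point made above, that a GUI should store the command-line equivalent of what it has just done, can be sketched in a few lines of R. The names `gui_log` and `run_and_log` are hypothetical and do not belong to R or to any of the GUI projects mentioned in this thread; the sketch only assumes that a dialog box ends up calling one R function with some arguments.

```r
# Hypothetical helper: 'gui_log' and 'run_and_log' are illustrative names only.
gui_log <- new.env()
gui_log$commands <- character(0)

run_and_log <- function(fun_name, ...) {
  # Build the call a user would have typed, e.g. mean(x, na.rm = TRUE)
  cl <- as.call(c(list(as.name(fun_name)), list(...)))
  # Store its text form so the analysis can be replayed, checked or published
  gui_log$commands <- c(gui_log$commands, paste(deparse(cl), collapse = " "))
  # Evaluate the call as if it had been entered at the prompt
  eval(cl, envir = globalenv())
}

# What a "Mean" dialog box might do behind the scenes
result <- run_and_log("mean", quote(mtcars$mpg), na.rm = TRUE)
print(result)
cat(gui_log$commands, sep = "\n")
```

The recorded text is ordinary R code, so the log can be saved as a script and rerun. As the message warns, this covers reproducibility only; it does nothing to make the analysis itself sensible.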
Hmmmm, interesting thread and minds will not be changed but regarding GUIs...I thought S (aka R) was a PROGRAMMING LANGUAGE with a statistical and numerical slant, and not a statistics application. ;O)
Certainly there is an important place for GUIs but I believe that it is very much overemphasized in modern computer culture. My experience and bias--and I started in the 1960's-- is that except for 'trivial' uses, GUIs are a detriment to any reasonably complex CREATIVE computational task. They are adequate for the simple, common task. But even then, typing a command or two is not overly taxing--- particularly when compared to navigating layer upon layer of submenus as is sometimes needed. If I need to, I will add a little syntactic sugar when coding and move on.
GUIs encourage a passive approach to using computers when solving problems. In addition, it is regrettable that a lot of people in the 'workplace' will carry out incomplete and/or incorrect quantitative work because of the real or perceived limitations of the particular (GUI) apps they are using. There is no inclination to go beyond the menu and even then many menu items gather 'electronic dust'.
Finally, there are times for many of us when work 'goes home' at the end of the day. That just comes with the territory. I (and most others) cannot afford the luxury of S-plus, Statistica, SPSS, etc. at home. So in a sense there is a very real 'loss of productivity' cost associated with using commercial software.
Now that does bring us around to supporting R doesn't it? (Mea culpa. And I resolve to do better!) What value does one put on the vitality of the R community?
Best regards,
Michael Grant, Ph.D.
* The requirements for creating packages are on target, and have the desired impact on both the quality and breadth of R.
--- Philippe Grosjean <phgrosjean at sciviews.org> wrote:
Hello, In the latest 'Scientific Computing World' magazine (issue 78, p. 22), there is a review on free statistical software by Felix Grant ...2.)
On 18 Nov 2004, at 10:27 am, Tim Cutts wrote:
The R Intro PDF is good, but it would be nice if it were integrated better, with hyperlinks to the reference documentation, or to other parts of the introduction, for those platforms that support such things
I should correct myself here, and note that there are some cross-references within the PDF document, it's not completely devoid of them. Tim
Dr Tim Cutts Informatics Systems Group, Wellcome Trust Sanger Institute GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233
Tim Cutts wrote:
Any GUI to R could (and should) be able to store the command line equivalent to what it has just done, to satisfy the reproducible criterion above, but I suspect it could still lead to some pretty shoddy work being done by careless and lazy scientists, and we get enough of that already.
In that respect you should have a look at the Emacs/XEmacs/ESS package. This package combines the power of the command line with reproducibility of what has been done to generate graphs or whatever you like. It's also equipped with a nice reference card (PDF), which is very helpful for learning common shortcuts to increase your productivity. I wouldn't necessarily call ESS a GUI in the traditional sense, though.
When I started using R I was inclined to use the RCommander GUI. After fiddling with this for a while I came to the conclusion that its possibilities are, at least for the moment, really limited. Furthermore, some things increased my irritation, e.g. having to work out which buttons to push to achieve a specific task. If I hit the wrong button, I was hardly able to find out what actually went wrong.
Nevertheless, for me as a beginner in GNU R, who never used S before but primarily SPSS and, in earlier times, BMDP, it is a long way to gaining some control over the advanced aspects of using R. This is true despite the fact that I took statistics courses for several years and have experience in research projects (social sciences and epidemiology), so I would agree that using GNU R has some hidden costs for me!
To sum up, what I need is an extensive example-based help system, focused on how to do things in R. In part this is already there, e.g. SimpleR from Verzani (contributed docs area) etc. Hopefully I can contribute to this in future, since it seems to me invaluable to learn R by going through example-based lessons (some are found in vignette()). These are much more comprehensible to me than the short reference-like entries in the current help system, mostly due to their very technical approach (the same is to be said about the official GNU R manuals, especially "The R Language", which wasn't a great help for me when I took my first look at GNU R).
In this context something like the GuideMaps of Vista come to my mind!
But to be as clear as possible, I think GNU R is great and I appreciate all the efforts of the R core team and associates! Nevertheless, it seems valuable to rethink the help system in R with respect to those who may have a good understanding of statistics but lack basic experience in how to find their way into the sophisticated world of the R/S languages.
Regards
Thomas
At 11/18/2004 07:01 AM Thursday, Thomas Schönhoff wrote:
To sum up, what I need is an extensive example-based help system, focused on how to do things in R. In part this is already there, e.g. SimpleR from Verzani (contributed docs area) etc. Hopefully I can contribute to this in future, since it seems to me invaluable to learn R by going through example-based lessons (some are found in vignette()). These are much more comprehensible to me than the short reference-like entries in the current help system, mostly due to their very technical approach (the same is to be said about the official GNU R manuals, especially "The R Language", which wasn't a great help for me when I took my first look at GNU R).
In this context something like the GuideMaps of Vista come to my mind!
But to be as clear as possible, I think GNU R is great and I appreciate all the efforts of the R core team and associates! Nevertheless, it seems valuable to rethink the help system in R with respect to those who may have a good understanding of statistics but lack basic experience in how to find their way into the sophisticated world of the R/S languages.
(I posted similar material before, but it was moved to R-devel, and I wanted to express a bit of it here.) I have frequently felt, like Thomas, that what could make R easier to use is not a GUI, but a help system more focused on tasks and examples, rather than on functions and packages. This has obvious and large costs of development, and I am unlikely to contribute much myself, for reasons of time and ability. Yet, I mention it for the sake of this discussion. Such a help system could be a tree (or key) structure in which through making choices, the user's description of the desired task is gradually narrowed. At the end of each twig of the tree would be a list of suggested functions for solving the problem, hyperlinked into the existing help system (which in many ways is outstanding and has evolved just as fast as R itself). This could be coupled with the continued expansion of the number of examples in the help system. Now I must express appreciation for what exists already that helps in this regard: MASS (in its many editions), Introductory Statistics with R, Simple R, and the other free documentation that so many authors have generously provided. Not to mention the superlative contribution of R itself, and the work of the R development team. It is beyond my understanding how something so valuable and well thought out has been created by people with so many other responsibilities. Mike
Michael Prager, Ph.D. Population Dynamics Team, NMFS SE Fisheries Science Center NOAA Center for Coastal Fisheries and Habitat Research Beaufort, North Carolina 28516 http://shrimp.ccfhrb.noaa.gov/~mprager/
Hello,
I appreciate the many comments and the various points of view, especially because there are a couple of clear explanations of why several people do not need (or even do not want) a GUI for R! Another part of the discussion seems to switch to the never-ending question of "what kind of GUI"... which will never be answered, because there is not one best GUI, and it also depends on the use (both the application and the user).
I have long hesitated to propose, on R-SIG-GUI and the R GUI projects web site, posting a description of one or several "prototype" GUI(s) we would like for R, with the intention of including all the good ideas everybody on this list has. I never did it, because I am pretty sure it would be useless! Now, I feel that what is actually required is one person with a clear view of what he wants, a lot of free time, a lot of energy, and some decent programming skills, to make real what he has in his head! In fact, it is such a huge amount of work that several people are required!
Here are the topics currently being developed (sorry if I don't cite Bioconductor stuff: I don't know it):
- Most of the "low-level" work is done, I think, like interfaces with graphical toolkits: tcltk by Peter Dalgaard, of course, but many others (Gtk, wxPython, ...), better control of Rgui under Windows (ongoing, Duncan Murdoch), ESS, ... All this is already available, even if one could always argue that it is not optimal in some respects.
- A better console (multi-line editing, syntax coloring, code tips presenting the syntax of a function when you type it, contextual completion lists, ...). This is an ongoing project in both JGR and SciViews-R.
- A better table editor: RKward team.
- A classical menus/dialog box approach: John Fox's R Commander.
- An object explorer: JGR, RKward, SciViews-R, experimental functions in R.
- A "plug-in" approach, that is, a piece of code that brings a GUI for a targeted analysis and builds R code for you: RKward team, but also some functions in svDialogs (part of the SciViews bundle, R GUI API).
- Interactive documents mixing formatted text, graphs, etc. with R input/output: Rpad, Sweave (not interactive), and some others.
- Rich-formatted output of R objects (in/out, views, reporting, ...): Eric Lecoutre's R2HTML + SciViews-R.
- Code editors with interaction with R: Tinn-R, WinEdt, Emacs, and many others.
- IDE (hmm, some code editors are not so far away from an IDE, but there is still some lack here).
- An R GUI API: SciViews.
I hope all these projects will continue and mature, and that their developers will ultimately realize that they provide complementary pieces of a giant puzzle and start to work together. This is when it will become most exciting! I also hope that it will result in an original GUI that keeps most of the spirit of R, that is, not a simplified point&click UI leading to meaningless analyses by lazy people, but a real tool whose goal is to make R easier and faster to learn for beginners, and quite usable for occasional users.
Maybe I am just a dreamer, but all I read in this discussion reinforces my conviction that an **innovative** GUI would be a good addition to R: most criticisms clearly relate to the kind of inflexible GUI, with a forest of menus and submenus, and other bad things one could find. I have never advocated such a GUI, and never will! For sure, the alternative GUI will only support you in writing R code, and will deliver plenty of help to achieve this goal. I think it is possible... with enough people collaborating on a common project!
I think the latter point is really the problem: not enough people, too many projects! Is it a consequence of the way R is developed (GPL)? Well, I think so, but only partly. It is also the consequence of ego (everybody wants to be the leader of his own project), and a lack of communication (R-SIG-GUI is not what one would call an active list!). Or maybe a "good GUI" for R is a fuzzy target, and it is not possible to crystallize enough effort around the common goal of reaching it!
Anyway, although R GUI projects are progressing very slowly, I think that only when we have a "good GUI" available for R will we be able to evaluate whether there really are "hidden costs" in R, as Felix Grant suggests in his paper.
Best regards and thank you all for your comments and suggestions.
Philippe Grosjean
On Thu, 2004-11-18 at 03:24 -0800, Michael Grant wrote:
Hmmmm, interesting thread and minds will not be changed but regarding GUIs...I thought S (aka R) was a PROGRAMMING LANGUAGE with a statistical and numerical slant, and not a statistics application. ;O)
From the R web site:
"R is a language and environment for statistical computing and
graphics."
I think that this is a critical point and that there is, to my mind, a
false predicate at play here.
That predicate is that somehow one should be able to rapidly learn R (or
any programming language for that matter) solely via the available
online reference help or via the freely provided documentation (whether
via R Core or via Contributors).
How many people here have learned to use C, FORTRAN, SAS, VBA, Perl or
any other language strictly by using built-in reference help systems? If
any, it will be a very small proportion.
Sure, SAS comes with documentation that can be measured in
hernia-inducing tonnage, but at a substantial annual cost, which I have
referenced here and elsewhere previously. R is free.
Is there anyone who has learned to code in C that does not have a copy
of K&R someplace on their shelf, probably along with copies of other
general and application-specific C references published by
Prentice-Hall, Addison-Wesley, McGraw-Hill or Hayden?
It has been years since I actively coded in C, but I have almost 3
shelves filled with C reference books. I have books dating back to the
early 80's for 80x86 Assembly, MS-DOS/BIOS interrupts and Windows API
technical references and other such books that I used to use on a daily
basis in a former life.
For Linux, I have two shelves filled with various O'Reilly and other
references running the gamut from general Linux stuff to Perl,
Procmail, Postfix, Bash, Regex, Emacs, Admin, Firewalls and others.
For R, I have most of a shelf filled with multiple references, including
three of the four editions of MASS (somehow I missed the 2nd edition). I
have a copy of Peter's ISwR (because on occasion I have an acute attack
of cerebral flatulence and have to go back to basics) along with copies
of Pinheiro & Bates, Fox, Maindonald & Braun, Krause & Olson, Everitt &
Rabe-Hesketh and V&R's S Programming. I have copies of the "White Book"
and the "Green Book" and I have copies of Harrell and Therneau &
Grambsch for specific applications of R.
There are a fair number of already published books on R/S with more
coming by Faraway, Heiberger & Holland, Verzani and others including a
new series from Springer.
My point being that the old philosophy of "No Pain, No Gain" is a
component of the learning curve with R. R is not going to be for
everybody. That's why there are other "point and click" statistical
_applications_ like JMP (albeit not cheap). They are relatively easy,
but at the same time, they are self-limiting. No single math/statistical
"product" is going to meet the needs of the entire spectrum of the
potential user space.
As I have mentioned previously, I am a firm believer in Pareto's 80/20
Rule. In this case, you develop a "product" to meet the needs of 80% of
your target user space, because you will go "bankrupt" meeting the needs
of the other 20%. Said differently, meeting the needs of the other 20%
will consume 80% of your development resources, restricting your ability
to meet the needs of the larger audience.
Having spent 12 years previously with a commercial medical software
company, I will also suggest that typically 20% of your user base will
consume 80% of your support resources.
I will also note that having been on both sides of that equation, the
support provided here within this community is superb and has no peer in
the commercial arena.
In R's case, the 80% of the user space has perhaps been extended by the
kind offerings of those who have made specialty packages available via
CRAN, BioC and others.
It takes a certain level of commitment and time with R to become
effective with it.
That commitment includes, in my mind, supplementing the available _free_
documentation that has kindly been provided by R Core and others, with
other available resources. That does not mean that everyone needs to get
on Amazon.com and spend hundreds of $YOUR_MONETARY_UNIT on books. Many
are available via libraries and/or other resources, especially for those
here in academic environments.
This is a community effort folks and not everything is going to be
provided to you free of charge, with that notion being either in actual
financial cost or time.
It appears that, since this is not the first time this subject has come
up, there is strong interest in building a c("new", "different",
"better", ...) documentation/help system for R. That's fine. For those
that have interest in pursuing this, perhaps the time has come for a
group to form a new r-sig-doc list and move forward with the development
of a framework for a new system that can be developed and implemented by
that same group and then provided back to the community.
Writing technical and user documentation is a specialty skill set unto
itself and perhaps those with the requisite skill sets will contribute
them for the benefit of all.
For those that do not have the skills and/or the time to contribute, I
would urge you to financially contribute to the R Foundation in whatever
way you can afford. Through that mechanism you will support the
community at large and the future development and enhancement of R.
There is no "hidden cost" here and certainly not one that is unique to
GPL software. The cost is self-evident and it is measured in time and
$YOUR_MONETARY_UNITs. "Time is money" as they say and that is the same
whether you are using GPL software or a commercial proprietary product.
A key difference here, if any, is that none of us have paid anything for
R, where a portion of that "revenue" would go to support a dedicated
documentation team. In this case, it is "If you want it, you will need
to design and build it."
Best regards,
Marc Schwartz
On 17-Nov-04 Patrick Burns wrote:
[...] Perhaps an overly harsh summary of some of Ted Harding's statements is: You can make a truck easier to get into by taking off the wheels, but that doesn't make it more useful.
Yes, perhaps overly harsh ... but if you had said instead "by deflating the tyres" then I think I'd agree that you were spot on! Otherwise I agree with your other comments.
All best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 18-Nov-04 Time: 16:57:20
------------------------------ XFMail ------------------------------
On Wed, 17 Nov 2004, Mike Prager wrote:
...
Using CLI software, an infrequent user has trouble remembering the known functions needed and trouble finding new ones (especially as that user gets older). What might help is an added help facility more oriented towards tasks, rather than structured around functions or packages.
...
Another good (non-GUI) tool for the CLI is keyword completion. R in ESS does this, giving you lists of possible functions, variables and objects, or feedback if there isn't any. R's CLI completes, but only with filenames in the current directory.
Dave
Dave Forrest
drf at vims.edu (804)684-7900w
drf5n at maplepark.com (804)642-0662h
http://maplepark.com/~drf5n/
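As a rough complement to keyword completion, base R's apropos() can list objects on the search path whose names match a pattern. The sketch below is only an illustration of that idea; the wrapper name find_functions is hypothetical and not part of R.

```r
# Hypothetical helper: list functions on the search path whose names
# match a regular expression. apropos() ignores case by default.
find_functions <- function(pattern) {
  apropos(pattern, mode = "function")
}

# Functions whose names start with "read." (for example read.csv)
find_functions("^read\\.")

# Functions with "mean" anywhere in the name
find_functions("mean")
```

This only searches names, not descriptions, so it helps most when part of the function name is already remembered.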
Mike Prager wrote:
At 11/18/2004 07:01 AM Thursday, Thomas Schönhoff wrote:
To sum up, what I need is an extensive example-based help system, focused on how to do things in R. In part this is already there, e.g. SimpleR from Verzani (contributed docs area) etc. Hopefully I can contribute to this in future, since it seems to me invaluable to learn R by going through example-based lessons (some are found in vignette() ). These are much more comprehensible to me than the short reference-like entries in the current help system, mostly due to their very technical approach (the same can be said about the official GNU R manuals, especially "The R Language", which wasn't a great help for me when I took my first look at GNU R). In this context something like the GuideMaps of Vista comes to my mind! But to be as clear as possible, I think GNU R is great and I appreciate all the efforts made by the R core team and associates! Nevertheless it seems valuable to re-think the help system in R with respect to those who may have a good understanding of statistics but lack basic experience in how to approach the sophisticated world of the R/S languages.
(I posted similar material before, but it was moved to R-devel, and I wanted to express a bit of it here.)

I have frequently felt, like Thomas, that what could make R easier to use is not a GUI, but a help system more focused on tasks and examples, rather than on functions and packages. This has obvious and large costs of development, and I am unlikely to contribute much myself, for reasons of time and ability. Yet, I mention it for the sake of this discussion.

Such a help system could be a tree (or key) structure in which through making choices, the user's description of the desired task is gradually narrowed. At the end of each twig of the tree would be a list of suggested functions for solving the problem, hyperlinked into the existing help system (which in many ways is outstanding and has evolved just as fast as R itself). This could be coupled with the continued expansion of the number of examples in the help system.

Now I must express appreciation for what exists already that helps in this regard: MASS (in its many editions), Introductory Statistics with R, Simple R, and the other free documentation that so many authors have generously provided. Not to mention the superlative contribution of R itself, and the work of the R development team. It is beyond my understanding how something so valuable and well thought out has been created by people with so many other responsibilities.

Mike
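The tree (or key) structure described above could be sketched in R as a nested list, purely as an illustration. The task labels, the object task_tree and the function narrow_task are all hypothetical; only the suggested function names at the leaves are real R functions.

```r
# Hypothetical task tree: inner nodes are named lists of choices,
# leaves are character vectors of suggested function names.
task_tree <- list(
  "import data" = list(
    "delimited text file" = c("read.table", "read.csv", "scan"),
    "fixed-width file"    = c("read.fwf")
  ),
  "plot data" = list(
    "one variable"  = c("hist", "boxplot"),
    "two variables" = c("plot", "coplot")
  )
)

# Follow a sequence of choices down the tree. Returns the remaining
# choices if an inner node is reached, or the suggested functions if
# a leaf (twig end) is reached.
narrow_task <- function(tree, choices) {
  node <- tree
  for (choice in choices) {
    if (!is.list(node) || is.null(node[[choice]])) {
      stop("unknown choice: ", choice)
    }
    node <- node[[choice]]
  }
  if (is.list(node)) names(node) else node
}

narrow_task(task_tree, character(0))                    # top-level choices
narrow_task(task_tree, "import data")                   # narrower choices
narrow_task(task_tree, c("plot data", "one variable"))  # suggested functions
```

Each name returned at a leaf could then be passed to help() to reach the existing help pages, which is roughly the hyperlinking described in the message.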
... I second all of that. What you are describing, Mike, could be done with a community-maintained wiki, with easy-to-add hyperlinks to other sites. Just think what a great value it would be to the statistical community to have an ever-growing set of examples with all code and output, taking a cue from the BUGS examples guides. The content could be broken down by major areas (data import examples, data manipulation examples, many analysis topics, many graphics topics, etc.). Ultimately the more elaborate case studies could be peer-reviewed (a la the Journal of Statistical Software) and updated.
Frank
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
"Thomas" == Thomas Schönhoff <tom_woody at swissinfo.org> writes:
> To sum up, what I need is an extensive example-based
> help system, focused on how to do things in R. In
> part this is already there, e.g. SimpleR from Verzani
> (contributed docs area) etc.
I have a nice set of extensive help with documentation sitting on
my shelf:
- Peter Dalgaard. Introductory Statistics with R. Springer,
2002. ISBN 0-387-9
- William N. Venables and Brian D. Ripley. Modern Applied
Statistics with S. Fourth Edition. Springer, 2002. ISBN
0-387-95457-0.
- Jose C. Pinheiro and Douglas M. Bates. Mixed-Effects Models
in S and S-Plus. Springer, 2000. ISBN 0-387-98957-0.
I suspect that I would have spent the money on these books even
if I'd started by spending money for S-plus, instead of R. But
I've never seen the S-plus help system, so I may be wrong.
See http://www.r-project.org/doc/bib/R-publications.html and
http://www.r-project.org/doc/bib/R_bib.html for yet more.
Mike
Michael A. Miller mmiller3 at iupui.edu
Imaging Sciences, Department of Radiology, IU School of Medicine
On Thu, 18 Nov 2004, Frank E Harrell Jr wrote:
...
... I second all of that. What you are describing Mike could be done with a community-maintained wiki, with easy to add hyperlinks to other sites.
There is a wiki at http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl but it doesn't seem to get much use. Last time I was hunting for help on R, I made the page http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl?SearchFunctions and in particular:

    help.search.archive <- function(string) {
      # Base URL of the search page for the list archive
      RURL <- "http://www.google.com/u/newcastlemaths"
      # Append the search string as the query parameter
      RSearchURL <- paste(RURL, "?q=", string, sep = "")
      # Open the resulting URL in the default browser
      browseURL(RSearchURL)
      return(invisible(0))
    }
    help.search.archive('wiki') # example

Dave
Dave Forrest
drf at vims.edu (804)684-7900w
drf5n at maplepark.com (804)642-0662h
http://maplepark.com/~drf5n/
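One possible refinement of the help.search.archive function above, offered only as a sketch: search strings containing spaces or reserved characters could be escaped with URLencode() before being pasted into the URL. The names build_search_url and help_search_archive are hypothetical, and the base URL is simply the one quoted in the message; whether it still resolves is not assumed here.

```r
# Hypothetical helper: build the search URL without opening a browser.
build_search_url <- function(string,
                             base = "http://www.google.com/u/newcastlemaths") {
  # URLencode() (package utils) escapes spaces and reserved characters
  paste0(base, "?q=", URLencode(string, reserved = TRUE))
}

# Variant of the original function that encodes the query first and
# invisibly returns the URL it opened.
help_search_archive <- function(string) {
  url <- build_search_url(string)
  browseURL(url)
  invisible(url)
}

build_search_url("linear model")
```

Keeping the URL construction separate from browseURL() makes it possible to inspect the URL without launching a browser.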