Putting together documentation in Rd format is a bit of a pain. In fact, one of my colleagues has chosen to use S-PLUS instead of R partly because it's easier to document the stuff he's written. (In S-PLUS plain text files can be used to document your code. At least they could in fairly recent versions; I don't have the current one installed.) I think it's reasonable to require Rd format documents for stuff on CRAN, but it should be easier to document things that are for personal use or limited distribution. Are there existing schemes that help in this? If not, would it be worth putting one together? Duncan Murdoch
Flat documentation?
10 messages · Brian Ripley, Paul Gilbert, Duncan Murdoch +2 more
On Tue, 10 Dec 2002, Duncan Murdoch wrote:
Putting together documentation in Rd format is a bit of a pain. In fact, one of my colleagues has chosen to use S-PLUS instead of R partly because it's easier to document the stuff he's written. (In S-PLUS plain text files can be used to document your code. At least they could in fairly recent versions; I don't have the current one installed.)
That was only so on Windows, and I *think* not true on S-PLUS 6 on Windows either. [...]
Are there existing schemes that help in this? If not, would it be worth putting one together?
Only for something very simple: put a text file up in a pager as we do for info files. That should be doable cross-platform easily. Brian
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
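A minimal sketch of the "text file in a pager" idea, assuming each topic's notes live in a plain text file named `<topic>.txt` in some directory. The names `find_flat_doc()` and `show_flat_doc()` and the file-naming rule are illustrative assumptions, not anything proposed in the thread; `file.show()` is the standard R call for displaying a file in the pager.

```r
# Hypothetical lookup: return the path of <topic>.txt in `dir`, or NULL
find_flat_doc <- function(topic, dir = ".") {
  path <- file.path(dir, paste0(topic, ".txt"))
  if (file.exists(path)) path else NULL
}

# Hypothetical display step: show the file in R's pager
show_flat_doc <- function(topic, dir = ".") {
  path <- find_flat_doc(topic, dir)
  if (is.null(path)) stop("no flat documentation found for '", topic, "'")
  file.show(path, title = paste("Notes on", topic))
}

# Set up a throwaway directory with one plain text note
docs <- tempfile("flatdocs")
dir.create(docs)
writeLines(c("add2(x, y)", "", "Add two vectors elementwise."),
           file.path(docs, "add2.txt"))
find_flat_doc("add2", docs)
```

Calling `show_flat_doc("add2", docs)` would then open the note in whatever pager the session is configured to use.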
I am a bit concerned about the direction of some of this discussion. !!Please!! do not consider gutting the R package Quality Assurance system and starting a slide back to the chaos of Statlib. There has to be a mechanism that weeds out code that no longer works or is inadequately documented. Do you realize how much time people have wasted trying to make poorly documented "casual" Statlib code work? Nothing prevents non-CRAN distribution of code and casual documentation. Posting an r-help message with a web site link makes this fairly easily accessible to anyone who searches the help archives, and there is no need for the code or documentation to be in any special format. CRAN also has a devel area for packages that are not yet in good enough shape for the regular area.
In fact, one of my colleagues has chosen to use S-PLUS instead of R partly because it's easier to document the stuff he's written.
I have mostly gone the other way, largely because of the QA tools (which in large part are possible because of the Rd format). It is worth pointing out to your colleagues that there is short-term pain for long-term gain. Because code and documentation arguments are matched, and examples are checked, documentation does not need to be manually checked all the time as your code evolves. Changes that require changes in the documentation tend to be pointed out automatically. Paul Gilbert
Dear Paul, Duncan, et al., I too like the package-construction tools in R, and find it easier to assemble R packages than S-PLUS libraries. I wonder, however, whether the following simple suggestion might prove useful: Suppose that help(foo) and ?foo first look for standard documentation. If such documentation exists, it would be processed as at present. If there is no standard documentation on foo, then help and ? would look for a "doc" attribute of foo (or for initial comment lines in the function definition, if foo is a function), and, if this exists, display the contents in a pager. John
____________________________
John Fox
Department of Sociology
McMaster University
email: jfox@mcmaster.ca
web: http://www.socsci.mcmaster.ca/jfox
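The fallback suggested in this message could be sketched roughly as follows. This does not touch `help()` or `?` themselves; `flat_doc_text()` and `flat_help()` are hypothetical stand-ins for the step that would run once no standard documentation is found, and the choice to accept either a vector of lines or a single string with newlines is an assumption.

```r
# Hypothetical helper: return the "doc" attribute as a vector of lines,
# or NULL if the object carries no such attribute
flat_doc_text <- function(obj) {
  txt <- attr(obj, "doc", exact = TRUE)
  if (is.null(txt)) return(NULL)
  # Accept a character vector of lines or one string containing newlines
  strsplit(paste(as.character(txt), collapse = "\n"), "\n", fixed = TRUE)[[1]]
}

# Hypothetical display step: write the text to a temporary file and page it
flat_help <- function(obj, title = "Flat documentation") {
  lines <- flat_doc_text(obj)
  if (is.null(lines)) {
    message("No 'doc' attribute found")
    return(invisible(FALSE))
  }
  tmp <- tempfile(fileext = ".txt")
  writeLines(lines, tmp)
  file.show(tmp, title = title, delete.file = TRUE)
  invisible(TRUE)
}

add2 <- function(x, y) x + y
attr(add2, "doc") <- "add2(x, y)\nAdd two vectors elementwise."
flat_doc_text(add2)
```

With this in place, `flat_help(add2)` would page the two lines of text, while an object without the attribute only produces a message.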
On Wed, 11 Dec 2002 14:42:41 -0500, you wrote:
I wonder, however, whether the following simple suggestion might prove useful: Suppose that help(foo) and ?foo first look for standard documentation. If such documentation exists, it would be processed as at present. If there is no standard documentation on foo, then help and ? would look for a "doc" attribute of foo (or for initial comment lines in the function definition, if foo is a function), and, if this exists, display the contents in a pager.
I think that would be an ideal solution, as long as there was a relatively easy way to import text. For example, if it's done with comments (which would be my preference), there should be a way to enter multi-line comments (like /* ... */ in C). If it's done with attributes there needs to be an easy way to put free-form text into the attribute. As an aside, I wasn't certain that multi-line comments didn't exist, so I checked the language reference. Comments aren't documented at all! (At least in the r-devel version...) This should probably be fixed. I've submitted draft text as a bug report. Duncan Murdoch
Dear Duncan,
At 09:18 AM 12/12/2002 -0500, Duncan Murdoch wrote:
On Wed, 11 Dec 2002 14:42:41 -0500, you wrote:
I wonder, however, whether the following simple suggestion might prove useful: Suppose that help(foo) and ?foo first look for standard documentation. If such documentation exists, it would be processed as at present. If there is no standard documentation on foo, then help and ? would look for a "doc" attribute of foo (or for initial comment lines in the function definition, if foo is a function), and, if this exists, display the contents in a pager.
I think that would be an ideal solution, as long as there was a relatively easy way to import text.
One could simply supply a function to perform this task -- e.g., doc(foo, 'file'), which returns the function or data frame foo with the contents of file in the doc attribute (or as initial comment lines).
For example, if it's done with comments (which would be my preference), there should be a way to enter multi-line comments (like /* ... */ in C). If it's done with attributes there needs to be an easy way to put free-form text into the attribute.
I can think of several ways to store a multi-line text attribute: a vector of strings, a string with new-line characters, etc. It would be easiest to import the text from a file, and it would be up to help() to display the information correctly.
Regards,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox@mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
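One possible shape for the `doc(foo, 'file')` function mentioned in this message, assuming the text is stored as a character vector of lines in an attribute named "doc". Only the name `doc` and its two arguments come from the thread; the storage format and the use of `readLines()` are assumptions made for the sketch.

```r
# Sketch of doc(): return `obj` with the lines of `file` stored in a
# "doc" attribute. The caller's original object is not modified.
doc <- function(obj, file) {
  attr(obj, "doc") <- readLines(file, warn = FALSE)
  obj
}

# A throwaway file stands in for a hand-written plain text note
notes <- tempfile(fileext = ".txt")
writeLines(c("add2(x, y)", "", "Add two vectors elementwise."), notes)

add2 <- function(x, y) x + y
add2 <- doc(add2, notes)
cat(attr(add2, "doc"), sep = "\n")
```

Because attributes can be set on data frames as well as functions, the same call would work for either kind of object.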
On Thu, 12 Dec 2002 09:35:40 -0500, you wrote in message <5.1.0.14.2.20021212092533.01ddb880@mcmail.cis.mcmaster.ca>:
Dear Duncan, At 09:18 AM 12/12/2002 -0500, Duncan Murdoch wrote:
For example, if it's done with comments (which would be my preference), there should be a way to enter multi-line comments (like /* ... */ in C). If it's done with attributes there needs to be an easy way to put free-form text into the attribute.
I can think of several ways to store a multi-line text attribute: a vector of strings, a string with new-line characters, etc. It would be easiest to import the text from a file, and it would be up to help() to display the information correctly.
Storage isn't a problem, I'm thinking of the user interface. I
normally write my functions in a text editor, then source them into R.
Other people use a workspace as the primary place to store functions.
Both methods should allow for easy addition of lightweight
documentation.
One problem with using embedded comments is that people don't agree on
the One True Comment Style. For example, I wrote a Turbo Pascal
language parser once that built help files from comments in Pascal
source, and I found it very useful. However, when I gave it away to
other people, I found that everyone has their own comment style, and
they didn't like the assumptions my parser was making about how to put
the comments into the help file. For example, this sort of problem
(translated into R) came up. Which style of source should I assume?
Version 1:
# Add two vectors
sum <- function(x, y) x+y
# Subtract two vectors
diff <- function(x, y) x-y
Version 2: (This one makes more sense in TP, where you give the
function header in one section, and the implementation in another)
sum <- function(x, y) x+y
# Add two vectors
diff <- function(x, y) x-y
# Subtract two vectors
Version 3:
sum <- function(x, y) {
    # Add two vectors
    x+y
}
diff <- function(x, y) {
    # Subtract two vectors
    x-y
}
Duncan Murdoch
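To make the ambiguity concrete, here is a rough sketch of an extractor that hard-codes the Version 1 assumption: a comment belongs to the top-level assignment directly below it. The function name `comments_above()` is hypothetical, and the example uses `add2` and `sub2` rather than `sum` and `diff` only to avoid masking base R functions.

```r
# Sketch: collect the comment lines sitting directly above each top-level
# `name <- value` assignment in a piece of source text (Version 1 layout)
comments_above <- function(text) {
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  exprs <- parse(text = lines, keep.source = TRUE)
  refs <- attr(exprs, "srcref")
  out <- list()
  for (i in seq_along(exprs)) {
    e <- exprs[[i]]
    # Skip anything that is not a simple `name <- value` assignment
    if (!is.call(e) || !identical(e[[1]], as.name("<-")) || !is.name(e[[2]])) next
    j <- as.integer(refs[[i]])[1] - 1   # line just above the expression
    found <- character(0)
    while (j >= 1 && grepl("^[[:space:]]*#", lines[j])) {
      found <- c(sub("^[[:space:]]*#+[[:space:]]?", "", lines[j]), found)
      j <- j - 1
    }
    out[[as.character(e[[2]])]] <- found
  }
  out
}

version1 <- "# Add two vectors
add2 <- function(x, y) x+y
# Subtract two vectors
sub2 <- function(x, y) x-y"
str(comments_above(version1))
```

Fed the Version 2 layout, the same rule would leave the first function undocumented and attach its comment to the second, which is exactly the kind of mismatch described above.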
Dear Duncan,
At 10:51 AM 12/12/2002 -0500, Duncan Murdoch wrote:
On Thu, 12 Dec 2002 09:35:40 -0500, you wrote in message <5.1.0.14.2.20021212092533.01ddb880@mcmail.cis.mcmaster.ca>:
Dear Duncan, At 09:18 AM 12/12/2002 -0500, Duncan Murdoch wrote:
For example, if it's done with comments (which would be my preference), there should be a way to enter multi-line comments (like /* ... */ in C). If it's done with attributes there needs to be an easy way to put free-form text into the attribute.
I can think of several ways to store a multi-line text attribute: a vector of strings, a string with new-line characters, etc. It would be easiest to import the text from a file, and it would be up to help() to display the information correctly.
Storage isn't a problem, I'm thinking of the user interface. I normally write my functions in a text editor, then source them into R. Other people use a workspace as the primary place to store functions. Both methods should allow for easy addition of lightweight documentation.
I was assuming the use of your third style. At present -- in the absence of
multiline comments -- that would require #ing each comment line at the
start of the function.
Alternatively, you could create a separate text file, say sum.txt, and
define the function as:
sum <- function(x, y) x + y
doc(sum, "sum.txt")
[or sum <- doc(sum, "sum.txt") for an implementation of doc() without side
effects.]
Either method should work whether functions are kept in text files or in
saved workspaces.
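For the side-effecting form `doc(sum, "sum.txt")`, one way it might work is to look up the variable by name in the caller's environment and reassign it there. This is only a sketch under that assumption; the name `doc_inplace()` is hypothetical and is used here to keep it distinct from the value-returning form.

```r
# Hypothetical side-effecting variant: modifies the variable passed as
# `obj` in the caller's environment instead of returning a copy
doc_inplace <- function(obj, file, envir = parent.frame()) {
  name <- deparse(substitute(obj))
  stopifnot(exists(name, envir = envir, inherits = FALSE))
  value <- get(name, envir = envir, inherits = FALSE)
  attr(value, "doc") <- readLines(file, warn = FALSE)
  assign(name, value, envir = envir)
  invisible(value)
}

# A throwaway file stands in for a hand-written add2.txt
notes <- tempfile(fileext = ".txt")
writeLines(c("add2(x, y)", "Add two vectors."), notes)

add2 <- function(x, y) x + y
doc_inplace(add2, notes)
attr(add2, "doc")
```

The value-returning form is easier to reason about, since nothing changes unless the result is assigned; the in-place form saves typing at the cost of modifying the caller's workspace.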
Regards,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox@mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
Hello!
Duncan Murdoch wrote:
. . .
Storage isn't a problem, I'm thinking of the user interface. I normally write my functions in a text editor, then source them into R. Other people use a workspace as the primary place to store functions. Both methods should allow for easy addition of lightweight documentation.
When functions are stored in workspaces, and options(keep.source=FALSE) is used, writing the documentation as comments in the function will not work, because the comments are not kept. So attributes seem preferable, if one goes for lightweight documentation. Kjetil Halvorsen
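The effect of keep.source can be checked with a small experiment. The sketch below parses the same function text twice, once keeping source references and once not, and then looks for the comment; the helper names `source_lines()` and `has_comment()` are made up for this illustration.

```r
src <- c("function(x, y) {",
         "  # Add two vectors",
         "  x + y",
         "}")

# Evaluate the same text with and without source references
with_src <- eval(parse(text = src, keep.source = TRUE))
no_src   <- eval(parse(text = src, keep.source = FALSE))

# Lines available for a function: the kept source if there is one,
# otherwise a deparse of the parsed code
source_lines <- function(f) {
  ref <- attr(f, "srcref")
  if (is.null(ref)) deparse(f) else as.character(ref)
}

has_comment <- function(f) any(grepl("#", source_lines(f), fixed = TRUE))
has_comment(with_src)
has_comment(no_src)
```

Both functions compute the same result; only the one parsed with source references still carries the comment text.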
______________________________________________ R-devel@stat.math.ethz.ch mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-devel
Dear Kjetil,
At 01:37 PM 12/12/2002 -0400, kjetil halvorsen wrote:
Duncan Murdoch wrote:
Storage isn't a problem, I'm thinking of the user interface. I normally write my functions in a text editor, then source them into R. Other people use a workspace as the primary place to store functions. Both methods should allow for easy addition of lightweight documentation.
When functions are stored in workspaces, and options(keep.source=FALSE) is used, writing the documentation as comments in the function will not work, because the comments are not kept. So attributes seem preferable, if one goes for lightweight documentation.
It occurs to me that this behaviour could be modified so that comments at the beginning of a function are kept in any event (perhaps in an attribute).
It seems to me that there are lots of simple ways of implementing the ideas that we've been discussing and that any one of them would probably be reasonable and better than the current situation.
Regards,
John
____________________________
John Fox
Department of Sociology
McMaster University
email: jfox@mcmaster.ca
web: http://www.socsci.mcmaster.ca/jfox
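Keeping the leading comments "in any event" could look something like the sketch below, which copies the initial run of comment lines from a function's source reference into a "doc" attribute and then discards the source. The function name `keep_leading_comments()` is hypothetical, and the sketch assumes the layout where comments come directly after the `function(...) {` line.

```r
# Sketch: move the comments at the top of a function body into a "doc"
# attribute, then drop the source reference
keep_leading_comments <- function(f) {
  ref <- attr(f, "srcref")
  if (is.null(ref)) return(f)            # no source kept, nothing to recover
  body_lines <- as.character(ref)[-1]    # drop the `function(...) {` line
  is_comment <- grepl("^[[:space:]]*#", body_lines)
  # Length of the initial run of comment lines
  n <- if (all(is_comment)) length(body_lines) else which(!is_comment)[1] - 1
  text <- sub("^[[:space:]]*#+[[:space:]]?", "", body_lines[seq_len(n)])
  attr(f, "srcref") <- NULL
  attr(f, "doc") <- text
  f
}

f <- eval(parse(text = c("function(x, y) {",
                         "  # Add two vectors",
                         "  x + y",
                         "}"),
                keep.source = TRUE))
g <- keep_leading_comments(f)
attr(g, "doc")
```

The attribute survives even though the source reference is gone, so the text would remain available in a saved workspace.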