Documenting classes and methods: was [Rd] Re: R-devel Digest, Vol 3, Issue 23
Gordon Smyth wrote:
I am another person who has had trouble documenting S4 classes and (particularly) methods. The methods package itself is pretty cool by the way, but it is a pity that there are as yet no guidelines on S4 in the "Writing R Extensions" document. I have actually put together a guide on S4 documentation myself for the use of my own lab which is at http://bioinf.wehi.edu.au/limma/Rdocs.html. I don't pretend that the guide is perfect - I can already see problems with it - but it has proved adequate so far for our own use (writing the limma package) and has gained some more general acceptance from the Bioconductor community. I found it hard to use the skeleton documentation provided by promptMethods.
The "structure" of the skeletons (the \alias lines especially) is intended to be used by the help system. You're not meant to "use" these directly, much of the time. The tools for working with the .Rd structure haven't caught up yet, but please don't modify the skeleton's structure arbitrarily.
Suppose for example that I wish to document a method for
generic function 'foo' with argument list (x,y,...) for x of class 'bar1'
and y of class 'bar2':
1. The skeleton .Rd file contains \alias{foo-methods}. If two or more
packages document methods for 'foo', they'll all have the same alias entry,
and the help that a user will get by typing ?"foo-methods" will depend on
which package happens to have been loaded most recently.
Good point, but it concerns the behavior of "?", and it ties in with a number of other issues about multiple packages referring to the same generic function. Not likely to change for 1.7.1, but likely to be different in several ways in 1.8.
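As a rough illustration of the scenario in point 1, the following R sketch defines the generic and one method and then looks at the \alias lines that promptMethods writes. Only 'foo', 'bar1' and 'bar2' come from the discussion; the slot name 'value' and the method body are assumptions made up for the example, and the exact skeleton text may differ between R versions.

```r
library(methods)

# Classes and generic from the example; the 'value' slot is an assumption.
setClass("bar1", representation(value = "numeric"))
setClass("bar2", representation(value = "numeric"))
setGeneric("foo", function(x, y, ...) standardGeneric("foo"))

# Placeholder method body, for illustration only.
setMethod("foo", signature(x = "bar1", y = "bar2"),
          function(x, y, ...) x@value + y@value)

# Write the skeleton to a temporary file and pull out its \alias lines.
rdFile <- tempfile(fileext = ".Rd")
promptMethods("foo", filename = rdFile)
skeleton <- readLines(rdFile)
aliasLines <- grep("\\alias{", skeleton, fixed = TRUE, value = TRUE)
print(aliasLines)
```

The overall alias is built from the generic's name alone, so any package documenting methods for 'foo' in this way ends up with the same 'foo-methods' topic, which is the collision described in point 1.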
2. There seems to be no allowance for documenting extra named arguments for
this method which are not specified in the generic. There is no usage
entry, no argument list, and no process for R CMD check to check the
argument list against the definition of the method. In S3 one can write
\usage{\method{generic}{class}} and it would be nice to have an extension
of this facility for S4 methods. I have been abandoning the skeleton
structure produced by promptMethods and have been using \section{Usage} and
\section{Arguments}.
Seems ok to have separate discussion of arguments, but don't "abandon" the rest of the material in the skeleton (see below). Heavy use of extra arguments in the methods is a little bit worrisome. There is an efficiency penalty, though not likely serious in sizable computations. More basic (this is just my personal view), I like to think of the function as having a single conceptual definition--what it does and (by and large) what arguments it takes to describe what it should do. Then the methods are the implementation. The function description is likely what users, beginning users particularly, want to see. More advanced users and programmers may also be concerned with the implementation. So, most of the time, one would like the function to define the arguments, and the methods to work from these. In some examples of extra arguments (the S3 print() methods, for instance), these are style-setting parameters, or perhaps control parameters for numeric computations. It might be clearer in such cases to say that "..." is always passed to a (class-dependent) parameter-setting function. Documenting that function is then a separate step. Again, this is just by way of what may help users to understand the functions and help designers to write functions cleanly; not suggesting you should be forced to take this route.
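One way to read the suggestion about passing "..." to a parameter-setting function is sketched below in R. This is only an interpretation, not code from the thread: the function 'fooControlBar1', its 'digits' and 'scale' parameters, the 'value' slot and the method body are all hypothetical.

```r
library(methods)

# Classes and generic as in the example; the 'value' slot is an assumption.
setClass("bar1", representation(value = "numeric"))
setClass("bar2", representation(value = "numeric"))
setGeneric("foo", function(x, y, ...) standardGeneric("foo"))

# Hypothetical parameter-setting function for methods on class 'bar1'.
# It is an ordinary function, so it can be documented in the usual way
# with its own \usage and \arguments sections.
fooControlBar1 <- function(digits = 2, scale = 1) {
  list(digits = digits, scale = scale)
}

# The method keeps exactly the formal arguments of the generic and
# hands "..." to the parameter-setting function.
setMethod("foo", signature(x = "bar1", y = "bar2"),
          function(x, y, ...) {
            ctrl <- fooControlBar1(...)
            round(ctrl$scale * (x@value + y@value), ctrl$digits)
          })

result <- foo(new("bar1", value = 1.234), new("bar2", value = 2.345),
              digits = 1)
print(result)
```

With this layout the method's argument list matches the generic's, and the style or control parameters live in one place that can be documented separately.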
3. The aliases for methods are pretty verbose and make the html contents
page for the package look rather cluttered. I have been deleting the
\alias{foo-methods} alias and been replacing \alias{foo,bar1,bar2-method}
with \alias{foo.bar1.bar2}. I know that using a syntactically valid name
for the alias has the potential problem that a function could actually
exist with that name, but I just like to use something shorter.
Don't do that. It's not what you like that counts, it's what works with the ? function, and your change will wipe out the ability of the help functions to identify correctly which method is being documented. For 1.8 (unfortunately, unlikely to be ironed out for 1.7.1), users should be able to get documentation on the method, say, for function f(x,y) corresponding to signature(x = "character", y = "numeric") by the expression method ? f(x="character", y = "numeric") (or something along these lines). In any case, the \alias lines are crucial to going from any way of requesting method documentation to the correct documentation.
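To see why the alias format matters, here is a small R sketch of how a tool could recover the generic and the signature from an alias. The function 'parseMethodAlias' is hypothetical and is not how R's help system is implemented; it only shows that the prompt-style alias can be taken apart mechanically while the shortened form cannot.

```r
# Hypothetical helper: split an alias of the form
# "generic,class1,class2-method" into its parts.
# Returns NULL when the alias does not follow that form.
parseMethodAlias <- function(alias) {
  if (!grepl("-method$", alias)) return(NULL)
  parts <- strsplit(sub("-method$", "", alias), ",", fixed = TRUE)[[1]]
  list(generic = parts[1], signature = parts[-1])
}

promptStyle <- parseMethodAlias("foo,bar1,bar2-method")
shortStyle  <- parseMethodAlias("foo.bar1.bar2")

print(promptStyle)
print(shortStyle)
```

Even a more forgiving parser would struggle with the dotted form, because dots are legal inside both function and class names (for example 'data.frame'), so there is no reliable place to split.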
4. There don't seem to be any guidelines for documenting a method with the generic, if the generic happens to be defined in the same package, or with the object class, if the generic dispatches on only one argument. I know that you have thought about this, and in the document http://developer.r-project.org/moreClassMethodIssues.html you refer to the 'addTo' argument for 'promptMethods'. The 'addTo' argument however has not yet been implemented in R. It would be nice to have a method for finding dynamically all available documentation for methods for a given generic function. I wrote a little prototype function called 'helpMethods' which simply extracts the list of available methods and prompts the user for which help topic they'd like to read. For this to work though, developers need to use a consistent alias system for documenting methods. I haven't seen any package yet which is using the aliases suggested by promptMethods. Do you think there is any value in my S4 documentation guide? Are there errors or misunderstandings in it which should be corrected before it is adopted as a guideline by Bioconductor?
It's a useful document to have. The whole area of documentation and online help is being worked on by a number of people, so there is the "moving target" difficulty. You mention in your document altering the output of the promptMethods skeleton. Adding material, up to a point, is OK, but changing or deleting the "\" lines is not a good idea if you want the documentation to work with R's (evolving) help system. As noted, the \alias lines should be left alone. There are a few other points we can discuss off-list, not directly related to this thread.
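The idea of finding method documentation dynamically, raised in point 4, depends on aliases that can be predicted from the methods themselves. The R sketch below is not the 'helpMethods' prototype mentioned there; 'methodTopics' is a hypothetical helper that only lists the topic names one would expect if every method were documented under the prompt-style alias. The 'value' slot and the method body are again assumptions.

```r
library(methods)

# Example classes, generic and method; slot and body are assumptions.
setClass("bar1", representation(value = "numeric"))
setClass("bar2", representation(value = "numeric"))
setGeneric("foo", function(x, y, ...) standardGeneric("foo"))
setMethod("foo", signature(x = "bar1", y = "bar2"),
          function(x, y, ...) x@value + y@value)

# Hypothetical helper: for generic 'f', build the help topics implied by
# the currently defined methods, assuming prompt-style aliases are used.
methodTopics <- function(f) {
  meths <- findMethods(f)
  if (length(meths) == 0L) return(character(0))
  # One row per method, one column per argument in the signature.
  sigs <- findMethodSignatures(methods = meths)
  paste0(f, ",", apply(sigs, 1L, paste, collapse = ","), "-method")
}

topics <- methodTopics("foo")
print(topics)
```

A lookup of this kind only works if package authors leave the generated \alias lines in place, which is the reason for the advice above.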
Are there major changes planned for the documentation system for S4 methods and classes in R in the near future? Is it worth our while spending time working out guidelines now or should we wait a bit until the situation stabilizes?
Commented on above--yes, changes are in prospect. Bioconductor may want to encourage documentation even before things settle down--really for the people in the project to assess whether guidelines are helpful at this point. As said, there will be some changes for 1.8, mostly additions to the code that processes the online help requests. It's a fairly good guess that the structure of \ lines, esp. the \alias lines, will be kept or extended, not radically changed, so keeping the current prompt output of these lines would be desirable. If there are changes in the structure, you're more likely to see tools to modify what you have if you follow the current prompt output. In the longer run, it would be useful to have a documentation system based on a more modern form (e.g., XML), making possible more powerful online help software. Duncan Temple Lang and others have done some good work on such systems. My crystal ball is very foggy on what will happen with the R community in this direction.
Regards,
John
Best wishes Gordon
Date: Fri, 23 May 2003 15:37:50 -0400
From: John Chambers <jmc@research.bell-labs.com>
Subject: Re: [Rd] Documenting S4 classes; debugging them
To: Duncan Murdoch <dmurdoch@pair.com>
Cc: r-devel@stat.math.ethz.ch
Duncan Murdoch wrote:
1. I'm putting together my first package that uses S4 classes and objects. I'd like to document them, but I'm not sure what the documentation should look like, and package.skeleton doesn't produce any at all for the classes or methods.
Hmm, sounds as if it should. Meanwhile, promptClass and promptMethods generate skeleton documentation.
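For the class side, a minimal R sketch of generating a skeleton with promptClass might look like this. The class 'bar1' and its 'value' slot are placeholders, not taken from Duncan's package, and the skeleton text may vary between R versions.

```r
library(methods)

# Placeholder class for illustration.
setClass("bar1", representation(value = "numeric"))

# Write the class documentation skeleton to a temporary file
# and show the \alias lines it contains.
rdFile <- tempfile(fileext = ".Rd")
promptClass("bar1", filename = rdFile)
classSkeleton <- readLines(rdFile)
print(grep("\\alias{", classSkeleton, fixed = TRUE, value = TRUE))
```

In a real package the filename would normally point into the package's man directory rather than at a temporary file.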
Are there any good examples to follow?
The Bioconductor packages (e.g., Biobase) have some examples.
...
John
Duncan Murdoch
______________________________________________ R-devel@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
-- John M. Chambers jmc@bell-labs.com Bell Labs, Lucent Technologies office: (908)582-2681 700 Mountain Avenue, Room 2C-282 fax: (908)582-3340 Murray Hill, NJ 07974 web: http://www.cs.bell-labs.com/~jmc
--------------------------------------------------------------------------------------- Dr Gordon K Smyth, Senior Research Scientist, Bioinformatics, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3050, Australia Tel: (03) 9345 2326, Fax (03) 9347 0852, Email: smyth@wehi.edu.au, www: http://www.statsci.org