First R Package --- Advice?

12 messages · ivo welch, Yihui Xie, Duncan Murdoch +2 more

#
Dear R experts---

after many years, I am planning to give in and write my first R
package.  I want to bundle my collection of useful utility
routines.

as my guide, I am planning to use Friedrich Leisch's "Creating R
Packages: A Tutorial" from Sep 2009.  Is there a newer or better
tutorial?  this one is 4 years old.

I also plan on one change---given that the package.skeleton() function
writes all the individual man[ual] functions, I am thinking it would
be a good idea to put the doc and the R code together in the same
file, one for each function.  Interestingly enough, the code is by
default in the \examples{} section, so I am thinking of writing a perl
program that takes every .Rd file and writes the function into the R/
directory, overwriting anything else that is already there.  this way,
I maintain only one file for each function, and the docs and code are
together.  sort of like knuth's literate programming and the
Numerical Recipes approach of keeping each function in its own file
of the same name.
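
For reference, here is a minimal sketch of what package.skeleton() generates, since the plan above starts from its output. The package name sumdiff and the two toy functions are placeholders, everything is written under a temporary directory, and the exact set of generated files may vary somewhat between R versions.

```r
# Toy functions kept in their own environment (placeholders, not real iaw code).
env <- new.env()
env$sum1 <- function(x, y) x + y
env$difference1 <- function(x, y) x - y

# Generate the skeleton in a scratch directory so nothing real is overwritten.
scratch <- tempfile("skeleton-demo-")
dir.create(scratch)
suppressMessages(
  utils::package.skeleton(name = "sumdiff",
                          list = c("sum1", "difference1"),
                          environment = env,
                          path = scratch)
)

# Inspect what was created: one R/ file and one man/ stub per function,
# plus DESCRIPTION and friends.
pkg_dir <- file.path(scratch, "sumdiff")
generated <- list.files(pkg_dir, recursive = TRUE)
print(generated)
```

The man/*.Rd stubs are templates that still have to be filled in by hand, which is the maintenance burden the one-file-per-function idea is trying to avoid.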

I believe my "try-out and debug cycle" will then be

   $ cd iaw  ## the package name and top directory is iaw
   $ perl weaveall.pl   ## extract the code examples from all man/*.Rd files and place them in R/
   $ R CMD INSTALL iaw
   $ R CMD check iaw

good idea?  bad idea?  common?  uncommon?

I do not understand the namespace mechanism yet.  I understand the
NAMESPACE file, and I think this lists the routines that become
visible when a later program of mine contains 'library(iaw)'.  I think
I want to explicitly declare what packages are actually imported.
?importIntoEnv tells me that it is not intended to be used.  how can
another program declare exactly what functions it wants to import?
(frankly, I would love to turn all default autovivification off in my
program, but that's not possible.)

/iaw
----
Ivo Welch (ivo.welch at gmail.com)
#
My short answer is to watch this video by Jeffrey Horner
http://youtu.be/ScV7XXlBZww and learn roxygen2.

And the long answer is to read the manual which has everything you
need: http://cran.r-project.org/doc/manuals/r-release/R-exts.html

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Tue, Feb 5, 2013 at 6:43 PM, ivo welch <ivo.welch at anderson.ucla.edu> wrote:
#
thanks for the responses.  this is what I have learned so far:
primarily, I need to learn roxygen2 and devtools, because they do what
I was planning to reinvent.

roxygen2 allows R users to write R code that embeds its documentation.
the .R user source file has a family resemblance to POD (perl's plain
old documentation), although pod looks cleaner because it does not
need the '#' comments.  all roxygen2 documentation commands are
interspersed as comments, and presumably devtools creates the .Rd
files from such .R files in devtools' package build stage.  it is not
clear to me whether every function must or should get its own .R file
now (just like each R function has its own .Rd file).  in "Getting
Started with RStudio", there is a brief 1-pager on roxygen2, which
gave me a flavor of what it is.  there is also a stackoverflow question with
links at http://stackoverflow.com/questions/4523513/good-reference-for-roxygen
(but this is for roxygen, not roxygen2, I think).  then there is the
standard roxygen2 documentation, but this is just the
function man set and not a "how to get started" description.  I don't
think roxygen2 has a vignette, either.  right now, I am wondering
whether there is a skeleton creator for an existing function for
roxygen2, which generates all the required .Rd fields as appropriate
"#' @" directives in the comments, just like the vanilla package
skeleton creator in R creates them in the .Rd files.

devtools is a package that can take roxygen2 .R files, compile the
embedded comments into .Rd files in the man/ directory, and then
rebuild and reload the package on the fly.  (possibly it also
repackages them.)  roughly speaking, I think it translates the more
convenient new doc format into the old standard basic way to package
packages, and also obviates the need to restart R.  again, this is
what I gleaned from the rstudio guide.

so, the combination of devtools and roxygen2 seems to be exactly what I
wanted, but I did not find the starter docs. what I would love to see
is someone pointing me to a 1-page example of how a new user should
take two real simple functions, say

# my old sum remark
sum1 <- function(x,y) (x+y)
# my old difference remark
difference1 <- function(x,y) (x-y)
stopifnot( sum1(5,5) == 10 )  ## should become a testthat

and make a (silly) package out of it.  this intro should describe how
the user should start by creating the package skeleton (say, package
sumdiff), how one gets the roxygen2 skeletons for the two functions,
how one builds and loads it (for testing), then how one would change
it, e.g., by changing both functions to square the inputs, rebuild and
reload.  someone has probably already written this up, but it is
difficult to find.
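
A minimal sketch of what the two functions above might look like once annotated for roxygen2. The tags used (@param, @return, @examples, @export) are standard roxygen2 tags; the wording of the descriptions and the file name R/sumdiff.R are assumptions made for illustration. The file runs as ordinary R because the documentation lives entirely in comments.

```r
# Hypothetical contents of R/sumdiff.R in a package called sumdiff.

#' Add two numbers
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#' @return The elementwise sum of x and y.
#' @examples
#' sum1(5, 5)
#' @export
sum1 <- function(x, y) x + y

#' Subtract one number from another
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#' @return The elementwise difference x - y.
#' @examples
#' difference1(5, 5)
#' @export
difference1 <- function(x, y) x - y

# Plain-R sanity check, runnable without any package machinery.
stopifnot(sum1(5, 5) == 10, difference1(5, 5) == 0)

# The same check as a testthat test (in a package it would normally live
# under tests/testthat/); only run here if testthat happens to be installed.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("sum1 and difference1 behave", {
    testthat::expect_equal(sum1(5, 5), 10)
    testthat::expect_equal(difference1(5, 5), 0)
  })
}
```

Assuming devtools is installed and the working directory is the package root, devtools::document() should regenerate the man/*.Rd files from these comments and devtools::load_all() should reload the code without restarting R, so the edit-and-retry loop is: change the function, then run those two calls again.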

I read some other package introductions, too, but they end up
overbuilding the example and/or are better references than "get
started guides."  for example, the Leisch packaging introduction
spends more than half of its space and the user's attention span (and
in my case, brain cells) on peripheral aspects.  yes, S3 and S4
functions, extending lm() etc., is all nice and useful, but teaching
the basic logic of the R packaging system may be better done with
functions that do almost nothing.  contemplating the core logic of
lm() extensions is distracting at the first baby steps.  similarly,
rstudio is a really nice IDE, but I don't think that roxygen2 and
devtools need rstudio.

could someone point me to a simpler "starting" document for roxygen2
and devtools, if such exists?

regards,

/iaw
----
Ivo Welch (ivo.welch at gmail.com)
http://www.ivo-welch.info/
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519
Director, UCLA Anderson Fink Center for Finance and Investments
Free Finance Textbook, http://book.ivo-welch.info/
Editor, Critical Finance Review, http://www.critical-finance-review.org/
On Tue, Feb 5, 2013 at 4:43 PM, ivo welch <ivo.welch at anderson.ucla.edu> wrote:
#
I have a "minimal" package here based on roxygen2:
https://github.com/yihui/rmini

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Tue, Feb 5, 2013 at 11:49 PM, ivo welch <ivo.welch at anderson.ucla.edu> wrote:
#
On 13-02-05 7:43 PM, ivo welch wrote:
I have heard of people using noweb to do this, but I can't point to any 
examples.  I'd actually recommend against it.  Good documentation files 
don't make good source files.

If you really want close connections between your source and the user 
documentation, I think that's the job of your IDE.  (I don't find this 
to be a problem, so I don't use an IDE that attempts this, but I believe 
they exist:  I'd look at ESS, RStudio, RKWard if I was in the market for 
that.)

Other people have recommended Roxygen, but honestly I haven't seen a 
package documented with Roxygen where the documentation was adequate.
It looks as though it's great to get initial documentation created, but 
does not appear to encourage followup.

I wouldn't put the last step in this cycle.  Have a separate check 
cycle, which includes a build step, and checks the built tarball.

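
A rough sketch of such a separate check cycle, driven from R. The helper name build_and_check and its run argument are hypothetical; the only facts relied on are that R CMD build names its tarball <Package>_<Version>.tar.gz and that R CMD check accepts that tarball. By default the helper only reports the commands it would run.

```r
# Hypothetical helper: build the package first, then check the built tarball
# rather than the source directory.
build_and_check <- function(pkg_dir, run = FALSE) {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"),
                   fields = c("Package", "Version"))
  tarball <- sprintf("%s_%s.tar.gz", desc[1, "Package"], desc[1, "Version"])
  cmds <- list(
    c("CMD", "build", shQuote(pkg_dir)),  # writes the tarball to the working directory
    c("CMD", "check", tarball)
  )
  if (run) {
    r_bin <- file.path(R.home("bin"), "R")
    for (args in cmds) system2(r_bin, args)
  }
  invisible(list(tarball = tarball, commands = cmds))
}

# Dry run against a throwaway DESCRIPTION (package name iaw, version assumed).
demo_dir <- tempfile("iaw-")
dir.create(demo_dir)
writeLines(c("Package: iaw", "Version: 0.1"), file.path(demo_dir, "DESCRIPTION"))
plan <- build_and_check(demo_dir)
print(plan$tarball)
```

Keeping this out of the quick INSTALL-and-try loop means the slow check only runs when a release candidate is actually being prepared.
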
I am not sure I know what you mean by "program", but the NAMESPACE file 
allows you to declare which functions you want to import from other 
packages.  I think it is not as strict as you want:  if you don't 
declare it, you might still find it, but if you do declare it, you'll 
find that version before any user-created or other-package-created one.
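
To make the import and export declarations concrete, here is a small sketch that writes a hypothetical NAMESPACE file for a package called iaw and reads it back with base R's parseNamespaceFile(). The exported names and the choice of stats::median and utils are illustrative assumptions, not the real contents of any package.

```r
# Write a hypothetical NAMESPACE for "iaw" into a scratch library directory.
lib <- tempfile("lib-")
dir.create(file.path(lib, "iaw"), recursive = TRUE)
writeLines(c(
  "export(sum1, difference1)",   # what becomes visible after library(iaw)
  "importFrom(stats, median)",   # import one specific function
  "import(utils)"                # import everything a package exports
), file.path(lib, "iaw", "NAMESPACE"))

# parseNamespaceFile() is the base routine that reads these directives.
ns <- parseNamespaceFile("iaw", lib)
print(ns$exports)
str(ns$imports)
```

Inside package code, a declared import such as median is found through the package's imports before anything on the search path, which is the precedence described above; undeclared names can still be found by falling through to the search path.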

It might be a good idea for R to allow a package to request the strict 
declaration model, where you need to declare *every* import.  I don't 
know how difficult a change that would be.

Duncan Murdoch
#
I don't think that's a problem with roxygen - I think that's the key
difficulty of writing documentation.  Personally, I find it much easier
to keep roxygen documentation up-to-date than with separate Rd files,
because when I'm modifying the function, the documentation is right
there, reminding me to update it.

Hadley
#
On 02/06/2013 03:31 AM, Duncan Murdoch wrote:
The compiler package in base R is, apparently, developed using noweb 
(https://svn.r-project.org/R/trunk/src/library/compiler/noweb), which provides 
excellent documentation of the code for other developers and is not quite what 
Ivo was suggesting.

This sounds like codetools' findGlobals, which has problems with idioms like 
subset() and with() at least. One would want a general solution for this, rather 
than the current utils::globalVariables.
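
A small sketch of the problem with such idioms, assuming the recommended package codetools is installed. The function big_rows and the column name x are made up for illustration.

```r
# subset() evaluates its condition inside the data frame, so `x` here is
# meant to be a column of df, not a global variable.
big_rows <- function(df) subset(df, x > 1)

if (requireNamespace("codetools", quietly = TRUE)) {
  # findGlobals() cannot know about the non-standard evaluation and
  # therefore lists `x` among the globals, next to `subset` and `>`.
  globals <- codetools::findGlobals(big_rows)
  print(globals)
}
```

In a package, calling utils::globalVariables("x") at top level tells the checker to stop flagging that name, but it does so for the whole package rather than for this one use, which is why it is only a stopgap.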

Martin

  
    
#
I'd argue that there's an important distinction between documenting a
function (how to use it) vs. documenting an algorithm (how it works).
I think noweb can work well for describing how something works, but
it's not so good for describing how to use it (as evidence, see the
400-page latex package manuals that don't help you at all).

Hadley
#
On 06/02/2013 9:44 AM, Martin Morgan wrote:
Actually I was thinking of having the chain of parents of the namespace 
environment stop at the base environment, rather than continuing through 
the base namespace to the global environment and search path.

I'd guess that would mess pretty seriously with things like S3 and S4 
dispatch, and maybe other things too.
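
The current chain of parents can be inspected directly. This sketch walks the enclosing environments of a namespace (stats is used only as a convenient example; the helper name ns_chain is made up) and shows why undeclared names can still be resolved: after the package's imports and the base namespace, lookup continues into the global environment and the rest of the search path.

```r
# Collect the names of all enclosing environments of a package namespace.
ns_chain <- function(pkg) {
  env <- asNamespace(pkg)
  out <- character()
  while (!identical(env, emptyenv())) {
    out <- c(out, environmentName(env))
    env <- parent.env(env)
  }
  out
}

chain <- ns_chain("stats")
print(chain)
```

The first entries are the namespace itself, its imports environment, and the base namespace; everything from R_GlobalEnv onward depends on what the user has attached, which is the part a strict model would cut off.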

Duncan Murdoch
#
On 06/02/2013 9:49 AM, Hadley Wickham wrote:
I agree about that.

Duncan Murdoch
#
a nice aspect of roxygen (and perl pod) is that it compiles into
the standard .Rd files.  it's not a substitute, but a complementary tool.
it may also make a good start for docs when one starts writing code,
and eventually be yanked out in favor of direct changes in the .Rd
files that it has created.

it may or may not be suitable for code which is destined to become a real
package for third parties.  for end users like me, it is about
documenting their own basic code to the point that they themselves
still understand it a year later.  in a sense, roxygen comments (or
perl pod) are more lightweight than .Rd and more heavyweight than
just '#' comments.

/iaw
----
Ivo Welch (ivo.welch at gmail.com)