
Underscores in package names

Martin,

Thank you for discussing this amongst R-core and for detailing the
R-core discussion here.

Here are some specific examples where having underscores available
would have been useful.

1. My primerTree package (2013) was originally primer_tree, but I had
to change the name to camelCase to comply with the check requirements.
Using camelCase in the package name makes reading code jarring, as the
functions all use snake_case.
2. The widely used testthat package would likely be called test_that,
like the corresponding function within the package. This also
highlights one of the drawbacks of the current situation: without
separators the package name is more difficult to read. Does it have
two t's or three?
3. The assertive suite of packages uses `.` for separation, e.g.
`assertive.base`, `assertive.datetimes` etc., but all functions within
the packages use `_` separators. Again, this was likely done out of
necessity rather than desire.

There are many more, I am sure; these are just some that came
immediately to mind. More important than the specific examples is the opportunity
cost of having this restriction, which we cannot really quantify.

Using dots for separators has a number of practical problems.
Functions using dots are ambiguous, e.g. is `as.data.frame()` a
regular function, an `as.data()` method for a `frame` object, or an
`as()` method for a `data.frame` object? And in fact regular functions
can be accidentally promoted to S3 methods by defining an S3 generic,
which does actually happen in real life, confusing users [1]. While
package names are not functions, using dots in package names
encourages the use of dots in function names, a dangerous practice.
Dots in names are also one of the common stones cast at R as a
language, as dots are used for object-oriented method dispatch in
other common languages.
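
As a minimal sketch of the accidental promotion described above (all
names here, `summarise`, `summarise.frame` and the class `frame`, are
hypothetical and chosen only for illustration), a plain helper with a
dot in its name starts acting as an S3 method as soon as a generic
with a matching prefix exists:

```r
# A regular helper whose name happens to contain a dot (hypothetical).
summarise.frame <- function(x) "regular helper called"

# Later, someone defines an S3 generic called `summarise` (hypothetical).
summarise <- function(x, ...) UseMethod("summarise")
summarise.default <- function(x, ...) "default method called"

# An object whose class happens to be "frame".
obj <- structure(list(), class = "frame")

# Dispatch finds summarise.frame and treats it as the method for class
# "frame", although it was never written as one.
result <- summarise(obj)
print(result)
```

With `_` as the separator (e.g. a helper named `summarise_frame`), the
helper could not be mistaken for a method and the default method would
be used instead.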

Dotted function names are the only major naming convention whose
prevalence is steadily decreasing over time. They now account for only
around 15% of all function names when looking at all 94 million lines
of code currently available on CRAN (see Figure 2 of Yen et al.
[2]).

Thanks again for the public discussion,

Jim

[1]: https://twitter.com/_ColinFay/status/1105579764797108230
[2]: https://osf.io/preprints/socarxiv/ts2wq/

On Wed, Aug 14, 2019 at 5:16 AM Martin Maechler
<maechler at stat.math.ethz.ch> wrote: