Hi, So now R CMD check starts to warn against :::, but I believe sometimes it is legitimate to use it when developing R packages. For example, I have some utils functions that are not exported but I want to share them across the packages that I maintain. I do not need to coordinate with other authors about these internal functions since I'm the only author and I know clearly what I'm doing, and I want to avoid copying and pasting the code across packages just to avoid the NOTE in R CMD check. What should I do in this case? Regards, Yihui -- Yihui Xie <xieyihui at gmail.com> Web: http://yihui.name Department of Statistics, Iowa State University 102 Snedecor Hall, Ames, IA
legitimate use of :::
34 messages · Michael Friendly, Peter Dalgaard, Hadley Wickham +12 more
On 22.08.2013 07:45, Yihui Xie wrote:
Hi, So now R CMD check starts to warn against :::, but I believe sometimes it is legitimate to use it when developing R packages. For example, I have some utils functions that are not exported but I want to share them across the packages that I maintain. I do not need to coordinate with other authors about these internal functions since I'm the only author and I know clearly what I'm doing, and I want to avoid copying and pasting the code across packages just to avoid the NOTE in R CMD check. What should I do in this case?
Nothing. The way you describe above seems to be a reasonable usage, iff you are the same maintainer who knows what is going on. Other maintainers should not use one of your not exported (hence non API) functions, of course. Uwe Ligges
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
On 8/22/2013 7:45 AM, Uwe Ligges wrote:
Nothing. The way you describe above seems to be a reasonable usage, iff you are the same maintainer who knows what is going on. Other maintainers should not use one of your not exported (hence non API) functions, of course. Uwe Ligges
Related to this is the use of other-package unexported utility functions that don't pass Uwe's iff test, but that I, as maintainer, want to use in my package. Cases in point: in heplots, I had used stats:::Pillai, stats:::Wilks, stats:::Roy and stats:::HL for calculation in one of my functions. Similarly, I had a need to use car:::df.terms, also unexported, but don't want to ask John Fox to export it just for my use. Uwe's reply suggests that I should not be using car:::df.terms, however. To avoid the NOTEs (which often trigger a 'pls fix' upon submission to CRAN), I simply copied/pasted these functions to my package, but this seems wasteful. -Michael
Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. & Chair, Quantitative Methods York University Voice: 416 736-2100 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA
Dear Michael and Uwe,
-----Original Message----- From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Michael Friendly Sent: Thursday, August 22, 2013 2:57 PM To: Uwe Ligges Cc: R-devel Subject: Re: [Rd] legitimate use of :::
On 8/22/2013 7:45 AM, Uwe Ligges wrote:
On 22.08.2013 07:45, Yihui Xie wrote:
Hi, So now R CMD check starts to warn against :::, but I believe sometimes it is legitimate to use it when developing R packages. For example, I have some utils functions that are not exported but I want to share them across the packages that I maintain. I do not need to coordinate with other authors about these internal functions since I'm the only author and I know clearly what I'm doing, and I want to avoid copying and pasting the code across packages just to avoid the NOTE in R CMD check. What should I do in this case?
Nothing. The way you describe above seems to be a reasonable usage, iff you are the same maintainer who knows what is going on. Other maintainers should not use one of your not exported (hence non API) functions, of course. Uwe Ligges
Related to this is the use of other-package unexported utility functions that don't pass Uwe's iff test, but that I, as maintainer, want to use in my package. Cases in point: in heplots, I had used stats:::Pillai, stats:::Wilks, stats:::Roy and stats:::HL for calculation in one of my functions. Similarly, I had a need to use car:::df.terms, also unexported, but don't want to ask John Fox to export it just for my use. Uwe's reply suggests that I should not be using car:::df.terms, however. To avoid the NOTEs (which often trigger a 'pls fix' upon submission to CRAN), I simply copied/pasted these functions to my package, but this seems wasteful.
I think that the ideal solution is for everyone to export functions that someone else might want, but it's hard to anticipate what these are, and it would be useful then to differentiate functions that are meant for "end" users from those meant for developers. Maybe packages could document the latter in something like a Utilities.Rd file. Probably there's a better, more formal, solution. The stats:::Pillai, Wilks, HL, and Roy functions seem reasonable candidates for export -- I too use these functions, in the car package, and have resorted to the fix that Michael adopted. I'd be happy to export df.terms, but would rather segregate it from end-user functions. It's also clear to me that enforcing namespace conventions more consistently, which is certainly desirable in the abstract, opens a can of worms, especially for the CRAN administrators. One hopes that we'll all survive the process and will have better packages in the end. My two cents. John
On Aug 22, 2013, at 20:57 , Michael Friendly wrote:
Cases in point: in heplots, I had used stats:::Pillai, stats:::Wilks, stats:::Roy and stats:::HL for calculation in one of my functions.
That particular case has been on what remains of my conscience for some time....
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
To avoid the NOTEs (which often triggers a 'pls fix' upon submission to CRAN), I simply copied/pasted these functions to my package, but this seems wasteful.
Wasteful of disk space, but disk space is cheap. It's less wasteful of your time if the referenced code breaks at an unexpected time. Your time is much more valuable than disk space. A gigabyte of disk space costs about $0.10, so even if you value your time at a very conservative rate of $100 / hour, you should only spend an hour of your time reducing package size if it saves at least 1 TB of disk space. That's a lot of copies of a function! Hadley
Chief Scientist, RStudio http://had.co.nz/
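The break-even figure in the message above can be checked with a few lines of R. This is only a sketch of the arithmetic: the two prices are the ones quoted in the message, and the 2 KB helper size is a made-up number used purely to give a sense of scale.

```r
# Figures quoted in the message above (taken as given).
cost_per_gb <- 0.10   # dollars per gigabyte of disk space
hourly_rate <- 100    # dollars per hour of developer time

# Disk space whose cost equals one hour of developer time.
break_even_gb <- hourly_rate / cost_per_gb
break_even_tb <- break_even_gb / 1000   # decimal terabytes

# Hypothetical size of one copied helper function, for scale only.
helper_kb <- 2
copies_needed <- break_even_gb * 1e6 / helper_kb

cat(break_even_gb, "GB =", break_even_tb, "TB\n")
cat("copies of a", helper_kb, "KB helper:", format(copies_needed, big.mark = ","), "\n")
```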
r63654 has fixed this particular issue, and R-devel will no longer warn against the use of ::: on packages of the same maintainer. Regards, Yihui
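To make the "same maintainer" condition concrete, here is a minimal sketch that compares the Maintainer fields of two installed packages. It is not the code behind R CMD check and makes no claim about how r63654 implements the test; same_maintainer() is a hypothetical helper, and utils::maintainer() is the only R function it relies on.

```r
# Illustrative helper (not part of R): TRUE when two installed packages
# declare exactly the same Maintainer field in their DESCRIPTION files.
same_maintainer <- function(pkg_a, pkg_b) {
  identical(utils::maintainer(pkg_a), utils::maintainer(pkg_b))
}

# stats and utils both ship with R itself, so their Maintainer fields match.
print(same_maintainer("stats", "utils"))
```

A real check would presumably need to cope with differently formatted names and addresses; exact string equality is the simplest possible assumption.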
On Thu, Aug 22, 2013 at 2:03 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
To avoid the NOTEs (which often triggers a 'pls fix' upon submission to CRAN), I simply copied/pasted these functions to my package, but this seems wasteful.
Wasteful of disk space, but disk space is cheap. It's less wasteful of your time, if the referenced code breaks in an unexpected time. Your time is much more valuable than disk space. A gigabyte of disk space costs about $0.10, so even if you value your time at a very conservative rate of $100 / hour, you should only spend an hour of your time reducing package size if it saves at least 1 TB of disk space. That's a lot of copies of a function!
A bigger issue is source-code license conflicts; you may cut'n'paste GPL code into a distribution that is under another license. /Henrik
Wasteful of disk space, but disk space is cheap. It's less wasteful of your time, if the referenced code breaks in an unexpected time. Your time is much more valuable than disk space.
On the other hand, it's quite dangerous software design. What if the original author finds a bug and implements a fix, but you don't hear about it? Furthermore, what happens when I come along and need the same functionality? Sure I could make a copy, but maybe I only know about your copy and don't know it is a copy of something else, so now we have a copy of a copy, which is even more problematic. Using ::: prevents this issue.
There are costs and benefits to both approaches. Copy-and-paste also minimises external dependencies which can be important in some cases. I'm not arguing for unmitigated duplication, but there are definitely good reasons to do it. I have quite a few v. simple functions that live in multiple packages. Often I want to keep the dependencies of packages as lightweight as possible (learning from past experiences) and avoid tightly coupling packages together. Hadley
Chief Scientist, RStudio http://had.co.nz/
Dear Peter,
-----Original Message----- From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of peter dalgaard Sent: Thursday, August 22, 2013 4:45 PM To: Michael Friendly Cc: R-devel; Uwe Ligges Subject: Re: [Rd] legitimate use of :::
On Aug 22, 2013, at 20:57 , Michael Friendly wrote:
Cases in point: in heplots, I had used stats:::Pillai, stats:::Wilks, stats:::Roy and stats:::HL for calculation in one of my functions.
That particular case has been on what remains of my conscience for some time....
Happily, it would be easy to relieve your conscience in this matter. Best, John
Another point to consider is that copying someone else's code forces you to become a maintainer of the copied code. If there are any bug fixes/enhancements/what-have-you in the original you won't get those updates. So now you own the copied code and need to consider the cost of the codebase diverging (from the original).
On Thu, Aug 22, 2013 at 4:52 PM, Brian Rowe <rowe at muxspace.com> wrote:
Another point to consider is that copying someone else's code forces you to become a maintainer of the copied code. If there are any bug fixes/enhancements/what-have-you in the original you won't get those updates. So now you own the copied code and need to consider the cost of the codebase diverging (from the original).
Sometimes that's a good thing - you're equally insulated from the original maintainer changing the function to work in a way that you don't like. Again, I'm not arguing that copy-and-paste is necessarily the right solution, but it's not necessarily the wrong solution either - it depends on the context. Hadley
Chief Scientist, RStudio http://had.co.nz/
If ::: is disallowed then it's likely that package developers will need to export more functions to satisfy the consumers of those otherwise hidden functions, but if more functions are exported then there will be a greater likelihood of conflicts among packages. The problem seems to be that there are potentially three sorts of functions here:
1. a function is hidden
2. a function is accessible via ::: but is not on the search path
3. a function is on the search path
The problem arises in attempting to force fit these three concepts into only two categories, either by removing the first category (as was done previously) or by removing the second category (which seems to be the new approach).
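The three categories can be seen in a running session. The sketch below uses the stats package as a stand-in and picks an arbitrary unexported function rather than naming one, since which functions are internal may change between R versions; make_counter() and helper() are hypothetical names for illustration.

```r
ns <- asNamespace("stats")
exported <- getNamespaceExports("stats")

# Names that live in the namespace but are not exported.
internal <- setdiff(ls(ns, all.names = TRUE), exported)
is_fun <- vapply(internal, function(n) is.function(get(n, envir = ns)), logical(1))
name <- internal[is_fun][1]

# Category 2: reachable inside the namespace (which is what ::: looks in),
# but not among the exports that end up on the search path.
reachable <- exists(name, envir = ns, inherits = FALSE)
is_exported <- name %in% exported

# Category 3: an exported function, found on the search path when stats is attached.
median_exported <- "median" %in% exported

# Category 1: truly hidden. helper() exists only inside the closure.
make_counter <- function() {
  helper <- function(x) x + 1
  function(x) helper(x)
}
counter <- make_counter()
helper_visible <- exists("helper", envir = globalenv(), inherits = FALSE)

cat("unexported example:", name, "\n")
```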
You raise an interesting point that I've mulled over a bit: namespace collisions. How many of these issues would go away if there were a better mechanism for managing namespaces? E.g., in other languages you can control which objects/modules you wish to import from a library. Under this regime I think package developers would be less concerned about exposing functions that otherwise would be private.
On 23.08.2013 00:36, Brian Lee Yung Rowe wrote:
You raise an interesting point that I've mulled over a bit: namespace collisions. How many of these issues would go away if there were a better mechanism for managing namespaces? eg in other languages you can control which objects/modules you wish to import from a library. Under this regime I think package developers would be less concerned about exposing functions that otherwise would be private.
Exactly, the corresponding NAMESPACE directive is importFrom() and it should be used.
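For readers unfamiliar with the directive, a NAMESPACE file using it might look roughly like the fragment below. The packages A and B and the functions f and my_summary are hypothetical placeholders; only the importFrom() and export() directives themselves are standard.

```r
# NAMESPACE of a hypothetical package A
importFrom(B, f)            # import only f from hypothetical package B; B must export f
importFrom(stats, median)   # the same mechanism with a real package
export(my_summary)          # hypothetical function that A itself exports
```

Inside A's code, f and median can then be called without qualification, while nothing else from B or stats is imported into A's namespace.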
On Aug 22, 2013, at 6:27 PM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
If ::: is disallowed then its likely that package developers will need to export more functions to satisfy the consumers of those otherwise hidden functions but if more functions are exported then there will be a greater likelihood of conflicts among packages. The problem seems to be that there are potentially three sorts of functions here: 1. a function is hidden 2. a function is accessible via ::: but is not on the search path 3. a function is on the search path
Not entirely right: If the package or only parts of it are imported via importFrom by another package, the package's namespace is loaded but the package is not attached, hence not on the search path. Best, Uwe Ligges
The problem arises in attempting to force fit these three concepts into only two categories either by removing the first category (as was done previously) or by removing the second category (which seems to be the new approach).
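The distinction Uwe draws can be observed directly: loading a namespace, which is what an import triggers, does not put the package on the search path. A minimal sketch, using the tools package as an arbitrary example and assuming a fresh session in which nobody has called library(tools):

```r
# Load the namespace without attaching it (no library() call involved).
invisible(loadNamespace("tools"))

namespace_loaded <- isNamespaceLoaded("tools")
attached <- "package:tools" %in% search()

# Exported functions remain reachable by explicit qualification.
ext <- tools::file_ext("report.pdf")

cat("loaded:", namespace_loaded, " attached:", attached, "\n")
```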
On Thu, Aug 22, 2013 at 6:41 PM, Uwe Ligges
<ligges at statistik.tu-dortmund.de> wrote:
Not entirely right: If the package or only parts of it are imported via importFrom by another package, the package's namespace is loaded but the package is not attached, hence not on the search path.
OK, but it is still true that under the new rules, to use importFrom(B, f) in package A, f must be exported by B. That implies that f could cause a conflict when B is placed on the search path via library(B) by some other package (package C) or by the user. f is either exported by B or not. If f is exported by B, then f will be placed on the search path whenever B is placed on the search path, and if f is not exported, then A can't import it. That is, there is no way for B to declare a function to be importable by another package without having that function also placed on the search path whenever B is attached by a library(B) call or a Depends: B from another package.
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
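The kind of conflict described above can be simulated without building any packages. In this sketch two plain lists stand in for attached packages B and C that both export a function named f; the names demo:B and demo:C are invented for the illustration.

```r
# Stand-ins for two attached packages that both export f.
pkg_b <- list(f = function() "f from B")
pkg_c <- list(f = function() "f from C")

attach(pkg_b, name = "demo:B", warn.conflicts = FALSE)
attach(pkg_c, name = "demo:C", warn.conflicts = FALSE)  # attached later, so searched first

unqualified <- f()                           # whichever f comes first on the search path
qualified <- as.environment("demo:B")$f()    # explicit lookup sidesteps the masking

# Clean up the search path.
detach("demo:C", character.only = TRUE)
detach("demo:B", character.only = TRUE)

cat("unqualified call:", unqualified, "\n")
cat("qualified call:  ", qualified, "\n")
```

With real packages the explicit form would be B::f, which is why an exported-but-rarely-needed f is a nuisance mainly for unqualified calls at the user level.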
On Thu, Aug 22, 2013 at 7:57 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
My understanding is that lookup happens in the imports before moving on to the search path, so if I understand you correctly I don't think that is an issue. If A also *exported* f, that would be a problem...
A can only import f from B if f has been exported from B, so while it's not a problem for A, whenever anyone issues a library(B), f will be visible on the search path -- the problem of potential conflicts with f remains. B really only exported f so that A could import it, but a side effect of that is that anyone who puts B on the search path makes f visible.
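The lookup order mentioned in the quoted message (imports before the search path) can be inspected by walking the enclosing environments of a package namespace. The sketch below does this for stats; the same chain is assumed to hold for any package with a namespace.

```r
ns <- asNamespace("stats")         # the package's own functions, exported or not
imports <- parent.env(ns)          # everything the package imports
base_ns <- parent.env(imports)     # the base namespace
after_base <- parent.env(base_ns)  # the global environment, then the search path

chain <- c(environmentName(ns),
           environmentName(imports),
           environmentName(base_ns),
           environmentName(after_base))
print(chain)
```

So code inside a package resolves an imported f before it ever consults the search path, which is why the conflict discussed here affects users of library(B) rather than package A.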
This is what I was getting at as well. It would be great to have a call like
require(package, c('function.1','function.2'))
or similar that gives users granular control over what gets imported in the shell. I would be drunk with joy if the same mechanism could be used to automatically populate the package directives.
On Aug 22, 2013, at 8:01 PM, Peter Meilstrup <peter.meilstrup at gmail.com> wrote:
It would be nice if the functionality of importFrom() and import() were available to user level code, rather than just to people building packages for distribution. One most often encounters namespace conflicts at the user level, when loading two packages that have no logical connection other than both bearing on your problem of the moment. R conflates "having namespaces" with "having a library distribution mechanism" and while its library distribution mechanism is top notch, most modern languages do not require you to learn the distribution procedure in order to just have namespaces. For instance, in Python you merely put code in a file called foo.py and then in any other file in the same directory you type "from foo import functionName". I.e., using namespaces does not require you to build/install packages. Python namespaces are also hierarchical so that the question of this thread would easily be resolved by putting functions into foo._internal and in other packages typing "from foo._internal import *". Peter
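Base R does offer a rough approximation of file-level namespaces without building a package: a script can be evaluated into its own environment with sys.source(). The sketch below is only an illustration of that idea, not an equivalent of importFrom(); the temporary file stands in for a hypothetical foo.R, and helper() and double_twice() are made-up functions.

```r
# Write a small script to a temporary file, standing in for "foo.R".
script <- tempfile(fileext = ".R")
writeLines(c(
  "helper <- function(x) x * 2",
  "double_twice <- function(x) helper(helper(x))"
), script)

# Evaluate the script inside its own environment instead of the workspace.
foo <- new.env()
sys.source(script, envir = foo)

result <- foo$double_twice(3)   # qualified access, much like foo.double_twice in Python
leaked <- exists("helper", envir = globalenv(), inherits = FALSE)
unlink(script)

cat("result:", result, " helper leaked into workspace:", leaked, "\n")
```

Unlike a package namespace, nothing here distinguishes exported from internal functions; everything in foo is reachable through foo$.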
Peter Meilstrup: (05:01PM on Thu, Aug 22)
One most often encounters namespace conflicts at the user level, when loading two packages that have no logical connection other than both bearing on your problem of the moment.
Unless I'm mistaken, you can reassign the hidden functions, i.e.,
fna <- joespackage:::usefulfunction
fnb <- janespackage:::usefulfunction
which is a little bit of a pain, but makes the user's code unambiguous. This also works with two colons for explicitly exported functions.
Gray Calhoun, Assistant Professor of Economics at Iowa State http://gray.clhn.co (web)
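A runnable version of the same idea, using two real functions with confusingly similar names in place of the hypothetical joespackage and janespackage. Both are exported, so :: is enough here; the local names stats_filter and base_filter are arbitrary.

```r
# Bind fully qualified functions to local, unambiguous names.
stats_filter <- stats::filter   # linear filtering of a series
base_filter <- base::Filter     # keep elements that satisfy a predicate

smoothed <- stats_filter(1:5, rep(1/3, 3))                  # 3-point moving average
evens <- base_filter(function(x) x %% 2 == 0, 1:10)

print(evens)
```

In user scripts this costs nothing. Inside a package submitted to CRAN, the three-colon form applied to another maintainer's package is what draws the NOTE discussed in this thread.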
Dear Gray, On Thu, 22 Aug 2013 19:41:58 -0500
Gray <gray at clhn.co> wrote:
Peter Meilstrup: (05:01PM on Thu, Aug 22)
One most often encounters namespace conflicts at the user level, when loading two packages that have no logical connection other than both bearing on your problem of the moment.
Unless I'm mistaken, you can reassign the hidden functions, ie fna <- joespackage:::usefulfunction fnb <- janespackage:::usefulfunction
This will now generate a note from R CMD check and an objection from the CRAN administrators. Best, John
which is a little bit of a pain, but makes the user's code unambiguous. This also works with two colons for explicitly exported functions. -- Gray Calhoun, Assistant Professor of Economics at Iowa State http://gray.clhn.co (web)