Issue tracking in packages [was: Re: [R] change in read.spss, package foreign?]

19 messages · Thomas Lumley, Kurt Hornik, Seth Falcon +3 more

#
On 9/9/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
Of course this has been discussed a number of times, but since it's being
brought up again let me just add that there is a substantial need for something
here (i.e. something to address the lack of a standard way of communicating
issues in packages, including changes, outstanding bugs, wishlist items, etc.).

I personally put NEWS, WISHLIST and THANKS files in the 'inst'
directory of all my source packages.  This has the effect of copying them to the
top level of the built version so that they are accessible from R via:

   system.file("NEWS", package = "mypackage")

without the burden of having the user retrieve the source. I think we
need something like that in general.

If someone wanted to set it up, it would really be nice to have SourceForge-like
facilities made available for package developers, providing groupware
facilities such as svn for each package, Trac issue tracking for each package,
a home page for each package, an email list for each package, etc.  It's probably
too much ongoing work to provide this on a package-by-package basis, but
an automated system that made it easy for package developers to access all
this in a standard way on a self-serve basis, similar to OpenSVN, would
be feasible if someone wanted to do it.
#
On Fri, 9 Sep 2005, Gabor Grothendieck wrote:
I'm not sure that WISHLIST and THANKS need to be available to people who 
haven't installed the package.   NEWS, on the other hand, really does.

One option (if it doesn't turn out to be too much work for the CRAN 
maintainers) would be to have an optional Changelog field in the 
DESCRIPTION file giving the relative path to the file. This would mean 
that maintainers would not all have to switch to the same format.
eg for foreign
   Changelog: ChangeLog
and for survey
   Changelog: inst/NEWS

This might be enough to make it easy for CRAN to display these when the 
maintainer provides them.

 	-thomas
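
A minimal sketch of how such a field could be read, assuming the proposed Changelog field existed (it is not a standard DESCRIPTION field; the file contents and version number below are made up for illustration). It relies only on base R's read.dcf():

```r
# Hypothetical DESCRIPTION carrying the proposed (non-standard) Changelog field.
desc_file <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: survey",
  "Version: 0.0-1",          # made-up version, for illustration only
  "Changelog: inst/NEWS"
), desc_file)

# read.dcf() returns a character matrix with one column per requested field;
# a field that is absent from the file comes back as NA.
fields <- read.dcf(desc_file, fields = c("Package", "Changelog"))
changelog_path <- fields[[1, "Changelog"]]
print(changelog_path)
```

Because an absent field is simply NA, a maintainer who does not supply the field would not need to change anything.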
#
On 9/9/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
How about if there were just a standard location and name such as inst/NEWS,
inst/WISHLIST, inst/THANKS (which has the advantage that they are automatically
made available in the built package under the current way packages are
built), and then CRAN could just check whether it's there or not -- no need to
change and document the DESCRIPTION file.  The only thing package developers
who want to provide these have to do is use the indicated names and location.
It would still be possible as step 2 to provide your idea, since it's upwardly
compatible.

In fact, even with no software at all there would be an advantage to this since
users would definitively know where to look and would not have to download
the source package just to read this.
#
On Fri, 9 Sep 2005, Gabor Grothendieck wrote:
The problem is that there *isn't* a standard location. As Robert Gentleman 
has pointed out, if you only maintain two or three packages it isn't too 
bad to change them to some new layout, but if you are the bioconductor 
project it gets painful quite quickly.

Also, there are good reasons for having NEWS in the top level directory. 
Nearly everything that isn't an R package does this, because it's a useful 
standard.

 	-thomas
#
On 9/9/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
Yes, I know.  That was the point of my post -- declare a standard location
that everyone would use (or not, but if they don't then people will have a hard
time finding their info, though no worse than now).
Surely there are only a few possibilities in use, and a simple script
could fix them all up.  It's the same problem if you have to modify every
DESCRIPTION file.

At any rate I don't think this should be driven by compatibility with what
is there now, since it's not a difficult one-time transition.
This could be handled by having the build procedure copy NEWS, WISHLIST and
THANKS files at the top level to the build rather than not copying them.  That
way one would not have to put them in the inst directory -- although, unlike my
previous suggestion, this would require modifying the build software, though
it's presumably not a big change.  I agree that this would be worth it.
#

        
> On Fri, 9 Sep 2005, Gabor Grothendieck wrote:
>> 
  >> I personally put NEWS, WISHLIST and THANKS files in the 'inst'
  >> directory of all my source packages.  This has the effect of copying them to the
  >> top level of the built version so that they are accessible from R via:
  >> 

  > I'm not sure that WISHLIST and THANKS need to be available to people who 
  > haven't installed the package.   NEWS, on the other hand, really does.

  > One option (if it doesn't turn out to be too much work for the CRAN 
  > maintainers) would be to have an optional Changelog field in the 
  > DESCRIPTION file giving the relative path to the file. This would mean 
  > that maintainers would not all have to switch to the same format.
  > eg for foreign
  >    Changelog: ChangeLog
  > and for survey
  >    Changelog: inst/NEWS

  > This might be enough to make it easy for CRAN to display these when the 
  > maintainer provides them.

A standard location or a mechanism like the one you describe are both a
similar amount of work (and not much at all): the HTML pages are
generated by Perl and I have the parsed DESCRIPTION file there, i.e.,
using a fixed name or the value of the Changelog field is basically
the same.

.f
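
To illustrate why the two approaches cost about the same once DESCRIPTION is parsed, here is a rough sketch in R (not the Perl code CRAN uses, which is not shown in this thread). The function name find_changelog, the Changelog field, and the list of fallback file names are all assumptions made for the example:

```r
# Resolve a package's change log inside an unpacked source directory.
# Tries the proposed "Changelog" field first, then a list of fixed names.
# Field name and candidate list are assumptions for illustration.
find_changelog <- function(pkg_dir,
                           candidates = c("NEWS", "inst/NEWS", "ChangeLog")) {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Changelog")
  declared <- desc[[1, "Changelog"]]          # NA when the field is absent
  if (!is.na(declared)) candidates <- c(declared, candidates)
  paths <- file.path(pkg_dir, candidates)
  hit <- paths[file.exists(paths)]
  if (length(hit) > 0) hit[[1]] else NA_character_
}

# Usage with a throwaway source tree that relies on a fixed name only.
pkg_dir <- file.path(tempdir(), "demo_pkg")
dir.create(file.path(pkg_dir, "inst"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("Package: demo", "Version: 0.1"),
           file.path(pkg_dir, "DESCRIPTION"))
writeLines("Changes in version 0.1", file.path(pkg_dir, "inst", "NEWS"))
found <- find_changelog(pkg_dir)
print(found)
```

The only difference between the two schemes in this sketch is whether one extra entry is placed at the front of the candidate list.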
#
And similar things could be said about Emacs users with ChangeLog files
in top-level package directories ...

I like the suggestion about using a Changelog (or whatever it would be
called) field in the package DESCRIPTION meta-data.  If we have that, we
could not only use this for repository-side presentation of the package,
but also install such info and have a simple show_package_change_log()
function ...

-k

        
#
On 9/10/05, Friedrich.Leisch at tuwien.ac.at <Friedrich.Leisch at tuwien.ac.at> wrote:
Regarding the two possibilities I think there is an advantage in a fixed
name in a fixed location since one always knows where to look.  The 
extra level of indirection in the DESCRIPTION file just means that one 
has to fill out yet another field and the user can't know where to look
directly, for sure, but must first look it up in the DESCRIPTION file.

I think the DESCRIPTION file idea was motivated by making it easier
for existing packages, but in fact I think it's no harder to rename and
move a file, and maybe easier, than to add a line to the DESCRIPTION
file.   Also I think this should apply not only to NEWS/ChangeLog but
also to THANKS and WISHLIST, and that would mean 3 more lines in the
DESCRIPTION file, so it could rapidly get out of hand.
#
On 9/10/05, Kurt Hornik <Kurt.Hornik at wu-wien.ac.at> wrote:
One could have that without this meta data.  show_package_change_log
could just check if the file is present.
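
A sketch of what that could look like, assuming the file is installed at the top level of the built package. show_package_change_log() is the name floated earlier in the thread, not an existing R function, and the candidate file names are an assumption:

```r
# Look for a change file in an installed package by fixed name only,
# with no DESCRIPTION field involved.
# The candidate file names are an assumption; the thread does not fix them.
show_package_change_log <- function(pkg,
                                    candidates = c("NEWS", "ChangeLog")) {
  for (name in candidates) {
    path <- system.file(name, package = pkg)  # "" when nothing is found
    if (nzchar(path)) {
      file.show(path)
      return(invisible(path))
    }
  }
  message("No change log found for package '", pkg, "'")
  invisible(NA_character_)
}

# A package that is not installed simply yields NA.
res <- show_package_change_log("no.such.package.for.this.example")
```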
#
On 9/10/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
And one more comment.   The DESCRIPTION file does not record the
location or existence of the various subdirectories such as R, man,
exec, etc. If NEWS is to be recorded as a metadata line item in
DESCRIPTION then surely all of these should be too, so that it's symmetric
and they are all on an equal footing (or else none of them
should be, which in fact I think is preferable).
#
On 10 Sep 2005, Kurt.Hornik at wu-wien.ac.at wrote:
For what it's worth, I don't like this idea of adding a ChangeLog field
to the DESCRIPTION file.

Agreeing upon a standard location for NEWS or CHANGES or some such
seems a simpler solution, as long as the presence of such a file
is *optional*.  And if the location really needs to be at the top,
then the build tools could grab it from there as they do the
DESCRIPTION file.

+ seth
#
In which case a Changelog entry in DESCRIPTION would be a very nice
addition, and would have the advantage of not requiring changes to
packages.

 	-thomas
#
On Sat, 10 Sep 2005, Gabor Grothendieck wrote:
I don't see any advantage in symmetry.  The locations of these 
subdirectories are fixed and I can't see why someone trying to decide 
whether to install an upgrade needs to know if it has an exec 
subdirectory before they download the package.

I also don't see why THANKS and WISHLIST should need to be visible before 
you download the package.  CRAN does display a URL if one is given, and if 
these are important they could be at that URL.

The changelog, on the other hand, is one piece of information that is 
really valuable in deciding whether or not to update a package, so it 
would be worth having it visible on CRAN.  Since other coding standards 
suggest different things for the name and location of this file, a path in 
DESCRIPTION seems a minimal change.

 	-thomas
#
On Sat, 10 Sep 2005, Seth Falcon wrote:
We're certainly agreed on its being optional.

 	-thomas
#
>>  Standard location or a mechachanism like the one you
    >> describe are both similar amount of work (and not much at
    >> all), the HTML pages are generated by perl and I have the
    >> parsed DESCRIPTION file there, i.e., using a fixed name
    >> or the value of the Changelog field is basically the
    >> same.
    >> 

    TL> In which case a Changelog entry in DESCRIPTION would be a
    TL> very nice addition, and would have the advantage of not
    TL> requiring changes to packages.

yes *and* does allow slightly more flexibility with almost
no cost, as Fritz confirmed.

And, BTW, Gabor,  NEWS and ChangeLog are not at all the same
thing and it would be silly to urge users to one of them.
At least 'ChangeLog' is a well defined format for emacs users
that can very quickly be updated semi-automagically
("C-x 4 a" when you're in file  foo.R with function myfun(.)
 autogenerates a neat entry in a ChangeLog file);
but then really people should be allowed to use other formats
for good reasons.

Martin
#
On 9/10/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
The present discussion is where the change information may be located
but that is also true of the source and other information.    We could
just as easily have a field in the DESCRIPTION that tells the build
where to find the R source.
It's really the same issue.
That is a different issue which has not been discussed up to now.
I agree that that would be desirable.  It does seem independent
of the other issues discussed.  If CRAN processing speed can be
enhanced then I see no reason other than work involved to have the
build automatically enter a DESCRIPTION field of News: Yes
However, to make the user fill out another field and to burden
the user with having to look at DESCRIPTION first seems 
to add complexity without benefit.

I can think of one intermediate situation.  The source DESCRIPTION
has the path to the NEWS which the build grabs and puts it in
a standard place in the built package.  However, if we allow that for 
the NEWS then we should allow it for all components rather than
an inconsistent approach.
Either way would be ok in my opinion.
There is no current standard. This is our chance to make it the same
for all packages and therefore easier for all users.


In short, how about we have a standard name and location for
the NEWS, cvs/svn log, WISHLIST, THANKS in the source
package.  The build would maintain their locations and, in
the case of NEWS and the svn/cvs log enter lines in the
DESCRIPTION file such as:

NEWS: Yes
ChangeLog: Yes

for sake of CRAN processing speed (if it turns out that
this does make a material difference which it may not).

This would seem to satisfy all requirements.  It's simple,
it's easy to move to since one just renames or renames
and moves one's files (without the need for modifying the
DESCRIPTION file in every package or having yet more fields
in the DESCRIPTION file), and it's easy for the
user since they know where everything is supposed to be
located without a complicating level of indirection.
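
As a rough sketch of the build-time step described above, assuming the files sit at the top level of the source package: the function name flag_optional_files and the exact field names are hypothetical, and no existing R CMD build behaviour is implied.

```r
# Hypothetical build step: record which optional files a source package
# ships by appending Yes-valued fields to its DESCRIPTION.
flag_optional_files <- function(pkg_dir, files = c("NEWS", "ChangeLog")) {
  desc_path <- file.path(pkg_dir, "DESCRIPTION")
  desc <- read.dcf(desc_path)                       # all existing fields
  present <- files[file.exists(file.path(pkg_dir, files))]
  present <- setdiff(present, colnames(desc))       # do not add a field twice
  if (length(present) > 0) {
    flags <- matrix("Yes", nrow = 1, ncol = length(present),
                    dimnames = list(NULL, present))
    desc <- cbind(desc, flags)
  }
  write.dcf(desc, desc_path)
  invisible(desc)
}

# Usage on a throwaway source tree that ships NEWS but no ChangeLog.
pkg_dir <- file.path(tempdir(), "flag_demo")
dir.create(pkg_dir, showWarnings = FALSE)
writeLines(c("Package: demo", "Version: 0.1"),
           file.path(pkg_dir, "DESCRIPTION"))
writeLines("Changes in version 0.1", file.path(pkg_dir, "NEWS"))
flag_optional_files(pkg_dir)
updated <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
print(colnames(updated))
```

The maintainer never edits DESCRIPTION in this scheme; the fields are derived from what is actually in the package.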
#
On Sat, 10 Sep 2005, Gabor Grothendieck wrote:

            
There are two important differences

1/ No existing package has its source anywhere other than in the R 
subdirectory. Existing packages have their change logs in different places 
and different formats.

2/ Having source code where it will not be found must be an error -- 
making the source code available to R *cannot* be optional.  Making a 
change log available *must* be optional.


 	-thomas
#
On 9/10/05, Thomas Lumley <tlumley at u.washington.edu> wrote:
In terms of the source package, the source code is in the R
subdirectory because it's been standardized that way and the
R CMD tools support it.  It could, in principle, be anywhere and brought
into the built package at build time had it not been designed that
way.  The same is true of the change information.  The point is
that there is really no difference in principle between the two.

Furthermore, what existing packages do is not important, since it's no harder
and probably easier to adapt to the standard scheme.  Even if that
were not the case I don't think that that should drive the design.
Source code is optional too.  One can create a package with no
R subdirectory.  In fact the only thing you cannot leave out and
still pass R CMD check is the DESCRIPTION file.


There really is no difference between change information and the
source.  Both could be in the source package or not in the source
package and just brought into the built package at
build time depending on how the build process is designed.

Also in both cases the advantage of having everything in the
source package is that the built package can be guaranteed
to be built from the source package.
#
>>> Standard location or a mechachanism like the one you
  >>> describe are both similar amount of work (and not much at
  >>> all), the HTML pages are generated by perl and I have the
  >>> parsed DESCRIPTION file there, i.e., using a fixed name
  >>> or the value of the Changelog field is basically the
  >>> same.
  >>> 

  TL> In which case a Changlog entry in DESCRIPTION would be a
  TL> very nice addition, and would have the advantage of not
  TL> requiring changes to packages.

  > yes *and* does allow slightly more flexibility with almost
  > no cost, as Fritz confirmed.

Well, as Kurt pointed out in another (?) thread "CRAN is not the R
universe", and, e.g., Seth might have another opinion when it comes to
BioC administration. But I don't think you can (or should) do many
sensible computations on packages without having parsed the
DESCRIPTION file, so the "almost no cost" statement should be pretty
safe.


  > And, BTW, Gabor,  NEWS and ChangeLog are not at all the same
  > thing and it would be silly to urge users to one of them.
  > At least 'ChangeLog' is a well defined format for emacs users
  > that can very quickly be updated semi-automagically
  > ("C-x 4 a" when you're in file  foo.R with function myfun(.)
  >  autogenerates a neat entry in a ChangeLog file);
  > but then really people should be allowed to use other formats
  > for good reasons.

I fully agree.

.f