On 22 Mar 2019, at 17:02, Shepherd, Lori <Lori.Shepherd at RoswellPark.org> wrote:
To chime in and expand a bit on Vince's comments:
I feel Bioconductor's position when accepting packages is that, with few exceptions, nothing should be written or saved to a user's directory without the express permission of the user, for fear of overwriting the user's own directories or files that predate the package's intended use, as Vince explained.
For this reason we recommend that the defaults for all functions, and all usage in man pages, vignettes, and tests, write to tempdir()/tempfile() locations. If the package documentation is clear, then it should be understood that in practical use the user should specify a more permanent location for file creation rather than a temporary one.
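As a minimal sketch of that recommendation (the function name write_results and its arguments are hypothetical, not taken from any package), an exporter can default to a temporary file and leave the permanent location to the caller:

```r
# Hypothetical exporter: writes to a temporary file unless the caller names a path.
write_results <- function(x, file = tempfile(fileext = ".csv")) {
    write.csv(x, file, row.names = FALSE)
    invisible(file)  # return the path so the caller can locate the file
}

# Examples, vignettes, and tests rely on the default, so nothing lands in
# the user's own directories.
tmp_path <- write_results(head(mtcars))

# In real use the caller opts in to a permanent location, for example:
# write_results(head(mtcars), file = "~/project/results.csv")
```

Returning the path invisibly means the default remains usable: the caller can still find the temporary file during the session.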
If a file is supposed to persist, BiocFileCache is an option for monitoring and storing files and is becoming a more standard way of organizing files. There is the idea of saving objects to the cache with a given "rname" that would be a unique identifier. Using that identifier, your package or the user should be able to use bfcquery to query the cache and retrieve the file path. As Vince said, this should then be documented in your package. Without thoroughly understanding the implementation of your package, I think this might be of use to you.
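A rough sketch of that workflow follows. The rname "mypkg-demo-matrix" is a made-up identifier, and the cache is placed in a temporary directory only so the sketch is safe to run; a real package would more likely use the default cache or a documented location.

```r
library(BiocFileCache)

# Assumption: a throwaway cache in a temporary directory, for illustration only.
bfc <- BiocFileCache(cache = tempfile("cache_"), ask = FALSE)

# Reserve a path in the cache under a unique rname (hypothetical identifier).
rname <- "mypkg-demo-matrix"
path <- bfcnew(bfc, rname = rname, ext = ".rds")

# The package writes its file to the reserved path.
saveRDS(matrix(1:6, nrow = 2), path)

# Later, the package or the user queries the cache by rname and
# retrieves the file path from the matching resource id.
hits <- bfcquery(bfc, query = rname, field = "rname")
found <- bfcrpath(bfc, rids = hits$rid)
restored <- readRDS(found)
```

Note that bfcquery treats the query as a pattern, so an rname that is a substring of another rname can return more than one row; choosing distinctive identifiers avoids that.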
Less likely to apply: depending on the implementation in your package, you may also find it useful that the bfcadd function has an action = c("copy", "move", "asis") argument, which controls whether the file is moved into the BiocFileCache directory, copied there from its original location, or left in the original location.
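To illustrate the action argument, here is a small sketch using a temporary cache and a temporary source file (both are assumptions made so the example is safe to run; the rnames are made up):

```r
library(BiocFileCache)

# Assumption: a throwaway cache in a temporary directory, for illustration only.
bfc <- BiocFileCache(cache = tempfile("cache_"), ask = FALSE)

# A user-owned file that lives outside the cache.
src <- tempfile(fileext = ".txt")
writeLines("example", src)

# "copy": the cache receives its own copy; the original stays in place.
copied <- bfcadd(bfc, rname = "demo-copy", fpath = src, action = "copy")

# "asis": the cache only records the original location; nothing is
# copied or moved.
tracked <- bfcadd(bfc, rname = "demo-asis", fpath = src, action = "asis")

# "move" (not run here) would relocate the original file into the cache.
```

With "asis" the user keeps the file wherever they created it, while the cache still tracks it and makes it discoverable through bfcquery.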
Cheers,
Lori Shepherd
Bioconductor Core Team
Roswell Park Cancer Institute
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263
From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of Vincent Carey <stvjc at channing.harvard.edu>
Sent: Friday, March 22, 2019 11:55:05 AM
To: Koustav Pal
Cc: bioc-devel; Ferrari Francesco
Subject: Re: [Bioc-devel] What is Bioconductor's position on allowing users to create files in the working directory without an explicit path definition in the filename
Guidelines on this topic do not seem to be present in our web
site; there is a link to Wickham's guide but I don't see that it
confronts the topic. I will make some unofficial and possibly
wrong remarks.
Suppose my function has to create a file "foo.txt". If I do it
in the working folder, I might destroy a user's cherished file.
So I should check to see if the filename I need is already in use.
If it is, I need to do something graceful.
That's a lot of complexity that may never actually be used. Can
we avoid it completely? Here are a few ways to avoid it:
1) Don't create files, just create objects and leave the serialization
task to the user. You can provide helper functions and documentation but
the details of target location of the serialization are left to the user.
2) If you create a file, use R's tempfile/tempdir discipline to avoid
the need for checking for clobber. If the content needs to persist the
user should direct this, again with helpers as needed.
3) If you create a file that should persist, use BiocFileCache as that
addresses the location problem and has an added benefit of obligatory
metadata binding. This is an underused strategy and more pedagogy
is surely in order. If the user "cannot find" what has been made, there
is a systematic approach available that involves querying the cache. Your
documentation will supply all relevant details.
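A minimal sketch of option 1, combined with the "do something graceful" check described earlier, might look like this. The function names compute_summary and save_summary are hypothetical, and stopping with an error is only one possible graceful response:

```r
# Hypothetical analysis step: returns an object and writes nothing to disk.
compute_summary <- function(x) {
    data.frame(mean = mean(x), sd = sd(x))
}

# Hypothetical helper: saves only to a path the caller supplies, and
# refuses to overwrite an existing file unless asked to.
save_summary <- function(obj, path, overwrite = FALSE) {
    if (file.exists(path) && !overwrite) {
        stop("'", path, "' already exists; pass overwrite = TRUE to replace it")
    }
    saveRDS(obj, path)
    invisible(path)
}

res <- compute_summary(c(1, 2, 3, 4))
out <- save_summary(res, tempfile(fileext = ".rds"))

# A second write to the same path fails instead of silently clobbering it.
second <- tryCatch(save_summary(res, out), error = function(e) "refused")
```

Because the caller always supplies the path, the package itself never has to choose a location.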
On Fri, Mar 22, 2019 at 11:38 AM Koustav Pal <koustav.pal at ifom.eu> wrote:
Hello,
My package HiCBricks was submitted and accepted under the previous 3.8
release of Bioconductor.
At the time, during package review, my reviewer had expressed reservations
towards my package creating files in the current working directory.
[REQUIRED] CreateLego() creates HDF5 files in the current directory if no
path is given in the Output.Filename argument. This may clutter the working
directory and it would be better to have the files saved to a temporary
file (or directory) using tempfile() (or tempdir()).
This was with regard to the main output files that were being created by
my package.
I clarified the specific point in question with my reviewer.
The idea behind this package is to create an HDF file for storing
high-resolution Hi-C (can be as large as a user wants) data and keep it as
a persistent copy which the user can access later without having to reload
the file. Therefore, I am a bit averse towards creating a tempfile or
tempdir. Using a temporary file would go against this idea and would
probably result in the user not having access to the file later. I have
incorporated a control statement which will issue a warning regarding file
creation inside the current working directory. Is that ok?
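For illustration only, a control statement of that kind could look roughly like the sketch below. This is not the package's actual code; the function name resolve_output and the wording of the warning are assumptions.

```r
# Hypothetical check: warn when the filename carries no directory component,
# then return the absolute path that will be written.
resolve_output <- function(filename) {
    if (dirname(filename) == ".") {
        warning("No directory given; '", filename,
                "' will be created in ", getwd(), call. = FALSE)
    }
    file.path(normalizePath(dirname(filename)), basename(filename))
}

# A bare filename triggers the warning.
msg <- tryCatch(resolve_output("foo.h5"), warning = function(w) "warned")

# A filename with an explicit directory passes through silently.
p <- resolve_output(file.path(tempdir(), "foo.h5"))
```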
Finally, my reviewer suggested that I make use of the BiocFileCache
package to create files.
The changes so far look good. I understand that tempfile() isn't a great
solution for your package, so may I recommend that you store your data
using the BiocFileCache package
https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html
as opposed to automatically saving the file in a local directory. Once this
change is made, I should be able to accept the package.
I interpreted this as the reviewer expressing reservation towards files
being created in the current working directory without the user's explicit
requirement. Therefore, I made a working implementation of BiocFileCache
within my package, which works perfectly fine.
Yet, users are now having trouble locating files that they expected to
create in the current working directory using the traditional method of
var = "something.txt", because these files were instead created in the
BiocFileCache cache. All the confusion stems from this being a
non-traditional method of keeping track of files and folders.
What is Bioconductor's position regarding this issue?
Can users create files using Bioconductor packages in the current working
directory without an explicit path definition in the filename?
Or did I misinterpret the reviewer's position and this is only a
requirement when the package is being built by the builder?
Koustav Pal,
Post-Doctoral Fellow in Genome Architecture,
Computational Genomics Group,
IFOM - The FIRC Institute of Molecular Oncology,
Via Adamello 16,
20139 Milano, Italy.
Phone: +393441130157
E-mail: koustav.pal at ifom.eu