
[Bioc-devel] Including large files for the package

Hi Ali,

Looking at the files, it seems that although the file extension is .xls,
they're actually just plain-text TSV files.  They compress well with
standard tools, and R can easily read a TSV compressed with something like
gzip.  I wonder if you've considered just compressing the files and
otherwise using them as they are.  The Hub approaches are neat, but maybe
overkill if the files are < 2MB compressed.
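
As a rough sketch of the compress-and-read-as-is idea (the data frame and
file name below are made up for illustration and are not taken from your
package), base R can write and read a gzip-compressed TSV without any
extra dependencies:

```r
# Hypothetical example data standing in for one of the mislabelled ".xls" files.
df <- data.frame(gene = c("A", "B", "C"), score = c(1.5, 2.5, 3.5))

# Write it as a gzip-compressed TSV through a gzfile() connection.
path <- file.path(tempdir(), "example.tsv.gz")
con <- gzfile(path, open = "wt")
write.table(df, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# read.delim() opens the path via file(), which detects gzip compression,
# so no explicit decompression step is needed.
df2 <- read.delim(path)
```

Files that already exist could equally be compressed once on the command
line with gzip and then read the same way.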

However, I wonder if it's necessary to distribute them with the package at
all.  Perhaps I'm missing it, but I don't see any reference to reading
those files in your code, and the contents already appear to be held in
sysdata.rda.  IMO it would be sufficient to document in a README how
sysdata.rda was created (perhaps hosting those files on your own S3 storage
if you have permission to do so) and then remove the raw files from the
package.
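
One way to do that documenting is a short script kept alongside the
README.  The following is only a sketch under assumptions: the object name
lookup_table and the stand-in raw file are hypothetical, and I don't know
how your sysdata.rda was actually built.

```r
# Stand-in for a raw input file; a real script would instead fetch it from
# wherever the raw files end up being hosted.
raw_path <- file.path(tempdir(), "raw_table.tsv")
writeLines(c("id\tvalue", "x\t1", "y\t2"), raw_path)

# Build the internal object from the raw file (hypothetical object name).
lookup_table <- read.delim(raw_path)

# Save it as internal package data.  Inside a package the target would be
# "R/sysdata.rda"; tempdir() is used here so the sketch runs anywhere.
out_path <- file.path(tempdir(), "sysdata.rda")
save(lookup_table, file = out_path, compress = "xz")
```

Such a script usually sits outside the installed package contents, so it
documents the provenance without adding to the package size.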

Best wishes,
Mike

On Thu, 31 Aug 2023 at 20:08, Ali Sajid Imami <ali.sajid.imami at gmail.com>
wrote:

Message-ID: <CAGfgtYDmdobQTXT=WdB4LAkiA4TBVhFsSMdFDrJAtWCwpeFNFw@mail.gmail.com>
In-Reply-To: <CABdg6jOrgH4aywwP-aVKr0T1WSyr5qf4mWJ4a49CTuE3CqoUjA@mail.gmail.com>