[Bioc-devel] Issue removing large data files from git history

2 messages · Murphy, Alan E, Nitesh Turaga

Hi,

I have recently moved my data to an ExperimentHub package and thus need to clean my statistical package's git history. I followed the guides on both the Bioconductor help page and the BFG Repo-Cleaner page:

https://bioconductor.org/developers/how-to/git/remove-large-data/
https://rtyley.github.io/bfg-repo-cleaner/

These appeared to work: I set the size limit to 5MB and pushed the changes. Now, when I rerun the clean-up, I get the following message:


Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?

So it appears that no file over 5MB remains, but when I run BiocCheck() locally the warning persists:

WARNING: The following files are over 5MB in size:      '.git/objects/pack/pack-bb385fcbe4bf53b933abe467a62cd836461ac322.pack'

Does anyone have experience dealing with an issue like this? The repository is in the following location if anyone is curious:

https://github.com/NathanSkene/EWCE

Kind regards,
Alan.


Alan Murphy
Bioinformatician
Neurogenomics lab
UK Dementia Research Institute
Imperial College London
2 days later
Hi,

This suggests that BFG did not remove all of the large files from the history. You can trace the objects stored in the .pack file back to the commits in your git log that introduced them.

A post such as this GitHub issue, https://github.com/18F/C2/issues/439, shows how to find which files are stored in the .pack file you mention.
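
As a rough sketch of that approach (not taken from the linked post verbatim), the Python script below combines the output of `git verify-pack -v` with `git rev-list --objects --all` to list the large blobs in a pack and the paths they were committed under. The function names, the script name and the 5 MiB default threshold are my own assumptions for illustration; adjust the threshold to whatever limit you are checking against.

```python
import subprocess
import sys


def parse_verify_pack(text):
    """Return (sha, size_in_bytes) for each blob line of `git verify-pack -v` output."""
    blobs = []
    for line in text.splitlines():
        parts = line.split()
        # Object lines look like: <sha> <type> <size> <size-in-pack> <offset> [...]
        if len(parts) >= 5 and parts[1] == "blob" and parts[2].isdigit():
            blobs.append((parts[0], int(parts[2])))
    return blobs


def parse_rev_list(text):
    """Map object sha -> path from `git rev-list --objects --all` output."""
    paths = {}
    for line in text.splitlines():
        sha, _, path = line.partition(" ")
        if path:  # commit lines carry no path
            paths.setdefault(sha, path)
    return paths


def largest_blobs(verify_text, rev_list_text, min_bytes=5 * 1024 * 1024):
    """Return (size, sha, path) for blobs of at least min_bytes, largest first."""
    paths = parse_rev_list(rev_list_text)
    big = [
        (size, sha, paths.get(sha, "<unreachable>"))
        for sha, size in parse_verify_pack(verify_text)
        if size >= min_bytes  # assumed threshold, see the note above
    ]
    return sorted(big, reverse=True)


def run_git(*args):
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    idx_path = sys.argv[1]  # path to the pack's .idx file
    rows = largest_blobs(
        run_git("verify-pack", "-v", idx_path),
        run_git("rev-list", "--objects", "--all"),
    )
    for size, sha, path in rows:
        print(f"{size / 1024 / 1024:8.1f} MB  {sha}  {path}")
```

Run from the root of the repository, for example as `python find_big_blobs.py .git/objects/pack/pack-bb385fcbe4bf53b933abe467a62cd836461ac322.idx` (the script name is hypothetical). Note that this only reports individual blobs: a pack can exceed the limit as a whole even when no single blob does, and a blob shown as `<unreachable>` is one that no branch or tag refers to any more.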

Best,

Nitesh 

On 2/24/21, 12:26 PM, "Bioc-devel on behalf of Murphy, Alan E" <bioc-devel-bounces at r-project.org on behalf of a.murphy at imperial.ac.uk> wrote:
