[Bioc-devel] Including large files for the package
Hi,

The file is downloaded from the iLINCS website and then filtered. Unfortunately, it is not readily available from ilincs.org itself. We do, however, have the capability to store the file in our S3 buckets.

The metadata under discussion describes each individual signature stored in iLINCS, including the cell lines, the time points and the dosages. From what I understand, this makes it better suited to ExperimentHub, since it is processed.

Regards,
Dr. Ali Sajid Imami
LinkedIn <https://pk.linkedin.com/pub/ali-sajid-imami/50/956/2a6>

On Thu, Aug 31, 2023 at 12:02 PM Kern, Lori via Bioc-devel <bioc-devel at r-project.org> wrote:

Hello,

Regarding Hub use: what sort of information does the metadata contain? That would determine whether ExperimentHub or AnnotationHub is more appropriate.

Is the file accessed directly from the http://ilincs.org/ portal with a URL link, or is there processing/filtering that occurs? The hubs can access data stored on other websites/hosts as long as they are trusted sites (ilincs would fall in this category) and the data can be accessed directly with a URL link.

The way the hubs work is that the data is stored elsewhere: either accessed directly from the original site, or kept on some hosting server (S3, Azure, etc.) if it is processed. The data would no longer be included directly in the package; it would instead be downloaded through the hub interface when needed (and cached in the backend so the download does not happen every time).

Lori Shepherd - Kern
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

________________________________
From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of Vincent Carey <stvjc at channing.harvard.edu>
Sent: Thursday, August 31, 2023 8:29 AM
To: Martin Grigorov <martin.grigorov at gmail.com>
Cc: bioc-devel at r-project.org <bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] Including large files for the package

On Thu, Aug 31, 2023 at 7:28 AM Martin Grigorov <martin.grigorov at gmail.com> wrote:

Hello,

Perhaps you could use https://bioconductor.r-universe.dev/BiocFileCache to download the big file on demand. The benefit is that the file would be stored in ~/.cache/R/yourPackage/ (for Linux; something similar for Windows/Mac) and reused between sessions.

Thanks Martin. I think that is a possible approach, but the proposals at http://contributions.bioconductor.org/non-software.html?q=AnnotationHub#annotationexperiment-hub-packages should also be considered.

Ali, if the documentation regarding *Hub contributions is unclear, please file an issue or write back here with the difficulties so that we can improve the material and the methods! Thanks!

Regards,
Martin

On Tue, Aug 29, 2023 at 5:15 AM Ali Sajid Imami <ali.sajid.imami at gmail.com> wrote:

Hi Bioconductor Team,

I am a PhD Candidate in the Cognitive Disorders Research lab at the University of Toledo. I am responsible for a number of R packages, and our intention is to submit them to Bioconductor over the next several months.

I have just submitted a package, drugfindR (https://github.com/CogDisResLab/drugfindR). The submission was immediately closed because my repo had a single file over the 5MB limit. I wanted to ask whether you would reconsider or make an exception, or otherwise guide me in the right direction.

This package serves as a way to quickly search through the LINCS data stored at the ilincs.org portal. The file in question is one of three metadata files that allow the package to function efficiently without having to make expensive network requests. It would really be helpful if we could include the file as is. I do not expect more files like that to be added to the package at all.

Barring that, I have seen the suggestion of using AnnotationHub or ExperimentHub. While I have gone through the documentation, I'm not entirely sure how those services work. Are those services where we can store the data itself, or are we expected to host the data elsewhere and create lightweight "pointer" packages? Similarly, I'm not entirely sure which Hub this would go to.

Any advice or guidance will be appreciated.

Regards,
Dr. Ali Sajid Imami
LinkedIn <https://pk.linkedin.com/pub/ali-sajid-imami/50/956/2a6>

[[alternative HTML version deleted]]

_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel