[Bioc-devel] [EXTERNAL] Deeptarget package

I agree that it is important to only redistribute data in a way that is consistent with whatever terms are encountered when accessing the data.

Some options I've seen include

(1) using an API that allows authentication (the httr2 'auth' vignette provides workflows; I took the time recently to figure out how to authenticate against Globus for access to restricted data, https://mtmorgan.github.io/software/2024/06/20/globus-oauth.html);

(2) point the user to the data source and provide functions etc. to incorporate data that the user has downloaded into your workflow (e.g., 'use `browseURL(?)` to access and download the data, then use `import_my_data()` to continue with the workflow');

(3) require that the user acknowledge the license of the data before retrieving it (e.g., AlphaMissenseR asks the user to accept licensing using `readline()`);

(4) contact the data owner and ask for a written exception to access restrictions (I believe this was the case for the msigdb package);

(5) rely on 'advertised' data methods that are less encumbered by licensing requirements (I believe KEGGREST takes this approach?);

(6) find a different data resource providing unrestricted access to similar data.
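
Options (2) and (3) can be combined in a few lines of R. The following is only a minimal sketch under stated assumptions, not how AlphaMissenseR or any other package actually does it: `accept_license()` and `get_data()` are hypothetical helpers, `import_my_data()` is the placeholder name used above, the data are assumed to be a CSV file the user has already downloaded, and the `answer` argument exists only so the prompt can be bypassed in scripts and tests.

```r
# Hypothetical helper: ask the user to accept the data license (option 3).
# 'answer' lets non-interactive callers supply the response explicitly.
accept_license <- function(license_url, answer = NULL) {
    if (is.null(answer)) {
        if (!interactive())
            stop("license must be accepted interactively; see ", license_url)
        prompt <- sprintf(
            "Data are subject to the terms at %s. Accept? [yes/no]: ",
            license_url
        )
        answer <- readline(prompt)
    }
    tolower(trimws(answer)) %in% c("y", "yes")
}

# Hypothetical importer for a file the user downloaded themselves (option 2).
# Assumes a CSV file; a real package would validate the contents as well.
import_my_data <- function(path) {
    if (!file.exists(path))
        stop("file not found; download the data from the original source first")
    read.csv(path)
}

# Only import the data once the license has been accepted.
get_data <- function(path, license_url, answer = NULL) {
    if (!accept_license(license_url, answer = answer))
        stop("license not accepted; data not imported")
    import_my_data(path)
}
```

With this design the package never redistributes the data itself: the user obtains the file under the original terms, and the package records an explicit acceptance before touching it.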

It is equally important to communicate licensing restrictions to the user. This often seems challenging. For instance, KEGGREST mentions restrictions to academic use in the Description field (good), but the sole purpose of the package is to provide this data resource, so its 'Artistic-2.0' license is appropriate for the code but somehow not for the package as a whole. This matters because R can be configured to exclude certain types of packages from discovery / installation via `available.packages(filters = ?)` based on the License field of the DESCRIPTION file. A cursory look suggests msigdb does not mention in a prominent location that data redistribution has been allowed by the data owner. It also seems appropriate to acknowledge the data source in the Authors@R: field of the DESCRIPTION file, with roles documented on the help page `?person` (e.g., 'dtc' data contributor or 'cph' copyright holder).
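
Two points from this paragraph can be sketched in R. The first part shows a user-defined filter of the kind accepted by the `filters` argument of `available.packages()`, applied here to a small made-up matrix so that it runs offline; `keep_allowed_licenses()`, the package names, and the allow-list are all hypothetical. The second part shows how a data source might be credited with `person()`; the names and email address are placeholders.

```r
# Hypothetical filter: keep only packages whose License field is on an
# allow-list. Filter functions receive the package matrix and return a subset.
keep_allowed_licenses <- function(db, allowed = c("Artistic-2.0", "GPL-3")) {
    db[db[, "License"] %in% allowed, , drop = FALSE]
}

# Made-up stand-in for the matrix returned by available.packages().
db <- matrix(
    c("pkgA", "Artistic-2.0",
      "pkgB", "file LICENSE",
      "pkgC", "GPL-3"),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, c("Package", "License"))
)
kept <- keep_allowed_licenses(db)
print(kept[, "Package"])

# With network access, the function could be passed to available.packages(),
# possibly alongside built-in filters such as "license/FOSS" (not run here):
# available.packages(filters = list(add = TRUE, keep_allowed_licenses))

# Crediting a data source in Authors@R with the 'dtc' and 'cph' roles
# (placeholder names; see ?person for the full list of role codes).
authors <- c(
    person("Jane", "Developer", email = "jane@example.org",
           role = c("aut", "cre")),
    person("Example Data Consortium", role = c("dtc", "cph"))
)
print(authors)
```

A filter like this only sees what the License field says, which is why a code-only license on a data-access package can hide the restrictions that actually apply.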

It would be interesting / challenging to identify packages that do an excellent job of this.

Generally, I believe we should be advocating for open data resources, maybe using https://docs.nih-cfde.org/en/latest/the-fair-cookbook/content/recipes/Compliance/fairshake/ or other resources for assessing openness.

Martin


From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of Kristian Ullrich <ullrich at evolbio.mpg.de>
Date: Tuesday, June 25, 2024 at 1:45 AM
To: Nguyen, Trinh (NIH/NCI) [C] <tinh.nguyen at nih.gov>
Cc: bioc-devel at r-project.org <bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] [EXTERNAL] Deeptarget package
Dear Nguyen,

Thank you for your response.

I am just a bit puzzled, and no other developer has answered so far.

For example, the description of the depmap package says:

"This data is distributed under the Creative Commons license (Attribution 4.0 International (CC BY 4.0))."

However, if you visit the original source database, it comes with Terms and Conditions that one needs to agree to, and these state specifically:

"Governing Law
The terms and conditions herein shall be construed, governed, interpreted, and applied in accordance with the internal laws of the Commonwealth of Massachusetts, U.S.A. Furthermore, by accessing, downloading, or using the Database, You consent to the personal jurisdiction of, and venue in, the state and federal courts within Massachusetts with respect to Your download or use of the Database."

So when it comes to the law, it does not matter that you use only a subset of the original data or that you get it from another source, since the statement is very clear.

I do not see any kind of agreement in the depmap package, nor a "warning" statement like "by using this package you agree to the following terms ..."

So how should one deal correctly (legally and code-wise) with this situation?

Best regards

--
Kristian Ullrich, Ph.D.
Max Planck Institute
For Evolutionary Biology

Scientific IT group
Department of Evolutionary Genetics
August Thienemann Str. 2
24306 Plön
Germany
+49 4522 763 313
ullrich at evolbio.mpg.de

„Ich weiß, allen tut's leid. Jeder muss gucken, wo er bleibt. Dein Lohn, so gut wie nichts. Nichts, was du tust, fällt ins Gewicht.“ (Die traurigen Hummer; Moritz Krämer)
