[R-pkg-devel] Logical Inconsistency of check regarding URL?

4 messages · Dr. habil. Michael Thrun, Jeff Newmiller, Ivan Krylov +1 more

#
Dear All,
I received the following message from Uwe after uploading an update of my package "DatabionicSwarm":
"Found the following (possibly) invalid URLs:
  URL: https://www.deepbionics.org (moved to https://mthrun.github.io/index)
    From: DESCRIPTION
    Status: 301
    Message: Moved Permanently
Please change http --> https, add trailing slashes, or follow moved content as appropriate.
"
I then asked:
"
Dear Uwe,
your request states to either change http --> https, add trailing slashes, or follow moved content as appropriate.
As it is not appropriate to follow the content, I select the first choice. The URL is
"https://www.deepbionics.org/"
which meets both conditions of being "https" and
having one "/" at the end.
What is the problem? Please elaborate.
You already accepted another package, GeneralizedUmatrix, today with exactly the same
URL. I really don't understand it. Please be so kind
as to elaborate."

As an answer I got:
"You are abusing the system! Again: <followed by the same message as above>"

Hence, I have several questions.
First, do we not communicate with CRAN anymore through the submission procedure of the package? If not, which is the correct line of communication in such a case?

Second, are the answers that we get now fully automatically generated? It would be strange for me to believe that Uwe would provide such an answer to my polite request.

Third, why can I have a CRAN package "DataVisualizations" with this URL online, another one named "GeneralizedUmatrix" uploaded the same day with the same URL, which both are OK, but the URL in "DatabionicSwarm" is not?

Fourth, can't we have clearer feedback messages?
I mean, having the URL "https://www.deepbionics.org/" in the DESCRIPTION and getting the feedback "http --> https, add trailing slashes or ..." does not make any sense. Also, could someone please explain why a "/" at the end of a URL is necessary? What is the technical background to this?

Fifth, why do we need https/TLS/SSL? I have to pay a monthly fee for a certificate to apply this to my website so that CRAN accepts my URL - and as far as I can tell, it makes things only more complex but not more secure (e.g., https://www.elektronik-kompendium.de/sites/net/1906041.htm). Or in other words, it seems to me that we are expected to pay to follow the guideline of having a certificate instead of making better code with fewer bugs. I am no security expert, but my baseline in computer science is always if a tool is more complex, then the chances are lower that it works as intended, and the possibilities are higher that it has unintended and potentially risky side effects.    

Best Regards

Michael
#
Educating package authors about the semantics of URLs is not really something the CRAN maintainers should have to do. A slash at the end of a URL implies different URL construction for relative URLs based on the original one. [1]
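
As a rough illustration of the relative-URL point above, here is a minimal Python sketch using the standard library's `urllib.parse.urljoin`. The host `example.org` and the paths are placeholders chosen for the example, not URLs from this thread.

```python
from urllib.parse import urljoin

# Hypothetical base URLs; they differ only in the trailing slash.
without_slash = "https://example.org/docs"
with_slash = "https://example.org/docs/"

# Without the slash, "docs" is treated like a file name, so a relative
# reference replaces it. With the slash, "docs/" is treated like a
# directory and the relative reference is resolved inside it.
resolved_without = urljoin(without_slash, "page.html")
resolved_with = urljoin(with_slash, "page.html")

print(resolved_without)  # https://example.org/page.html
print(resolved_with)     # https://example.org/docs/page.html
```

The same relative link therefore points to two different resources depending on whether the base URL ends in a slash, which is one reason servers often redirect from one form to the other.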

I am not sure why you think https is no more secure than http... that is exactly why it exists. Again, not the job of CRAN maintainers to explain this to you.

It is not appropriate to rely on URL redirection (from deepbionics to github) in a package... redirections are extra work and hide the true url from the user anyway.

As for having the same URL accepted in one package but rejected in another... well, CRAN moderators are not perfect. They try to automate identification of issues that have previously caused problems, but maybe they don't catch everything every time.

Yes, making software that doesn't break in odd situations is hard. Thank you for making an effort, and thank CRAN for remembering all of these mysterious problems and warning you to fix things that might only break in situations you haven't directly encountered before they puzzle new R users.

[1] https://stackoverflow.com/questions/5948659/when-should-i-use-a-trailing-slash-in-my-url
On November 28, 2022 11:19:40 PM PST, "Dr. habil. Michael Thrun" <m.thrun at gmx.net> wrote:

  
    
#
Dear Michael,

On Tue, 29 Nov 2022 08:19:40 +0100
"Dr. habil. Michael Thrun" <m.thrun at gmx.net> wrote:

            
The "HTTPS and trailing slashes" part is a red herring. The idea is to
only have URLs in your package that return HTTP code 200.
The website https://www.deepbionics.org redirects to
https://mthrun.github.io/index, which is, strictly speaking, against
the letter of the rules [1]. Websites that redirect from http://... to
https://... and from .../website/path to .../website/path/ (and the
other way around) are a common cause of such redirects, which is why Uwe
mentioned it (I think), but this isn't the reason for the redirection at
https://www.deepbionics.org.
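
To make the distinction between the kinds of redirect concrete, here is a minimal Python sketch. It is not CRAN's actual check. The function names `probe` and `verdict` and the wording of the returned strings are assumptions made for illustration. `probe` sends a single HEAD request without following redirects, and `verdict` sorts the result into the cases described above.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def probe(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None). Needs network access.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def verdict(url, status, location=None):
    """Summarise a probe result (hypothetical wording, for illustration)."""
    if status == 200:
        return "ok"
    if status in (301, 308) and location:
        # A Location header may be relative; resolve it against the request URL.
        location = urljoin(url, location)
        old, new = urlsplit(url), urlsplit(location)
        same_target = (old.netloc, old.path.rstrip("/")) == (
            new.netloc,
            new.path.rstrip("/"),
        )
        if same_target and old.scheme == "http" and new.scheme == "https":
            return "permanent redirect: change http to https"
        if same_target and old.scheme == new.scheme:
            return "permanent redirect: trailing slash differs"
        return "permanent redirect: content moved to " + location
    return "status %d" % status
```

With the values quoted in the check message (status 301, moved to https://mthrun.github.io/index), `verdict("https://www.deepbionics.org", 301, "https://mthrun.github.io/index")` falls into the "content moved" case, not the http/https or trailing-slash cases. A live check would be `verdict(url, *probe(url))`, which requires network access.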

I think you could make the argument that https://www.deepbionics.org is
the canonical URL for the website and the way it _currently_ works (by
returning a 301 redirect to https://mthrun.github.io/index) is an
implementation detail that should be ignored, but I don't know whether
CRAN reviewers would agree. I think it should be possible to set up
your domain and GitHub Pages to serve mthrun.github.io at the address
www.deepbionics.org without a redirection [2], but I've never tried it
myself.

There was a case once when the reviewer was mistaken (they were in the
process of heroically clearing out the "newbies" queue that almost
reached 80 packages, aged 10 days and more, all by themselves, so a
certain amount of fatigue was to be expected) and I was able to argue
my way out of a rejection by replying to the reviewer. I think that the
way to go is to either submit a package with requested changes and an
incremented version or to reply-to-all and argue the case for the
package as it is now.

Has anything changed recently regarding the way your domain is set up?
It really is strange that the check passed for one of the packages but
not the other.

I think you're right (see also: depriving an existing website of its
certificate as a means of censorship), but the browser makers may end
up destroying TLS-less workflow for us in a few years. Thankfully, it's
not a requirement of CRAN to have only HTTPS links. I probably
shouldn't continue this topic because the question of "how the Web
should function" tends to result in pointlessly heated debates.
#
Dear Michael,

I second Ivan's suggestion to use a custom domain on your GitHub Pages 
website and I also want to add that this solves your certificate issue 
as a nice side-effect.

GitHub will automatically create a certificate for you, for free, via 
Let's Encrypt:

https://github.blog/2018-05-01-github-pages-custom-domains-https/

Best,

Hugo
On 29/11/2022 08:55, Ivan Krylov wrote: