I put a URL for MalaCards (a database) in my DESCRIPTION, because my package needs the data in this database. However, R CMD check reports the following error:
Found the following (possibly) invalid URLs:
URL: http://www.malacards.org/
From: DESCRIPTION
Status: 403
Message: Forbidden
I don't know why. The URL was taken from the article describing this database, and I can open it in my browser.
This is the first time I have developed a package, so I would be grateful for your help.
Thanks.
jared_wood
jared_wood at 163.com
[R-pkg-devel] An invalid URL
5 messages · jared_wood, Richard M. Heiberger, Ivan Krylov +1 more
Is it perhaps an https:// address? Your browser will make the adjustment automatically; CRAN will give this message.
On Thu, Mar 12, 2020 at 10:08 PM jared_wood <jared_wood at 163.com> wrote:
Found the following (possibly) invalid URLs:
URL: http://www.malacards.org/
From: DESCRIPTION
Status: 403
Message: Forbidden
______________________________________________ R-package-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
On Fri, 13 Mar 2020 09:49:03 +0800 (GMT+08:00)
jared_wood <jared_wood at 163.com> wrote:
Status: 403
Message: Forbidden
I don't know why. The URL was taken from the article describing this
database, and I can open it in my browser.
To be fair, my requests to this service are also blocked unless I use Tor Browser. This could be an aggressive case of GeoIP banning. (Blocking a well-known university but not blocking Tor exit nodes? Talk about security theatre.) Another reason could be that R CMD check uses libcurl with its default User-Agent: libcurl/A.BB.C to check URLs, and the remote server could deny requests from such automated user agents, only allowing clients that look like browsers. I have no data to confirm this hypothesis.
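Ivan's User-Agent hypothesis can be probed from the command line. A minimal sketch, assuming curl is installed (the browser User-Agent string below is just an example, not the one R uses):

```shell
# Request with curl's default User-Agent (similar to what R CMD check sends):
curl -sS -o /dev/null -w 'default UA:  %{http_code}\n' -I http://www.malacards.org/

# Same request, but with a browser-like User-Agent:
curl -sS -o /dev/null -w 'browser UA:  %{http_code}\n' -I \
  -A 'Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0' \
  http://www.malacards.org/
```

If the hypothesis holds, the first request would return 403 and the second 200; identical status codes for both would instead point to IP-based blocking.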
Best regards, Ivan
On Fri, 13 Mar 2020 11:02:06 +0300
Ivan Krylov <krylov.r00t at gmail.com> wrote:
the remote server could deny requests from such automated user agents, only allowing clients that look like browsers
Here is what I have been able to observe: if I wait for some time and then try to access http://www.malacards.org/ using cURL, I start getting 403 errors in both cURL and the browser. If I wait for some time, then visit http://www.malacards.org/ in a browser and click on a few links, subsequent access using cURL from the same IP address also starts working (for a while). Given the paranoid nature of the website's security system, it's hard to offer a good solution to your problem: linking to it may place people running R CMD check under a temporary ban, while not linking to it does not seem polite.
Best regards, Ivan
3 days later
On 13.03.2020 12:59, Ivan Krylov wrote:
On Fri, 13 Mar 2020 11:02:06 +0300 Ivan Krylov <krylov.r00t at gmail.com> wrote:
the remote server could deny requests from such automated user agents, only allowing clients that look like browsers
linking to it may place people running R CMD check under a temporary ban, while not linking to it does not seem polite.
Well, then mention the URL in plain text, but do not make it a link.... Best, Uwe Ligges
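For what it's worth, Uwe's suggestion might look like this in a DESCRIPTION file (a hypothetical fragment; the Description text is invented). Writing the address without the http:// scheme means the URL checker in R CMD check should not pick it up:

```
Description: Retrieves disease-gene annotations. The data come from the
    MalaCards database (available at www.malacards.org; the address is
    deliberately not written as a full URL, since the server returns
    403 Forbidden to automated checkers).
```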