Issues with libcurl + HTTP status codes (e.g. 403, 404)
In fact, this does reproduce on R-devel:
> options(download.file.method = "libcurl")
> options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
+ "http://www.stats.ox.ac.uk/pub/RWin"))
> install.packages("lattice") ## could be any package
Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
(as ‘lib’ is unspecified)
Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
> sessionInfo()
R Under development (unstable) (2015-08-14 r69078)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)
I think this could be problematic for users with custom CRAN
repositories. For example, if I have a CRAN repository that only
serves source packages (no binary packages), this implies that any R
session configured to download binary packages would fail to download
any packages at all (as it would barf on attempting to read the
non-existent PACKAGES file for the 'binary' branch of the custom
repository).
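To make the concern concrete, here is a small sketch of the index directories R would consult for such a repository. The repository URL is a placeholder (an assumption, not a real server); `contrib.url()` is the utils function that maps a repository root to its source or binary contrib directory.

```r
# Hypothetical source-only repository (placeholder URL, not a real server).
repo <- "https://example.org/cran"

# Directory that holds the source package index (PACKAGES).
src_dir <- contrib.url(repo, type = "source")

# Directory R would also query when binary packages are requested.
# .Platform$pkgType is "source" on platforms without binary packages.
bin_dir <- contrib.url(repo, type = .Platform$pkgType)

print(src_dir)
print(bin_dir)
```

On a source-only server the second directory would not exist, so a request for its PACKAGES file would presumably return an error page rather than an index.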
This can also be seen by attempting to install a package using current
R-devel (since no binaries are made available for R 3.3):
> options(download.file.method = "libcurl")
> options(repos = c(CRAN = "https://cran.rstudio.com/"))
> print(getOption("pkgType"))
[1] "both"
> install.packages("lattice")
Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
(as ‘lib’ is unspecified)
Error in install.packages : Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
The same error (with a different, XML response) is returned when using
e.g. `https://cran.fhcrc.org`.
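One way to guard against this on the user side is to check the HTTP status of the index before trying to parse it. The following is only a sketch under stated assumptions: `index_reachable()` and `fake_403()` are hypothetical helper names, not part of R, and the check relies on base R's `curlGetHeaders()`, which reports the status code in the "status" attribute of its result.

```r
# Return TRUE when the index URL answers with a 2xx status.
# `get_headers` defaults to base R's curlGetHeaders(); it is a parameter
# only so that the check can be exercised without network access.
index_reachable <- function(url, get_headers = curlGetHeaders) {
  status <- tryCatch(
    attr(get_headers(url), "status"),
    error = function(e) NA_integer_   # e.g. no network, unknown host
  )
  length(status) == 1L && !is.na(status) && status >= 200L && status < 300L
}

# Offline demonstration with a stand-in for curlGetHeaders() (hypothetical
# stub) that mimics a 403 response.
fake_403 <- function(url) structure("HTTP/1.1 403 Forbidden", status = 403L)
index_reachable("http://www.stats.ox.ac.uk/pub/RWin/PACKAGES",
                get_headers = fake_403)
```

Dropping the `get_headers` argument makes the same call query the server for real.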
Kevin
On Tue, Aug 25, 2015 at 1:33 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
On 08/25/2015 01:30 PM, Kevin Ushey wrote:
> Hi Martin,
> Indeed it does (and I should have confirmed myself with R-patched and
> R-devel before posting...)
actually I don't know that it does -- it addresses the symptom but I
think there should be an error from libcurl on the 403 / 404 rather than
from read.dcf on the error page...
Martin
> Thanks, and sorry for the noise.
> Kevin
On Tue, Aug 25, 2015, 13:11 Martin Morgan <mtmorgan at fredhutch.org> wrote:
On 08/25/2015 12:54 PM, Kevin Ushey wrote:
> Hi all,
>
> The following fails for me (on OS X, although I imagine it's the
> same on other platforms using libcurl):
>
> options(download.file.method = "libcurl")
> options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
> "http://www.stats.ox.ac.uk/pub/RWin"))
> install.packages("lattice") ## could be any package
>
> gives me:
>
> > options(download.file.method = "libcurl")
> > options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
> + "http://www.stats.ox.ac.uk/pub/RWin"))
> > install.packages("lattice") ## could be any package
> Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
> (as ‘lib’ is unspecified)
> Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>
> This seems to come from a call to `available.packages()` to a URL
> that doesn't exist on the server (likely when querying PACKAGES on the
> CRANextra repo)
>
> Eg.
>
> > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
> > available.packages(URL, method = "internal")
> Warning: unable to access index for repository
> http://www.stats.ox.ac.uk/pub/RWin
> Package Version Priority Depends Imports LinkingTo Suggests
> Enhances License License_is_FOSS
> License_restricts_use OS_type Archs MD5sum NeedsCompilation
> File Repository
> > available.packages(URL, method = "libcurl")
> Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>
> It looks like libcurl downloads and retrieves the 403 page itself,
> rather than reporting that it was actually forbidden, e.g.:
>
> >
> tempfile(), method = "libcurl")
> trying URL
> Content type 'text/html; charset=iso-8859-1' length 339 bytes
> ==================================================
> downloaded 339 bytes
>
> Using `method = "internal"` gives an error related to the inability
> to access that URL due to the HTTP status 403.
>
> The overarching issue here is that package installation shouldn't
> fail even if libcurl fails to access one of the repositories set.
>
With
> R.version.string
[1] "R version 3.2.2 Patched (2015-08-25 r69179)"
the behavior is to warn with an indication of the repository for which
the problem occurs
> URL <- "http://www.stats.ox.ac.uk/pub/RWin"
> available.packages(URL, method="libcurl")
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin:
Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
Package Version Priority Depends Imports LinkingTo Suggests Enhances
License License_is_FOSS License_restricts_use OS_type Archs MD5sum
NeedsCompilation File Repository
> available.packages(URL, method="internal")
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin:
cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES'
Package Version Priority Depends Imports LinkingTo Suggests Enhances
License License_is_FOSS License_restricts_use OS_type Archs MD5sum
NeedsCompilation File Repository
Does that work for you / address the problem?
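The warn-rather-than-fail behavior shown above can be approximated in user code as follows. This is a minimal sketch, not the actual R source: `read_index_safely()` is a hypothetical helper, and the HTML written to the temporary file is made-up content standing in for a saved 403 page.

```r
# Parse a downloaded PACKAGES file, warning instead of failing when the
# file is not valid DCF (for example, an HTML error page).
read_index_safely <- function(path, repo) {
  tryCatch(
    read.dcf(path),
    error = function(e) {
      warning("unable to access index for repository ", repo, ":\n  ",
              conditionMessage(e), call. = FALSE)
      NULL   # caller can skip this repository and carry on
    }
  )
}

# Simulate an error page saved in place of PACKAGES (made-up content).
tmp <- tempfile()
writeLines(c('<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">',
             "<html><head><title>403 Forbidden</title></head></html>"),
           tmp)

res <- suppressWarnings(
  read_index_safely(tmp, "http://www.stats.ox.ac.uk/pub/RWin")
)
is.null(res)
```

Returning `NULL` lets a caller drop the failing repository and continue with the remaining ones.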
Martin
> > sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.10.4 (Yosemite)
>
> locale:
> [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] testthat_0.8.1.0.99 knitr_1.11 devtools_1.5.0.9001
> [4] BiocInstaller_1.15.5
>
> loaded via a namespace (and not attached):
> [1] httr_1.0.0 R6_2.0.0.9000 tools_3.2.2 parallel_3.2.2 whisker_0.3-2
> [6] RCurl_1.95-4.1 memoise_0.2.1 stringr_0.6.2 digest_0.6.4 evaluate_0.7.2
>
> Thanks,
> Kevin
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793