HTTP User-Agent header
OK, that suggests that setting this at the options level would solve both of your problems, and that seems like the best approach. I don't really want to pass this around as a parameter through the maze of functions that might actually download something if we don't have to. I think we can provide something early next week on R-devel for folks to test. But I suspect, as Henrik also does, that the set of sites that will refuse us with a User-Agent header will be much larger than the set James has found that refuse us without it.
best wishes
Robert
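To make the options-level idea concrete, here is a minimal sketch in R. It assumes the option is called HTTPUserAgent, which is the name later R releases use; at the time of this thread nothing had been committed, so treat the name, the format of the agent string, and the placeholder URL as assumptions.

```r
# Sketch: set the User-Agent once, at the options level, instead of
# threading an argument through every function that might download.
# ASSUMPTION: the option name HTTPUserAgent (used by later R releases).
default_agent <- sprintf("R (%s %s)",
                         paste(R.version$major, R.version$minor, sep = "."),
                         R.version$platform)

old <- options(HTTPUserAgent = default_agent)  # returns the previous value
agent_in_effect <- getOption("HTTPUserAgent")

# Any download code that consults the option would pick this up with no
# extra argument, e.g. (not run: needs network, URL is a placeholder):
# download.file("http://example.org/file.txt", destfile = tempfile())

options(old)  # restore whatever was set before
```

The point of the design is that callers such as install.packages need no new parameter; only the code that actually opens the HTTP connection has to read the option.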
Henrik Bengtsson wrote:
On 7/28/06, Robert Gentleman <rgentlem at fhcrc.org> wrote:
I wonder if it would not be better to make the user agent string something that is configurable at the time R is built rather than at run time. This would make Seth's patch about 1% as long. Or this could be handled as an option. The patches are pretty extensive and allow for setting the agent header by setting parameters in function calls (e.g. download.file). I am not sure there is a good use case for that level of flexibility, and the additional code is substantial. The issue that I think arises is that there are potentially other systems that will be unhappy with R's identification of itself, and so some users may also need to turn it off. Any strong opinions?
Actually two:
1) If you wish to pull down (read: extract from HTML or similar) live data from the web, you might want to be able to "imitate" a certain browser. For instance, if you tell some webserver you're a simple "mobile phone" or "lynx", you might be able to get back very clean data. Some servers might also block unknown web browsers.
2) If the webserver of a package repository decided to make use of the user-agent string to decide what version of the repository it should deliver, I would like to be able to trick the server. Why? Many times I have found myself working on a system where I do not have the rights to update to the latest or the developer version of R. However, even without the very latest version of R, you can still get work done. For instance, in Bioconductor, biocLite() & co give you either the stable or the developer version of Bioconductor depending on your R version, but looking into the biocLite() code and beyond, you find that you actually can install a Bioconductor v1.9 package in R v2.3.1. It can be risky business, but if you know what you're doing, it can save your day (or week).
Cheers
Henrik
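Both use cases above amount to overriding the agent string for a single request and then putting it back. A rough sketch in R follows; with_user_agent is a hypothetical helper, the option name HTTPUserAgent is an assumption (it is the name later R releases use), and the Lynx string is only an illustrative value.

```r
# Hypothetical helper: evaluate an expression with a temporary User-Agent.
# ASSUMPTION: downloads consult getOption("HTTPUserAgent").
with_user_agent <- function(agent, expr) {
  old <- options(HTTPUserAgent = agent)
  on.exit(options(old), add = TRUE)  # restore even if expr fails
  force(expr)                        # evaluated while the override is active
}

before <- getOption("HTTPUserAgent")

# Stand-in for a real download: just report the agent that would be sent.
seen <- with_user_agent("Lynx/2.8.5rel.1", getOption("HTTPUserAgent"))

after <- getOption("HTTPUserAgent")  # same as before the call
```

In real use the second argument would be a call such as download.file(...); because R evaluates arguments lazily, that call runs only once the override is in place.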
James P. Howard, II wrote:
On 7/28/06, Seth Falcon <sfalcon at fhcrc.org> wrote:
I have a rough draft patch, see below, that adds a User-Agent header to HTTP requests made in R via download.file. If there is interest, I will polish it.
It looks right, but I am running under Windows without a compiler.
-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 rgentlem at fhcrc.org
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel