Hi David,
----- Original Message -----
From: "David Smith" <davidsmi at microsoft.com>
To: "Dan Tenenbaum" <dtenenba at fredhutch.org>, "Uwe Ligges"
<ligges at statistik.tu-dortmund.de>, "Elliot Waingold"
<Elliot.Waingold at microsoft.com>
Cc: "R-devel at r-project.org" <r-devel at r-project.org>
Sent: Wednesday, August 12, 2015 12:42:39 PM
Subject: RE: [Rd] download.file() on ftp URL fails in windows with
default download method
We were also able to reproduce the issue on Windows Server 2012. If
there's anything we can do to help please let me know; Elliot
Waingold (CC'd here) can provide access to the VM we used for
testing if that's of any help.
Thanks!
I have just been looking at this issue with Martin Morgan. We found
that if we "or" in the additional flag INTERNET_FLAG_PASSIVE on line
1012 of src/modules/internet/internet.c (R-3.2 branch, last changed
in r68393), the ftp connection works.
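Further investigation reveals that in an active ftp connection (the
mode used when that flag is not set), certain ports on the client
need to be open, because the server connects back to the client for
the data transfer.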
This machine is in the Amazon cloud so it was easy to open the ports.
But we still have a problem and I believe it's that the wrong IP
address is being sent to the server (on an AWS machine, the machine
thinks of itself as having one IP address, but that is a private
address that is valid inside AWS only).
Here's a curl command line that gets around this by sending the
correct address (or hostname):
curl --ftp-port myhostname.com \
  ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_000001405.13.assembly.txt
Curl normally uses passive mode, which is why it works; the
--ftp-port switch tells it to use active mode with the specified IP
address or hostname.
So I'm not sure where we go from here. One easy fix is just to add
the INTERNET_FLAG_PASSIVE flag as described above. Another would be
to first check if active mode works, and if not, use passive mode.
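The second option can be illustrated at the R level rather than inside internet.c. What follows is only a sketch under assumptions: ftp_try_modes and its downloader argument are hypothetical names, it relies on the curl command-line tool being on the PATH, and it passes curl's --ftp-port and --ftp-pasv switches through the extra argument of download.file(). It is not the change being proposed for R itself.

```r
# Hypothetical helper (not part of R): try active FTP first, then passive.
# Each entry of `modes` is a set of extra arguments handed to the curl binary.
ftp_try_modes <- function(url, destfile,
                          modes = list(active  = c("--ftp-port", "-"),
                                       passive = "--ftp-pasv"),
                          downloader = utils::download.file) {
  for (mode_name in names(modes)) {
    # download.file() returns 0 on success; errors and warnings
    # (e.g. a nonzero curl exit status) are treated as failure.
    status <- tryCatch(
      downloader(url, destfile, method = "curl",
                 extra = modes[[mode_name]], quiet = TRUE),
      error   = function(e) 1L,
      warning = function(w) 1L
    )
    if (isTRUE(status == 0)) return(mode_name)
  }
  stop("neither active nor passive FTP worked for ", url)
}
```

A call such as ftp_try_modes(url, tempfile()) would return "active" or "passive" depending on which attempt succeeded. One drawback of this ordering is that a blocked active attempt may take a while to fail before passive mode is tried.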
Dan
# David Smith
--
David M Smith <davidsmi at microsoft.com>
R Community Lead, Revolution Analytics (a Microsoft company)
Tel: +1 (312) 9205766 (Chicago IL, USA)
Twitter: @revodavid | Blog: http://blog.revolutionanalytics.com
We are hiring engineers for Revolution R and Azure Machine
Learning.
-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
Dan Tenenbaum
Sent: Tuesday, August 11, 2015 09:51
To: Uwe Ligges <ligges at statistik.tu-dortmund.de>
Cc: R-devel at r-project.org
Subject: Re: [Rd] download.file() on ftp URL fails in windows with
default download method
----- Original Message -----
From: "Dan Tenenbaum" <dtenenba at fredhutch.org>
To: "Uwe Ligges" <ligges at statistik.tu-dortmund.de>
Cc: "R-devel at r-project.org" <r-devel at r-project.org>
Sent: Saturday, August 8, 2015 4:02:54 PM
Subject: Re: [Rd] download.file() on ftp URL fails in windows with
default download method
----- Original Message -----
From: "Uwe Ligges" <ligges at statistik.tu-dortmund.de>
To: "Dan Tenenbaum" <dtenenba at fredhutch.org>,
"R-devel at r-project.org" <r-devel at r-project.org>
Sent: Saturday, August 8, 2015 3:57:34 PM
Subject: Re: [Rd] download.file() on ftp URL fails in windows with
default download method
On 08.08.2015 01:11, Dan Tenenbaum wrote:
url <-
"ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_000001405.13.assembly.txt"
download.file(url, tempfile())
trying URL 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_000001405.13.assembly.txt'
Error in download.file(url, tempfile()) :
  cannot open URL 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_000001405.13.assembly.txt'
In addition: Warning message:
In download.file(url, tempfile()) : InternetOpenUrl failed: ''
If I set method="curl" it works fine. This was on R-3.2.2-beta
(sessionInfo() below) but I got the same results in R-3.2.1 and
R-devel.
This does not happen on Windows Server 2008 but it happens on
Windows Server 2012.
Thanks for letting us know. The most recent machine I checked with is
Windows Server 2008 R2 and I have not seen problems there. Can
someone else reproduce this on Windows Server 2012?
If you like I can give you temporary access (via remote desktop) to a
machine in the Amazon cloud.
You can also download a Vagrant box here:
https://atlas.hashicorp.com/boxes/search?utf8=%E2%9C%93&sort=&provider=&q=windows+server+2012
Just wanted to check in about this to see whether anyone else has
been able to reproduce this, or if Uwe has, or if anyone needs help
setting up a test environment either in the cloud or by using a VM
(like with Vagrant). I would be more than happy to help. I can set
up a temporary instance in the cloud that interested parties could
access at no cost.
This issue looks like a showstopper for Bioconductor; we are in the
process of moving our build system, and we were upgrading from
Windows Server 2008 to Windows Server 2012 in the process, but this
issue is going to affect a lot of packages if it is not resolved.
What I can say is that it does not seem like a firewall issue, as the
download works fine if I specify method="curl" (or libcurl) or
paste the url into a browser, and I get the same results whether
Windows Firewall is on or off.
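Until the default method is fixed, a workaround along these lines might be enough for affected code. This is a minimal sketch, not a tested recommendation: download_with_fallback and its downloader argument are hypothetical names, and it assumes the R build supports method="libcurl" (or has a curl binary on the PATH for method="curl").

```r
# Hypothetical helper (not part of R): try the default download method
# first, then fall back to the methods reported to work in this thread.
download_with_fallback <- function(url, destfile,
                                   methods = c("auto", "libcurl", "curl"),
                                   downloader = utils::download.file) {
  for (m in methods) {
    # download.file() returns 0 on success; treat errors and warnings
    # (such as "InternetOpenUrl failed") as a failed attempt.
    status <- tryCatch(
      downloader(url, destfile, method = m, quiet = TRUE),
      error   = function(e) 1L,
      warning = function(w) 1L
    )
    if (isTRUE(status == 0)) return(invisible(m))
  }
  stop("all download methods failed for ", url)
}
```

Calling download_with_fallback(url, tempfile()) returns (invisibly) the name of the first method that succeeded, so a caller can log which one was actually used.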
My naive guess is that the InternetOpenUrl API has changed between
Windows Server 2008 and Windows Server 2012.
The offending call to this API seems to be at
src/modules/internet/internet.c:#908 (in the R-3.2 branch; I did try
R-devel as of r68987 and it still has this problem).
I am really hoping something can be done about this before the
release of R-3.2.2.
Thanks!
Dan
R version 3.2.2 beta (2015-08-05 r68859)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base