I'd add my support for mode = "wb" to (eventually) become the default,
though I respect Tomas's comments about backwards-compatibility.
Instead of making the argument mandatory (which would immediately
break scripts -- even ones that won't be helped by changing to mode =
'wb') or otherwise changing behaviour, perhaps download.file could
start to emit a message (not a warning) whenever the argument is
missing on Windows. The message could say something like "Using `mode
= 'w'`, which will corrupt non-text files. Set `mode = 'wb'` for binary
downloads or see the help page for other options." Emitting a message
has the lightest impact on existing scripts, while alerting new users
to future mistakes.
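
A rough sketch of what that could look like, written as a wrapper so that it can be tried without touching R itself. The names note_default_mode() and download_file_noted() are made up for illustration and are not part of R; the wording of the message is only a suggestion.

```r
# Hypothetical helper: emits the proposed message when 'mode' was not
# supplied on Windows, and reports (invisibly) whether it fired.
note_default_mode <- function(mode_missing, os_type = .Platform$OS.type) {
  fire <- isTRUE(mode_missing) && identical(os_type, "windows")
  if (fire) {
    message("Using `mode = 'w'`, which will corrupt non-text files. ",
            "Set `mode = 'wb'` for binary downloads or see ",
            "?download.file for other options.")
  }
  invisible(fire)
}

# Hypothetical wrapper: the download itself is passed on unchanged.
download_file_noted <- function(url, destfile, mode = "w", ...) {
  note_default_mode(missing(mode))
  utils::download.file(url, destfile, mode = mode, ...)
}
```

Because this uses message() rather than warning(), scripts that promote warnings to errors keep running, and callers who find it noisy can wrap the call in suppressMessages().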
On 7 May 2018 at 18:49, Joris Meys <jorismeys at gmail.com> wrote:
Martin, also from me a heartfelt thank you for taking care of this. Some
thoughts on Henrik's response:
On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
I still argue that the current behavior causes more harm than good.
I agree with your analysis of the problems this legacy behaviour causes.
Deprecating the default mode="w" on Windows can be done in steps, e.g.
by making the argument mandatory for a while. This could be done on
all platforms because we're already all affected, i.e. we need to
specify 'mode' to avoid surprises.
That sounds like a reasonable way to move away from this discrepancy
between operating systems.
What about case-insensitive matching, e.g. data.ZIP and data.Rdata?
Totally agree, and easily solved by, e.g., adding ignore.case = TRUE to the
grep() call.
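
To illustrate the difference, here is a small sketch. The pattern below is an assumption made up for this example; the extension list used in R's own source may differ.

```r
# Illustrative pattern only, not copied from R's source.
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|rda|rds|RData)$"

urls <- c("https://example.org/data.ZIP",
          "https://example.org/data.Rdata",
          "https://example.org/notes.txt")

# Case-sensitive matching misses both binary files.
sensitive <- grepl(binary_ext, urls)

# With ignore.case = TRUE, data.ZIP and data.Rdata are both recognised.
insensitive <- grepl(binary_ext, urls, ignore.case = TRUE)
```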
A quick scan of the R source code suggests that R is also working with
the following filename extensions (using various case styles):
What about all the other file extensions that we know for sure are
If the default isn't changed, doesn't it make more sense to actually turn
the logic around? Text files that are downloaded over the internet are
almost always .txt, .csv, or a few other extensions used for text data.
Those are actually the only files where some people with very old Windows
programs for text processing can get into trouble. So instead of adding
every possible binary extension, one can make "wb" the default and switch
to "w" if it is a text file, instead of the other way around. That would
not change the concept of the behaviour, but ensures that the function
cannot fail to detect a binary file. Not detecting a text file is far less
of a problem, as not converting the line endings doesn't corrupt the file.
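
A minimal sketch of that inverted logic, assuming a hypothetical helper guess_mode() (not part of R) and a deliberately short list of text extensions that would need to be extended in practice:

```r
# Hypothetical helper: default to binary, fall back to text mode only
# for extensions known to be text. The extension list is an assumption.
guess_mode <- function(url, text_ext = c("txt", "csv")) {
  pattern <- paste0("\\.(", paste(text_ext, collapse = "|"), ")$")
  ifelse(grepl(pattern, url, ignore.case = TRUE), "w", "wb")
}

guess_mode(c("https://example.org/table.CSV",
             "https://example.org/data.zip",
             "https://example.org/file.unknown"))
```

Anything unrecognised, including URLs that end in a query string, falls through to "wb", so a missed match costs at most an unconverted line ending rather than a corrupted file.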
Cheers
Joris
--
Joris Meys
Statistical consultant
Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)