[Bioc-devel] read.table fails with https protocol

Hey Bioc-devel community,

My package OUTRIDER is failing again on the build system, but only
intermittently. At first I thought it was due to the ImageMagick problem
I posted about some days ago, but that only produces a warning.

I think I have found the problem, but I don't really understand it. Any
help is appreciated.

I assume from the docs that *read.table* works for both http and https.
But on the build system, and sometimes also locally, it fails with the
error:

Error in read.table(URL, sep = "\t") : no lines available in input

I dug into it a bit, and it looks like *readLines* has problems
reading from https connections. See my examples below:

library(data.table)
library(utils)
library(curl)

# Link to a count table in TSV format at nature.com
URL <-
"media.nature.com/original/nature-assets/ncomms/2017/170612/ncomms15824/extref/ncomms15824-s1.txt"

# Fails with https
read.table(paste0("https://", URL), sep="\t", nrows=10)[,1:10]

# Works with plain http
read.table(paste0("http://", URL), sep="\t", nrows=10)[,1:10]

# Works if using curl to read lines first
read.table(text=readLines(curl(paste0("https://", URL))), sep="\t",
nrows=10)[,1:10]

# Fails if using only readLines
read.table(text=readLines(paste0("https://", URL)), sep="\t",
nrows=10)[,1:10]

# Works with fread from the data.table package (it uses curl to download
# the file first)
data.frame(fread(paste0("https://", URL), sep="\t", nrows=10)[,1:10],
row.names=1)
data.frame(fread(paste0("http://", URL), sep="\t", nrows=10)[,1:10],
row.names=1)

I guess my options are to use http or to switch to fread or curl. But I
think the clean way would be to use read.table, wouldn't it?
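
The curl-based workaround mentioned above could be wrapped up roughly as
follows. This is only a minimal sketch, assuming the curl package is
installed; read_table_via_download is a hypothetical helper name, not
something provided by OUTRIDER, utils or curl. It downloads the file to a
temporary location first and then lets read.table parse the local copy,
so read.table never has to open the https connection itself.

```r
# Hypothetical helper (not part of any package): fetch the file with curl,
# then parse the local copy with read.table.
read_table_via_download <- function(url, ...) {
    tmp <- tempfile(fileext = ".txt")
    # Remove the temporary file once the function returns
    on.exit(unlink(tmp), add = TRUE)
    # curl_download() writes the response body to 'destfile' and signals
    # an R error if the transfer fails
    curl::curl_download(url, destfile = tmp, quiet = TRUE)
    # Extra arguments (sep, nrows, header, ...) are passed to read.table
    utils::read.table(tmp, ...)
}
```

With the URL defined above, it would be called as
read_table_via_download(paste0("https://", URL), sep="\t", nrows=10).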

Best,

Christian