read.csv fails in R console in Ubuntu terminal but works in RStudio after R 3.6.3 upgrade to R 4.0.2?

10 messages · Sam H, Bert Gunter, Rolf Turner +5 more

#
Hi,

I am trying to download some data using read.csv. It works perfectly in
RStudio but fails in the R console in the terminal on Ubuntu 18.04 after
upgrading from R 3.6.3 to 4.0.2. Before the upgrade, this also worked in the
R console in the terminal without any issues.

Why would that be? How can I fix this?

Below please find R code output and sessionInfo().

*Works in RStudio*
  Symbol                                   Name LastSale MarketCap IPOyear
1    TXG                     10x Genomics, Inc.  87.4400     $8.6B    2019
2     YI                              111, Inc.   6.4800  $533.69M    2018
3    PIH 1347 Property Insurance Holdings, Inc.   4.5350   $27.52M    2014
 sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2

*Fails in R console in terminal*

> read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
       header=TRUE, as.is=TRUE, na="n/a")
Error in file(file, "rt") :
  cannot open the connection to 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
In addition: Warning message:
In file(file, "rt") :
  URL 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download': status was 'Failure when receiving data from the peer'
> traceback()
3: file(file, "rt")
2: read.table(file = file, header = header, sep = sep, quote = quote,
       dec = dec, fill = fill, comment.char = comment.char, ...)
1: read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
       header = TRUE, as.is = TRUE, na = "n/a")
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.2

I also asked this question on Stack Overflow:
https://stackoverflow.com/questions/62898008/why-read-csv-fails-in-r-console-in-ubuntu-terminal-but-works-in-rstudio-after-r
Since there was no answer on Stack Overflow, I have also sent the question
to this list.

Best regards,
Sam
#
I may be wrong, but probably better posted on r-sig-debian rather than here.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Wed, Jul 15, 2020 at 8:44 AM Sam H <sam.hhh1 at gmail.com> wrote:

            

  
  
#
On 15/07/20 7:45 pm, Sam H wrote:

            
<SNIP>
<SNIP>

I'm running Ubuntu 18.04.4 and R 4.0.2.

When I tried the foregoing read.csv() command, it just hung; nothing 
happened until I hit <ctrl>-c; no output was returned (and no error 
message appeared).  However I was able to download the file 
"companylist.csv" from the site, and then read.csv() happily read that file.

I'm afraid that I have no insight into what's going on here, and 
certainly no insight into why RStudio can read directly from the URL but 
raw R apparently can't.  Sorry.
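
The manual workaround mentioned above (fetch companylist.csv by other means, then read the local copy) might look like the minimal sketch below. The rows are made up for illustration and the temporary file is only a stand-in for the downloaded file; neither comes from the NASDAQ site. Note that the `na="n/a"` argument used in the original call partially matches `na.strings`, which is spelled out here.

```r
# Stand-in for the manually downloaded "companylist.csv".
# The two data rows are hypothetical, not real NASDAQ data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  'Symbol,Name,LastSale,IPOyear',
  'AAA,"Example One, Inc.",12.5,2019',
  'BBB,"Example Two, Inc.",n/a,n/a'
), csv_path)

# as.is = TRUE keeps text columns as character instead of factor;
# na.strings = "n/a" turns the literal text n/a into NA.
companies <- read.csv(csv_path, header = TRUE, as.is = TRUE,
                      na.strings = "n/a")
str(companies)
```

Reading a local file involves no HTTP connection at all, which is presumably why this path works even when the URL cannot be opened.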

Perhaps enquire of RStudio and see if *they* have any insight.

cheers,

Rolf Turner
#
Hello,

R 4.0.2 on Ubuntu 20.04 LTS, sessionInfo below.

I'm also unable to read the file with Rscript from the Ubuntu terminal 
but the error is not the same as the OP's.


The first try was a file test1.R with the following commands.

x<-"https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
read.csv(x, as.is=TRUE, na="n/a")


And run with Rscript

rui at rui:~$ Rscript --vanilla test1.R
Error in file(file, "rt") :
   cannot open the connection to 
'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
Calls: read.csv -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
   cannot open URL 
'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download': 
HTTP status was '403 Forbidden'
Execution halted



The second try was to use download.file() and then read the downloaded file.
File test2.R is:

x<-"https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
download.file(x, "companylist.csv")
read.csv("companylist.csv", as.is=TRUE, na="n/a")


But this too failed with error 403 Forbidden.

rui at rui:~$ Rscript --vanilla test2.R
trying URL 
'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
Error in download.file(x, "companylist.csv") :
   cannot open URL 
'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
In addition: Warning message:
In download.file(x, "companylist.csv") :
   cannot open URL 
'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download': 
HTTP status was '403 Forbidden'
Execution halted


This is my session info.

rui at rui:~$ Rscript --vanilla -e 'sessionInfo()'
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.2



At 08:45 on 15/07/20, Sam H wrote:
#
Perhaps read FAQ 7.43? [1]

[1] https://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-enable-secure-https-downloads-in-R_003f
On July 15, 2020 4:02:27 PM PDT, Rui Barradas <ruipbarradas at sapo.pt> wrote:

  
    
#
Hello,

Thanks, but no, download.file still gives 403 Forbidden with both method 
= "libcurl" and method = "wget".
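
For anyone who wants to repeat this comparison, here is a rough sketch of trying download.file() with each method and collecting the outcome instead of stopping at the first failure. The helper name `try_download` and its `downloader` argument are hypothetical (the latter exists only so the helper can be exercised without network access); `capabilities("libcurl")` and `download.file(method = ...)` are base R.

```r
# Hypothetical helper: try each download method and report what happened.
try_download <- function(url, methods = c("libcurl", "wget"),
                         downloader = utils::download.file) {
  vapply(methods, function(m) {
    dest <- tempfile(fileext = ".csv")
    on.exit(unlink(dest), add = TRUE)
    tryCatch({
      # download.file() returns 0 (invisibly) on success.
      status <- downloader(url, dest, method = m, quiet = TRUE)
      if (identical(as.integer(status), 0L)) "ok" else paste("exit status", status)
    },
    # The HTTP status (e.g. 403) is usually reported in a warning.
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e)))
  }, character(1))
}

# TRUE if this build of R has libcurl support.
capabilities("libcurl")

# Usage (needs network access, so it is left commented out):
# try_download("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download")
```

If every method reports the same 403, the refusal is coming from the server rather than from a particular download back end.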

Rui Barradas

At 05:31 on 16/07/20, Jeff Newmiller wrote:
#
On 7/15/2020 4:02 PM, Rui Barradas wrote:
Works fine in Windows 10 64-bit with R-4.0.2, so I would echo Bert 
Gunter's advice to try the r-sig-debian list.

Dan
#
On Thu, Jul 16, 2020 at 8:18 AM Rui Barradas <ruipbarradas at sapo.pt> wrote:
I think that makes it "not an R question". Ask on
https://unix.stackexchange.com/ maybe?

Best,
Ista
#
On Thu, Jul 16, 2020 at 5:15 PM Ista Zahn <istazahn at gmail.com> wrote:
Oh, sorry I misread your message. Nevertheless:

$ curl "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access
"http&#58;&#47;&#47;old&#46;nasdaq&#46;com&#47;screening&#47;companies&#45;by&#45;name&#46;aspx&#63;"
on this server.<P>
Reference&#32;&#35;18&#46;5506d217&#46;1594934303&#46;938edcb
</BODY>
</HTML>

$ wget "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
--2020-07-16 17:19:12--  https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving old.nasdaq.com (old.nasdaq.com)... 2600:1400:9000:28f::1b46, 2600:1400:9000:29b::1b46, 23.78.161.120
Connecting to old.nasdaq.com (old.nasdaq.com)|2600:1400:9000:28f::1b46|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2020-07-16 17:19:12 ERROR 403: Forbidden.

I don't think this is an R problem.

Best,
Ista
#
On my Ubuntu system the download with read.csv succeeds in an R
console if I set the HTTPUserAgent and download.file.method options to
match the ones used by RStudio.
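
A minimal sketch of what this could look like follows. The thread does not give the actual values RStudio uses, so both values below are placeholders or assumptions: run getOption("HTTPUserAgent") and getOption("download.file.method") in the RStudio console and copy what they print.

```r
# Settings copied from the RStudio session.
# Both values are assumptions for illustration, not RStudio's real ones.
rstudio_settings <- list(
  HTTPUserAgent = "PASTE-THE-STRING-PRINTED-BY-RSTUDIO",  # placeholder
  download.file.method = "libcurl"                         # assumption
)

# options() returns the previous values, so they can be restored later.
old <- options(rstudio_settings)
getOption("HTTPUserAgent")

# With the options in place, retry the original call (needs network access):
# read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
#          header = TRUE, as.is = TRUE, na.strings = "n/a")

# Restore the console's own settings.
options(old)
```

Saving and restoring the old values keeps the change local to this one download, which seems prudent given the concern about the site's terms of service.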

Given how picky the server is being I would worry about whether this
use is in line with the site's terms of service.

Best,

luke
On Thu, 16 Jul 2020, Ista Zahn wrote: