problem with pipes, textConnection and read.dcf

4 messages · Gabor Grothendieck, Peter Dalgaard, luke-tierney at uiowa.edu +1 more

#
This gives an error, but if the first gsub line is commented out then there is
no error, even though the code is equivalent.

  L <- c("Variable:id", "Length:112630     ")

  L |>
    gsub(pattern = " ", replacement = "") |>
    gsub(pattern = " ", replacement = "") |>
    textConnection() |>
    read.dcf()
  ## Error in textConnection(gsub(gsub(L, pattern = " ", replacement = ""),  :
  ##  argument 'object' must deparse to a single character string

That is, this works:

  L |>
    # gsub(pattern = " ", replacement = "") |>
    gsub(pattern = " ", replacement = "") |>
    textConnection() |>
    read.dcf()
  ##      Variable Length
  ## [1,] "id"     "112630"

  R.version.string
  ## [1] "R version 4.1.0 RC (2021-05-16 r80303)"
  win.version()
  ## [1] "Windows 10 x64 (build 19042)"
#
It's not a pipe issue:

> textConnection(gsub(gsub(L, pattern = " ", replacement = ""), pattern = " ", replacement = ""))
Error in textConnection(gsub(gsub(L, pattern = " ", replacement = ""),  :
  argument 'object' must deparse to a single character string
> textConnection(gsub(L, pattern = " ", replacement = ""))
A connection with
description "gsub(L, pattern = \" \", replacement = \"\")"
class       "textConnection"
mode        "r"
text        "text"
opened      "opened"
can read    "yes"
can write   "no"

I suppose the culprit is that the deparse(substitute(...)) construct in the definition of textConnection() can generate multiple lines if the object expression gets complicated.

> textConnection
function (object, open = "r", local = FALSE, name = deparse(substitute(object)),
    encoding = c("", "bytes", "UTF-8"))
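The multi-line deparse can be observed without opening a connection. The following is a minimal sketch, assuming the default width.cutoff of deparse(); the variable names `short` and `long` are only illustrative.

```r
# Quote the two argument expressions so they are deparsed, not evaluated.
short <- quote(gsub(L, pattern = " ", replacement = ""))
long  <- quote(gsub(gsub(L, pattern = " ", replacement = ""),
                    pattern = " ", replacement = ""))

# deparse() returns a character vector with one element per output line.
n_short <- length(deparse(short))
n_long  <- length(deparse(long))

n_short  # a single string: fine as a default for name=
n_long   # more than one string: triggers the error in textConnection()
```

The nested expression is long enough that deparse() breaks it across lines, so the default value of name is no longer a single character string.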

This also suggests that setting name=something might be a cure.
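A minimal sketch of that workaround, using the data from the original report. The label "L_clean" is a hypothetical name chosen for illustration; any single character string should do.

```r
L <- c("Variable:id", "Length:112630     ")

# Supplying name= explicitly means the default
# deparse(substitute(object)) is never evaluated.
con <- textConnection(
  gsub(gsub(L, pattern = " ", replacement = ""),
       pattern = " ", replacement = ""),
  name = "L_clean"  # hypothetical label, not from the thread
)
res <- read.dcf(con)
close(con)
res
```

The same argument can be passed in the piped form, e.g. `textConnection(name = "L_clean")` as the pipe stage.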

-pd

  
    
#
Not an issue with pipes. The pipe just rewrites the expression to a
nested call and that is then evaluated. The call this produces is

> quote(L |>
+    gsub(pattern = " ", replacement = "") |>
+    gsub(pattern = " ", replacement = "") |>
+    textConnection() |>
+    read.dcf())
read.dcf(textConnection(gsub(gsub(L, pattern = " ", replacement = ""),
     pattern = " ", replacement = "")))

If you run that expression, or just the argument to read.dcf, then you
get the error you report. So the issue is somewhere in textConnection().
This produces a similar message:

read.dcf(textConnection(c(L, "aaaaaaaaaaaaaaaaaa", "bbbbbbbbbbbbbbbb", "cccccccccccccccc", "ddddddddddddddddddd")))
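The claim that the pipe only rewrites the expression can be checked by comparing quoted calls. This is a small sketch assuming R >= 4.1.0 (where the native pipe exists); `piped` and `nested` are illustrative names.

```r
# Neither expression is evaluated; quote() only captures the parsed call.
piped  <- quote(L |> gsub(pattern = " ", replacement = "") |> textConnection())
nested <- quote(textConnection(gsub(L, pattern = " ", replacement = "")))

# The parser turns the piped form into the nested call,
# so the two language objects are expected to be identical.
same <- identical(piped, nested)
same
```

Since the two forms are the same call, textConnection() sees the same argument expression either way, and substitute(object) yields the nested gsub() call in both cases.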

File a bug report and someone who understands the textConnection()
internals better than I do can take a look.

Best,

luke
On Tue, 10 Aug 2021, Gabor Grothendieck wrote:

            

  
    
#
> It's not a pipe issue:

    >> textConnection(gsub(gsub(L, pattern = " ", replacement = ""), pattern = " ", replacement = ""))
    > Error in textConnection(gsub(gsub(L, pattern = " ", replacement = ""),  : 
    > argument 'object' must deparse to a single character string
    >> textConnection(gsub(L, pattern = " ", replacement = ""))
    > A connection with                                                          
    > description "gsub(L, pattern = \" \", replacement = \"\")"
    > class       "textConnection"                              
    > mode        "r"                                           
    > text        "text"                                        
    > opened      "opened"                                      
    > can read    "yes"                                         
    > can write   "no"                                          

    > I suppose the culprit is that the deparse(substitute(...)) construct in the definition of textConnection() can generate multiple lines if the object expression gets complicated.

    >> textConnection
    > function (object, open = "r", local = FALSE, name = deparse(substitute(object)), 
    > encoding = c("", "bytes", "UTF-8")) 

    > This also suggests that setting name=something might be a cure.

    > -pd

Indeed.

In R 4.0.0, I introduced the deparse1() shortcut to be used
in place of deparse() in such cases.

NEWS said:

    * New function deparse1() produces one string, wrapping deparse(),
      to be used typically in deparse1(substitute(*)), e.g., to fix
      PR#17671.

and the definition is a simple but useful one-liner:

  deparse1 <- function (expr, collapse = " ", width.cutoff = 500L, ...) 
  paste(deparse(expr, width.cutoff, ...), collapse = collapse)


So I'm almost sure we should use deparse1() in textConnection()
(and will run 'make check' and potentially commit that, unless ...)
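As a rough illustration of the proposed change, the sketch below compares deparse() with deparse1() and wraps textConnection() so that the name comes from deparse1(). It assumes R >= 4.0.0; `text_connection1` is a hypothetical helper written for this example, not the change that was committed to R.

```r
L <- c("Variable:id", "Length:112630     ")

long <- quote(gsub(gsub(L, pattern = " ", replacement = ""),
                   pattern = " ", replacement = ""))
n_deparse  <- length(deparse(long))   # more than one string
n_deparse1 <- length(deparse1(long))  # always exactly one string

# Hypothetical wrapper: same idea as the fix under discussion,
# but done from user code by passing name= explicitly.
text_connection1 <- function(object, ...) {
  textConnection(object, name = deparse1(substitute(object)), ...)
}

con <- text_connection1(
  gsub(gsub(L, pattern = " ", replacement = ""),
       pattern = " ", replacement = "")
)
res <- read.dcf(con)
close(con)
res
```

Because deparse1() collapses the deparsed lines into one string, the name stays a single character string however long the argument expression is.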

Martin
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel

    > -- 
    > Peter Dalgaard, Professor,
    > Center for Statistics, Copenhagen Business School
    > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    > Phone: (+45)38153501
    > Office: A 4.23
    > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
