Skip to content
Prev 56028 / 63424 Next

PATCH: Asserting that 'connection' used has not changed + R_GetConnection2()

SUMMARY:

I'm proposing that R assert that 'connection' options have not changed
since first created such that R will produce the following error:
Error in cat("hello again\n", file = fh) :
  invalid connection (non-existing 'conn_id')

Note that, currently in R, the latter silently writes to 'b.txt' - not
'a.txt' (for more details, see
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81).


BACKGROUND:

In R, connections are indexed by their (zero-based) row indices in the
table of available connections.  For example,
description class      mode text   isopen   can read can write
0 "stdin"     "terminal" "r"  "text" "opened" "yes"    "no"
1 "stdout"    "terminal" "w"  "text" "opened" "no"     "yes"
2 "stderr"    "terminal" "w"  "text" "opened" "no"     "yes"
3 "a.txt"     "file"     "w"  "text" "opened" "no"     "yes"
[1] TRUE


ISSUE:

The problem with the current design/implementation where connections
are referred to by their index (only), is that

(i) the table of connections changes over time and
(ii) connection indices are recycled.

Because a `connection` object holds the connection row index, it means
that *the actual underlying connection that a `connection` object
refers to may change over its lifetime*.


SUGGESTION:

Make use of the 'Rconn' struct field 'id', which is unique, to assert
that the 'connection' object used is referring to the
original/expected connection.  The 'id' field is available via
attribute 'conn_id' part of a 'connection' object.


PATCH:

See attached 'connection.patch' file (or
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81#issuecomment-434210222).
The patch introduces a new SEXP R_GetConnection2(SEXP sConn) function,
which looks up a connection by its index *and* the 'id' field. This
function is backward compatible with R_GetConnection(), which looks up
a connection by its index (only). In addition, R_GetConnection2() also
accepts 'sConn' of type integer, which the looks up the connection
similar to how the internal getConnection() function does it.

Comment: The patch is just one of many alternatives.  Hopefully, it
helps clarify what I'm suggesting.  It passes 'make check' and I've
tested it on a few packages of mine that make heavy use of different
types of connections.

In addition to "overridden" connections, the patch protects against
invalid 'connection':s that have been serialized, e.g.
<pointer: 0x78>
<pointer: (nil)>  #<== NIL because external pointer was lost when serialized
[1] TRUE
Error in isOpen(fh3) : invalid connection ('conn_id' is NULL)

This is useful, when for instance 'connection':s are (incorrectly)
passed to background R sessions (e.g. PSOCK cluster nodes).


SEE ALSO:

* More details of the above are scribbled down on
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81
* R-devel post 'closeAllConnections() can really mess things up',
2016-10-30, https://stat.ethz.ch/pipermail/r-devel/2016-October/073331.html

All the best,

Henrik

-------------- next part --------------
A non-text attachment was scrubbed...
Name: connections.patch
Type: text/x-patch
Size: 15539 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20181030/d2a946b2/attachment.bin>