[R-pkg-devel] New test in R-devel causes existing packages to fail: "Error: connections left open"
Henrik,

Thanks for the suggestion. Yes, definitely, I think your more nuanced test would be a big improvement. The only wrinkle is that the connection is established *not* when the package is *loaded* but when the connection is *first needed* (using delayedAssign when the package is loaded). That way, loading the package doesn't block the REPL for ~5 seconds while Scala and the JVM first start.

-- David

On Thu, Aug 23, 2018 at 11:19 PM Henrik Bengtsson
<henrik.bengtsson at gmail.com> wrote:
Does R CMD check --as-cran test for newly opened connections or for any open connections? Could the check for stray connections in examples/vignettes be:

1. Record what connections are open
2. Attach the package
3. Record what connections are open
4. Run the example
5. Assert that no *new* connections in addition to what's recorded in Step 3 are open
6. Unload the package
7. Assert that no *new* connections in addition to what's recorded in Step 1 are open

Step 5 asserts that the code in the example does not leave stray connections behind, and Step 7 that the package itself does not leave stray connections behind.

/Henrik

On Thu, Aug 23, 2018 at 1:25 PM David B. Dahl <dahl at stat.byu.edu> wrote:
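The snapshot-and-compare idea behind Henrik's Steps 3 to 5 can be sketched in a few lines of R. This is only an illustration, not what R CMD check actually does: the helper name `stray_connections` is hypothetical, and a temporary file connection stands in for whatever an example might open. Steps 1, 2, 6 and 7 would wrap the attach and unload of the package in the same kind of comparison.

```r
# Hypothetical helper: evaluate `expr` and return the connection numbers
# that are open afterwards but were not open before.
stray_connections <- function(expr) {
  before <- getAllConnections()          # snapshot (Step 3)
  force(expr)                            # run the example (Step 4)
  setdiff(getAllConnections(), before)   # new connections still open (Step 5)
}

# An "example" that forgets to close its connection.
leaked <- stray_connections({
  con <- file(tempfile(), open = "w")
})
n_leaked <- length(leaked)

# An "example" that cleans up after itself.
clean <- stray_connections({
  con2 <- file(tempfile(), open = "w")
  close(con2)
})
n_clean <- length(clean)

# Tidy up the connection leaked by the first example.
for (i in leaked) close(getConnection(i))
```

Because only connections that are *new* relative to the snapshot are reported, a connection the package opened before the example started would not be counted against the example.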
Oops, I accidentally did not "reply-all".... Here is my message:

Thanks Uwe, Duncan, and Gabor for the response, advice, and flexibility.

Regarding Uwe's suggestion: "... there should be a function that creates the connection and one that closes the connection," I should clarify. The rscala package does just that. There is a function (named "scala") that creates the connection (using delayedAssign) and another that closes the connection (namely, an S3 close method). The examples for the rscala package follow these full open/close semantics, but...

The problem comes when authors of another package, let's call it the "FooBar" package, want to implement an algorithm in Scala based on functionality provided by the rscala package. Let's say they write a function called "neatAlgorithm" based on Scala. Yes, the FooBar package author could require that, before the user calls the "neatAlgorithm" function, they first call a function to set up the connection (which itself would call the "rscala::scala" function) and then, after calling the "neatAlgorithm" function, they call a function to close the connection. But that is not very user friendly and exposes the user to implementation details of the algorithm. Users of the FooBar package don't really care whether "neatAlgorithm" is implemented in pure R, C++, Scala, or whatever, much like the users of the 'lm' function don't need to know the implementation details or do any setup before and after calling the function.

The current approach is that the connection to Scala is transparent to the end user of a package. Behind the scenes, the package author establishes the connection once it is needed, and the rscala package manages the connection and explicitly closes it when 1. the package is unloaded or 2. the R session ends. This approach does not leave dangling connections --- which I believe is the point of the new test --- yet my package is caught up in the test. I hope that this approach is still valid.
Perhaps the test could result in a warning (instead of an error) and CRAN could accept packages with such a warning. If not, a work-around is to have a \dontshow section in the examples that will close the connection (but leave the Scala process running) and then automatically reestablish the connection as needed. This would not be very efficient but, as Duncan mentioned, it mostly only affects the package examples themselves. Plus, it would not be too burdensome for package developers.

Again, thanks for considering my situation.

Best regards,
-- David

On Mon, Aug 20, 2018 at 11:11 PM Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
My advice: Apparently you want to have communication via sockets to Scala. So there should be a function that creates the connection and one that closes the connection, comparable to starting some parallel cluster and stopping it again. In the meantime, you can allow for all sorts of communication. So that's fine.

Then in your examples, simply design them to be standalone, i.e. in *your* examples always start the connection and stop it again at the end of one examples block, i.e. the examples defined in one Rd file.

Best,
Uwe Ligges

On 20.08.2018 02:11, Duncan Murdoch wrote:
On 19/08/2018 12:34 PM, Gábor Csárdi wrote:
Sorry, missed that these were examples, so, yeah, that's harder. G.
How about a function that checks if the connection is open before doing anything, and then at the end you close it if it wasn't already open? This will make all examples run slower on CRAN, but won't affect most users who are doing their own stuff as well as running examples. Or, how about the startup code for the package opens the connection? Or perhaps CRAN will respond to this thread with another suggestion. Duncan Murdoch
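Duncan's first suggestion (open the connection only if it is not already open, and close it afterwards only in that case) could look roughly like the sketch below. All names here (`.state`, `open_connection`, `close_connection`, `with_connection`) are hypothetical and not part of rscala, and a temporary file connection stands in for the socket to Scala so that the code runs without a server.

```r
# Hypothetical package-private state holding the shared connection.
.state <- new.env()
.state$con <- NULL

open_connection <- function() {
  # Stand-in for the real socket to Scala.
  .state$con <- file(tempfile(), open = "w+")
}

close_connection <- function() {
  close(.state$con)
  .state$con <- NULL   # a closed connection object must not be reused
}

# Run f(con); close the connection afterwards only if this call opened it.
with_connection <- function(f) {
  was_open <- !is.null(.state$con)
  if (!was_open) {
    open_connection()
    on.exit(close_connection(), add = TRUE)
  }
  f(.state$con)
}

# Case 1: nothing open beforehand, so nothing is left open afterwards.
n_before <- length(getAllConnections())
with_connection(function(con) writeLines("hello", con))
n_after <- length(getAllConnections())

# Case 2: the user already holds a connection, so it is left alone.
open_connection()
with_connection(function(con) writeLines("hello again", con))
still_open <- !is.null(.state$con)
close_connection()
```

In an example run on CRAN the first case applies, so each example pays the cost of opening and closing; a user who has already opened the connection falls into the second case and keeps the low-latency behaviour.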
On Sun, Aug 19, 2018 at 6:32 PM Gábor Csárdi <csardi.gabor at gmail.com> wrote:
You could just create a function to close the connection and then people could call it at the end of their test suites. Gabor On Sun, Aug 19, 2018 at 6:22 PM David B. Dahl <dahl at stat.byu.edu> wrote:
In preparing to submit an update of my package to CRAN, I found that R-devel has a new test regarding "connections left open" that my packages fail. The new test appears to have been committed by Uwe Ligges in revisions 74959 and 74964 on 2018-07-14 and 2018-07-15, respectively. The commit message says, "check after each example whether open connections exist, indicating e.g. file connections were left open or parallel clusters still running."

I am hoping for advice on how to pass "R CMD check --as-cran". Or, perhaps my situation will prompt a change to the test or, at least, having it result in a warning instead of an error. Below I describe the situation.

My rscala package allows developers to write R packages based on Scala (much like rJava and Rcpp for Java and C++, respectively). Scala runs as a separate process and interprocess communication is implemented using socket connections. Suppose a package using rscala has functions that call Scala code. (Such packages are 'bamboo', 'sdols', and 'shallot' on CRAN.) The first time a user executes an R function calling down into Scala, a socket connection between Scala and R is established. For the sake of low latency, after the call to the function ends, the connection stays open until the package is unloaded or the R session ends. But this approach runs afoul of the new test mentioned above, which appears to be designed to catch connections that are *accidentally* left open.

I definitely do not want users of my packages 'bamboo', 'sdols', and 'shallot' to have to think about managing the connection between Scala and R. That's an implementation detail, and using the package should be transparent for the user (who doesn't care about the implementation details).

On my end, I see two solutions:

1. I could try to reengineer my approach --- establishing a new connection for every single call into Scala --- although I am loath to do anything to increase the latency, or
2. I could wrap all the examples in \donttest so that CRAN checks are passed.

Or, again, perhaps my situation will prompt a reevaluation of the test. Perhaps it could result in a warning (instead of an error) and the CRAN maintainers would accept packages with such a warning.

Any advice? Thanks a lot!

-- David
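For readers unfamiliar with the pattern described in this message, here is a minimal sketch of a connection that is opened on first use and then kept open until an explicit shutdown. It is not rscala's actual code: the names `.pkg`, `call_backend` and `shutdown` are hypothetical, and a temporary file connection stands in for the socket to Scala. Only `delayedAssign` and `getAllConnections` are real base R functions.

```r
# Hypothetical package-private environment.
.pkg <- new.env()
.pkg$started <- FALSE

# The promise is created now but not evaluated until `.pkg$con` is first read,
# so "loading" costs nothing.
delayedAssign("con", {
  .pkg$started <- TRUE
  file(tempfile(), open = "w+")   # stand-in for the slow socket setup
}, assign.env = .pkg)

# A user-facing function; the first call forces the promise above.
call_backend <- function(x) {
  writeLines(as.character(x), .pkg$con)
  invisible(x)
}

# What an unload hook or session-end finalizer would call.
shutdown <- function() {
  if (.pkg$started) {
    close(.pkg$con)
    .pkg$started <- FALSE
  }
}

started_at_load <- .pkg$started        # FALSE: nothing opened yet
n0 <- length(getAllConnections())
call_backend(42)
n1 <- length(getAllConnections())      # one more connection than before
shutdown()
n2 <- length(getAllConnections())      # back to the starting count
```

Between `call_backend(42)` and `shutdown()` the connection is deliberately still open, which is exactly the state a check run after each example would observe. In a real package, `shutdown` would typically be wired to `.onUnload` and to a finalizer registered with `reg.finalizer(..., onexit = TRUE)`.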
______________________________________________ R-package-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel