[External] Mitigating Stalls Caused by Call Deparse on Error

Tue, Jul 16, 2019 1:53 AM

We also have a few other suggestions and wishes about backtrace
storage and display on the one hand, and display of constructed calls
on the other hand. Perhaps it would be better to open a different
wishlist item for traceback() to keep the discussions focused?

FWIW I think deparsing backtraces lazily is a great idea. Displaying 1
line per call by default in interactive sessions, while being able to
get a more exhaustive output by calling `traceback(max.lines = Inf)`,
offers the best of both worlds. Understanding the structure of the
backtrace quickly is often more helpful than having immediate access
to all the information.

Best,
Lionel

On 15 Jul 2019, at 16:04, Tierney, Luke <luke-tierney at uiowa.edu> wrote:

Better to add this to the wishlist item. This all needs to be looked
at together, and nothing is likely to happen until after
vacation/conference season.  It will disappear from everyone's radar
if it is just in R_devel.

Best,

luke

On Sun, 14 Jul 2019, brodie gaslam wrote:

Luke, thanks for considering the issue. I would like to
try to separate the problem into two parts, as I _think_
your comments address primarily part 2 below:

1. How can we avoid significant and possibly crippling
stalls on error with these non-standard calls.
2. What is the best way to view these non-standard calls.

I agree that issue 2. requires further thought and
discussion under a wishlist issue ([on bugzilla now][1]).
While I did raise issue 2., the patch itself makes no
attempt to resolve it.

The proposed patch resolves issue 1., which is a big
usability problem. Right now if you have the misfortune of
using `do.call` with a big object and trigger an error, you
have the choice of waiting a possibly long time for
the deparse to complete, or killing your entire R session
externally.

It seems a shame to allow a big usability issue for `do.call`
to remain when there is a simple solution at hand, especially
since the complete deparse of large objects likely serves no
purpose in this case. Obviously, if storing the actual calls
instead of their deparsed equivalents in .Traceback causes
problems I'm not anticipating, then that's different.
Is that the case?

Best,

Brodie.

[1]: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17580

On Sunday, July 14, 2019, 8:52:45 AM EDT, Tierney, Luke <luke-tierney at uiowa.edu> wrote:

This is probably best viewed in the context of other issue with
displaying calls, such as issues arising from calls constructed in
non-standard evaluation contexts. Might be good to move to a wishlist
item in bugzilla.

Best,

luke

On Sat, 13 Jul 2019, brodie gaslam via R-devel wrote:

When large calls cause errors R may stall for extended periods.  This
is particularly likely to happen with `do.call`, as in this example
with a 24 second stall:

    x <- runif(1e7)
    system.time(do.call(paste0, list(abs, x)))  # intentional error
    ## Error in (function (..., collapse = NULL)  :
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Calls: system.time -> do.call -> <Anonymous>
    ## Timing stopped at: 23.81 0.149 24.04

    str(.Traceback)
    ## Dotted pair list of 3
    ##  $ : chr [1:2500488] "(function (..., collapse = NULL) " ".Internal(paste0(list(...), collapse)))(.Primitive(\"abs\"), c(0.718117154669017, " "0.494785501621664, 0.1453434410505, 0.635028422810137, 0.0353180423844606, " "0.688418723642826, 0.889682895969599, 0.728154224809259, 0.292572240810841, " ...
    ##  $ : chr "do.call(paste0, list(abs, x))"
    ##  $ : chr "system.time(do.call(paste0, list(abs, x)))"

The first time I noticed this I thought my session had frozen/crashed
as the standard interrupt ^C does not work during the deparse.  The
stall happens when on error the call stack is deparsed prior to being
saved to `.Traceback`.  The deparsing is done by `deparse1m` in native
code, with the value of `getOption('deparse.max.lines')` which
defaults to all lines.

Since there is little value to seeing millions of lines of deparsed
objects in `traceback()`, a simple work-around is to change the
`deparse.max.lines` value:

    options(deparse.max.lines=1)
    system.time(do.call(paste0, list(abs, x)))
    ## Error in (function (..., collapse = NULL)  :
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Calls: system.time -> do.call -> <Anonymous>
    ## Timing stopped at: 0 0 0

Unfortunately this will affect all `deparse` calls, and it seems
undesirable to pre-emptively enable it just for calls that might cause
large deparses on error.

An alternative is to store the actual calls instead of their deparsed
character equivalents in `.Traceback`.  This defers the deparsing to
when `traceback()` is used.  As per `?traceback`, it should be
relatively safe to modify `.Traceback` in this way:

It is undocumented where .Traceback is stored nor that it is
visible, and this is subject to change.

Deferring the deparsing to `traceback()` will give us the 
opportunity to use a different `max.lines` setting as we do here
with the patch applied:

    system.time(do.call(paste0, list(abs, x)))
    ## Error in (function (..., collapse = NULL)  :
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Timing stopped at: 0.028 0 0.029

    system.time(traceback(max.lines=3))
    ## 3: (function (..., collapse = NULL)
    ##    .Internal(paste0(list(...), collapse)))(.Primitive("abs"), c(0.535468587651849,
    ##    0.0540027911774814, 0.732930393889546, 0.565360915614292, 0.544816034380347,
    ##     ...
    ## 2: do.call(paste0, list(abs, x))
    ## 1: system.time(do.call(paste0, list(abs, x)))
    ##    user  system elapsed
    ##   0.000   0.000   0.003


More generally, it might be better to have a different smaller default
value for the lines to deparse when calls  are _displayed_ as parts of
lists, as is the case with `traceback()`, or in `print(sys.calls())` and
similar.

I attach a patch that does this.  I have run some basic tests
and `make check-devel` passes. I can file an issue on bugzilla
if that is a better place to have this conversation (assuming there
is interest in it).

Best,

Brodie

PS: for some reason my mail client is refusing to attach the patch so I paste it
starting on the next line.
Index: src/gnuwin32/Rdll.hide
===================================================================
--- src/gnuwin32/Rdll.hide    (revision 76827)
+++ src/gnuwin32/Rdll.hide    (working copy)
@@ -94,6 +94,7 @@
  R_GetMaxNSize
  R_GetMaxVSize
  R_GetTraceback
+ R_GetTracebackParsed
  R_GetVarLocSymbol
  R_GetVarLocValue
  R_HandlerStack
Index: src/include/Defn.h
===================================================================
--- src/include/Defn.h    (revision 76827)
+++ src/include/Defn.h    (working copy)
@@ -1296,6 +1296,7 @@
 void NORET ErrorMessage(SEXP, int, ...);
 void WarningMessage(SEXP, R_WARNING, ...);
 SEXP R_GetTraceback(int);
+SEXP R_GetTracebackParsed(int);
 
 R_size_t R_GetMaxVSize(void);
 void R_SetMaxVSize(R_size_t);
Index: src/library/base/R/traceback.R
===================================================================
--- src/library/base/R/traceback.R    (revision 76827)
+++ src/library/base/R/traceback.R    (working copy)
@@ -16,9 +16,19 @@
 #  A copy of the GNU General Public License is available at
 #  https://www.R-project.org/Licenses/
 
-.traceback <- function(x = NULL) {
-    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv())))
-    {}
+.traceback <- function(x = NULL, max.lines=getOption("deparse.max.lines")) {
+    if(!(is.numeric(max.lines) && !is.na(max.lines) &&
+         as.integer(max.lines) > 0L)
+    ) {
+        max.lines <- -1L
+    }
+    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv()))) {
+        for(i in seq_along(x)) {
+            srcref <- attr(x[[i]], 'srcref')
+            x[[i]] <- deparse(x[[i]], nlines=max.lines)
+            attr(x[[i]], 'srcref') <- srcref
+        }
+    }
     else if (is.numeric(x))
         x <- .Internal(traceback(x))
     x
@@ -26,7 +36,16 @@
 
 traceback <- function(x = NULL, max.lines = getOption("deparse.max.lines"))
 {
-    n <- length(x <- .traceback(x))
+    valid.max.lines <- is.numeric(max.lines) && !is.na(max.lines) &&
+         as.integer(max.lines) > 0L
+
+    if(valid.max.lines) {
+        # max.lines + 1L so we can know that output was truncated by .traceback
+        max.lines.2 <- as.integer(max.lines) + 1L
+    } else {
+        max.lines.2 <- -1L
+    }
+    n <- length(x <- .traceback(x, max.lines=max.lines.2))
     if(n == 0L)
         cat(gettext("No traceback available"), "\n")
     else {
@@ -40,7 +59,7 @@
                 paste0(" at ", basename(srcfile$filename), "#", srcref[1L])
             }
             ## Truncate deparsed code (destroys attributes of xi)
-            if(is.numeric(max.lines) && max.lines > 0L && max.lines < m) {
+            if(valid.max.lines &&  max.lines < m) {
                 xi <- c(xi[seq_len(max.lines)], " ...")
                 m <- length(xi)
             }
Index: src/library/base/man/traceback.Rd
===================================================================
--- src/library/base/man/traceback.Rd    (revision 76827)
+++ src/library/base/man/traceback.Rd    (working copy)
@@ -21,7 +21,7 @@
 }
 \usage{
 traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
-.traceback(x = NULL)
+.traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
 }
 \arguments{
   \item{x}{\code{NULL} (default, meaning \code{.Traceback}), or an
Index: src/main/errors.c
===================================================================
--- src/main/errors.c    (revision 76827)
+++ src/main/errors.c    (working copy)
@@ -1008,7 +1008,7 @@
        (which should not happen) */
     if (traceback && inError < 2 && inError == oldInError) {
         inError = 2;
-        PROTECT(s = R_GetTraceback(0));
+        PROTECT(s = R_GetTracebackParsed(0));
         SET_SYMVALUE(install(".Traceback"), s);
         /* should have been defineVar
            setVar(install(".Traceback"), s, R_GlobalEnv); */
@@ -1440,9 +1440,11 @@
     PrintWarnings();
     }
 }
-
+/*
+ * Return the traceback without deparsing the calls
+ */
 attribute_hidden
-SEXP R_GetTraceback(int skip)
+SEXP R_GetTracebackParsed(int skip)
 {
     int nback = 0, ns;
     RCNTXT *c;
@@ -1467,7 +1469,9 @@
         if (skip > 0)
         skip--;
         else {
-        SETCAR(t, deparse1m(c->call, 0, DEFAULTDEPARSE));
+                // Extra paranoid PROTECTS
+        SETCAR(t, PROTECT(duplicate(c->call)));
+                UNPROTECT(1);
         if (c->srcref && !isNull(c->srcref)) {
             SEXP sref;
             if (c->srcref == R_InBCInterpreter)
@@ -1482,7 +1486,26 @@
     UNPROTECT(1);
     return s;
 }
+/*
+ * Return the traceback with calls deparsed
+ */
+attribute_hidden
+SEXP R_GetTraceback(int skip)
+{
+    int nback = 0;
+    SEXP s, t, u, v;
+    s = PROTECT(R_GetTracebackParsed(skip));
+    for(t = s; t != R_NilValue; t = CDR(t)) nback++;
+    u = v = PROTECT(allocList(nback));
 
+    for(t = s; t != R_NilValue; t = CDR(t), v=CDR(v)) {
+        SETCAR(v, PROTECT(deparse1m(CAR(t), 0, DEFAULTDEPARSE)));
+        UNPROTECT(1);
+    }
+    UNPROTECT(1);
+    return u;
+}
+
 SEXP attribute_hidden do_traceback(SEXP call, SEXP op, SEXP args, SEXP rho)
 {
     int skip;

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[External] Mitigating Stalls Caused by Call Deparse on Error

Thread (6 messages)