Interrupting C++ code execution

15 messages · schattenpflanze at arcor.de, Simon Urbanek, Thomas Friedrichsmeier +3 more

#
Hello,

I am writing an R interface for one of my C++ programs. The computations 
in C++ are very time consuming (several hours), so the user needs to be 
able to interrupt them. Currently, the only way I found to do so is 
calling R_CheckUserInterrupt() frequently. Unfortunately, there are 
several problems with that:

1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no 
possibility to exit my code gracefully. In particular, I suppose that 
objects created on the heap (e.g., STL containers) are not destructed 
properly.

2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes 
memory corruptions. Even if I do so within a critical section, it 
usually results in segfaults, crashes, or invalid variable contents 
afterwards. I suppose this is due to the threads not being destroyed 
properly. Since most of the time critical computations are done in 
parallel, this means I can hardly interrupt anything.

Having a function similar to R_CheckUserInterrupt() but returning a 
boolean variable (has an interrupt occurred or not?) would solve these 
problems. Is there a way to find out about user interrupt requests (the 
user pressing ctrl+c or maybe a different set of keys) without 
interrupting immediately?

I would appreciate your advice on this topic.


Best regards,
Peter
#
On Apr 25, 2011, at 5:22 AM, schattenpflanze at arcor.de wrote:

            
In general, you're responsible for the cleanup. See R-devel archives for discussion on the interactions of C++ and R error handling. Generally, you should not use local objects and you should use on.exit to make sure you clean up.
As you know R is not thread-safe so you cannot call any R API from a thread - including OMP threads - so obviously you can't call R_CheckUserInterrupt(). Since you're using threads the safe way is to perform your computations on a separate thread and let R handle events so that you can abort your computation thread as part of on.exit.
Checking for interrupts may involve running the OS event loop (to allow the user to interact with R) and thus is not guaranteed to return. There is no general solution - if you're worried only about your own, local code, then on unix, for example, you could use custom signal handlers to set a flag and co-operatively interrupt your program. On Windows there is the UserBreak flag which can be set by a separate thread and thus you may check on it. That said, all this is very much platform-specific.

Cheers,
Simon
#
Thank you for your response, Simon.
I am using Rcpp (Rcpp-modules, to be precise). This means, I do actually 
not write any R code. Moreover, the C++ code does not use the R API. My 
C++ functions are 'exposed' to R via Rcpp, which creates suitable S4 
classes. Rcpp does the exception handling.
In particular, there is no obvious possibility for me to add an 
'on.exit' statement to a particular exposed C++ method.
We are talking about large amounts of code, dozens of nested function 
calls, and even external libraries. So "not using local objects" is 
definitely no option.
That is very interesting. Not being thread safe does not necessarily 
imply that a function cannot be called from within a thread (as long as 
it is not done concurrently from several threads). In particular, the 
main program itself is also a thread, isn't it?
Since no cleanup is done, however, it is now clear that calling 
R_CheckUserInterrupt() _anywhere_ in my program, parallel section or 
not, is a bad idea.
Starting the computations in a separate thread is a nice idea. I could 
then call R_CheckUserInterrupt() every x milliseconds in the function 
which dispatches the worker thread. Unfortunately, I see no obvious way 
of adding an "on.exit" statement to an Rcpp module method. So I would 
probably have to call an R function from C++ (e.g., using RInside) which 
contains the on.exit statement, which in turn calls again a C++ function 
setting a global 'abort' flag and waits for the threads to be 
terminated. Hmmm.

How does on.exit work? Could I mimic that behaviour directly in C++?
I see.
Being able to set a flag is all I need and would be the perfect solution 
imho. However, I do not yet see how I could achieve that.

How can I write a signal handler within C++ code which does not create a 
GUI and has no dedicated event dispatching thread?
Would it be possible to use, e.g., a Qt keyboard event handler within 
the C++ code? Would a keyboard event be visible to such an event 
handler? Is it not intercepted by R / the terminal window / the OS?

Does any existing R package contain signal handlers?


Best regards,
Peter
#
On Apr 25, 2011, at 11:09 AM, schattenpflanze at arcor.de wrote:

            
But that would imply that the library calls R! Note that we're talking about the stack at the point of the R API call, so you can do what you want until you call the R API. At the moment you touch the R API you should have no local C++ objects on the stack (all the way down) - that's what I meant.
Yes, but each thread has a separate stack, and you can only enter R with the same stack you left (because the stack will be restored to the state of the calling context).
It sets the conexit object of the current context structure to the closure to be evaluated when the context is left. endcontext() then simply evaluates that closure when the context is left.
Unfortunately there is no C-level onexit hook and the internal structure of RCNTXT is not revealed to packages. So AFAICS the closest you can get is to use eval to call on.exit().

However, I think it would be useful to have a provision for creating a context with a C-level hook - the question is whether the others have the feeling that it's going to a too low level ...
It is GUI-specific, unfortunately. AFAIR the Windows GUI does that because it's running on a separate thread. I think the X11-based GUIs use fds so they are synchronous, and the OS X GUI runs the OS loop inside the R event loop - so, again, synchronous.
That's simple: just use signal() to register your handler.
Meshing R's loop, GUI loop and your own code will be a nightmare. For example, one problem is that if you are running the GUI loop and it triggers an event that R would otherwise handle (e.g. resizing plot window) you're in trouble since you can't let R do anything...
I'm not sure - I would definitely not recommend that to be used in packages since it's platform-dependent and changes the semantics of signals defined by R. But you can play with it ;).
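
A minimal sketch of the signal() approach on unix, with the caveats above still applying (it is platform-dependent and temporarily replaces the SIGINT handler that R installed). The names ScopedInterruptFlag, note_interrupt and run_steps are hypothetical and not part of R or of any package. The handler only sets a flag, and the previous handler is put back when the guard object goes out of scope:

```cpp
#include <csignal>

// Flag written by the signal handler; sig_atomic_t is the only object type
// that is guaranteed safe to write from a handler.
static volatile std::sig_atomic_t g_interrupt_requested = 0;

// Hypothetical handler: it does nothing except record the request.
static void note_interrupt(int) { g_interrupt_requested = 1; }

// Hypothetical RAII guard: installs the handler for the lifetime of the
// object and restores whatever handler was installed before (e.g. R's).
class ScopedInterruptFlag {
public:
    ScopedInterruptFlag() {
        g_interrupt_requested = 0;
        previous_ = std::signal(SIGINT, note_interrupt);
    }
    ~ScopedInterruptFlag() {
        if (previous_ != SIG_ERR) std::signal(SIGINT, previous_);
    }
    ScopedInterruptFlag(const ScopedInterruptFlag&) = delete;
    ScopedInterruptFlag& operator=(const ScopedInterruptFlag&) = delete;

    // True once ctrl+c has been pressed while the guard is alive.
    bool requested() const { return g_interrupt_requested != 0; }

private:
    void (*previous_)(int);
};

// Hypothetical long-running loop that polls the flag co-operatively and
// returns the number of steps it completed before stopping.
long run_steps(long max_steps, const ScopedInterruptFlag& flag) {
    long done = 0;
    while (done < max_steps && !flag.requested()) ++done;
    return done;
}
```

Restoring with signal() does not preserve any sigaction() flags of the previous handler, so a more careful version would probably save and restore with sigaction() instead.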

Cheers,
Simon
#
Actually, it just came to me that there is a hack you could use. The problem with it is that it will eat all errors, even if they were not yours (e.g. those resulting from events triggered by the event loop), so I would not recommend it for general use. But here we go:

#include <Rinternals.h>   // R_ToplevelExec, FALSE
#include <R_ext/Utils.h>  // R_CheckUserInterrupt

static void chkIntFn(void *dummy) {
  R_CheckUserInterrupt();
}

// this will call the above in a top-level context so it won't longjmp-out of your context
bool checkInterrupt() {
  return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
}

// your code somewhere ...
if (checkInterrupt()) {
  // user interrupted ...
}

You must call it on the main thread and you should be prepared that it may take some time and may interact with the OS...

Cheers,
Simon
On Apr 25, 2011, at 12:23 PM, Simon Urbanek wrote:

            
#
Dear Simon,

thanks again for your explanations. Your previous e-mail clarified 
several points for me.
That actually looks quite nice. At least when compared to my currently 
only alternative of not interrupting at all. I will test it, in 
particular with respect to computational speed. Perhaps I can at least 
call it once per second.
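
One way to keep the cost down is to wrap the check so that the expensive call happens at most once per chosen interval, however often the inner loop asks. The following is only a sketch: ThrottledCheck is a hypothetical helper, and the poll callable stands in for checkInterrupt() from the previous message, so the class itself does not depend on R.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical wrapper that rate-limits an expensive interrupt check.
class ThrottledCheck {
public:
    using Clock = std::chrono::steady_clock;

    // poll: the expensive check (e.g. a call to checkInterrupt()).
    // min_interval: minimum time between two real calls to poll.
    ThrottledCheck(std::function<bool()> poll, Clock::duration min_interval)
        : poll_(std::move(poll)),
          min_interval_(min_interval),
          last_(Clock::now()) {}

    // Cheap to call in every loop cycle. Once an interrupt has been seen
    // the result stays true and poll is not called again.
    bool operator()() {
        if (interrupted_) return true;
        const Clock::time_point now = Clock::now();
        if (now - last_ >= min_interval_) {
            last_ = now;
            interrupted_ = poll_();
        }
        return interrupted_;
    }

private:
    std::function<bool()> poll_;
    Clock::duration min_interval_;
    Clock::time_point last_;
    bool interrupted_ = false;
};
```

With std::chrono::seconds(1) as the interval, the first real check happens about one second after construction. The object still has to be used on the main thread only, as Simon pointed out.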

Best regards,
Peter
#
On Monday 25 April 2011, Simon Urbanek wrote:
Here's another option which is probably not recommendable for general use, 
since it is not part of the documented API:

On Windows you can look at the variable "UserBreak", available from 
Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, 
available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has 
R_interrupts_suspended, which you may or may not want to take into account, 
depending on your use-case.

BTW, being able to check for a pending interrupt or to schedule an interrupt 
from a separate thread is something that can come in handy in GUI development 
as well, and personally, I would appreciate, if there was some slightly more 
official support for this.

Regards
Thomas
#
I have tested the solutions suggested by Simon and Thomas on a Linux 
machine. These are my findings:
I did not manage to get this to work. Neither R_interrupts_pending nor 
R_interrupts_suspended seem to change when I press ctrl+c. Perhaps this 
is due to the fact that I run R in a terminal without any graphical 
interface?
This solution works perfectly! It takes slightly longer to call this 
function than the plain R_CheckUserInterrupt() call, but in any 
reasonable scenario, the additional time is absolutely insignificant.

Inside OpenMP parallel for constructs, one has to make sure that only 
the thread satisfying omp_get_thread_num()==0 makes the call (the 
'master' construct cannot be nested inside a loop). I can then set a 
flag, which is queried by every thread in every loop cycle, causing fast 
termination of the parallel loop. After the loop, I throw an exception. 
Thus, my code is terminated gracefully with minimal effort. I can do 
additional cleanup operations (which usually is not necessary, since I 
use smart pointers), and report details on the interrupt to the user.
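
The pattern described above could look roughly like the following sketch. The function run_interruptible and the helper thread_index are hypothetical names, and poll stands in for Simon's checkInterrupt(), so the code compiles with or without OpenMP and without the R headers. Only thread 0 ever calls poll, every thread reads the shared flag in each cycle, and the exception is thrown only after the parallel loop has ended:

```cpp
#include <atomic>
#include <functional>
#include <stdexcept>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: index of the calling thread, 0 without OpenMP.
static int thread_index() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Runs work(i) for i in [0, n) and returns how many iterations did work.
// poll() is assumed to be the main-thread-only interrupt check.
long run_interruptible(long n,
                       const std::function<bool()>& poll,
                       const std::function<void(long)>& work) {
    std::atomic<bool> abort_requested(false);
    std::atomic<long> done(0);

#ifdef _OPENMP
    #pragma omp parallel for schedule(static)
#endif
    for (long i = 0; i < n; ++i) {
        // An OpenMP loop cannot be left with break, so the remaining
        // cycles are skipped as cheaply as possible instead.
        if (abort_requested.load()) continue;
        if (thread_index() == 0 && poll()) {
            abort_requested.store(true);
            continue;
        }
        work(i);
        ++done;
    }

    // No exception may leave the parallel region, so throw afterwards.
    if (abort_requested.load()) {
        throw std::runtime_error("interrupted by user");
    }
    return done.load();
}
```

With a static schedule, thread 0 stops polling once its own share of the iterations is finished, so if the work per iteration is very uneven a dynamic schedule may notice interrupts more reliably. That is an assumption worth testing for the loop at hand.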

With my limited testing, so far I have not noticed any downsides. Of 
course, there is the obvious drawback of not being supported officially 
(and thus maybe being subject to change), the question of portability, 
and the question of interoperability with other errors.

Moreover, I have found an old thread discussing almost the same topic:
http://tolstoy.newcastle.edu.au/R/e4/devel/08/05/1686.html .
The thread was created in 2008, so the issue is not really a new one. 
The solution proposed there is actually the same as the one suggested by 
Simon, namely using R_ToplevelExec().

An officially supported, portable solution would of course be much 
appreciated!


Best regards,
Peter
#
On Apr 26, 2011, at 7:30 AM, schattenpflanze at arcor.de wrote:

            
Thomas' suggestion was not aimed at your problem - it was sort of the inverse (more at your Qt question). If you want to interrupt R you can mess with those flags and then let R run the event loop. It doesn't work in your (original) case.
Actually, it is in the official API (Rinternals.h) so I don't think that is the issue.
It is portable as well, so I'd say the main concern is what happens when events trigger something that is not related to you and you eat those errors. They will act as user-interrupt to you even if it's not what the user intended. One could argue that it's the lesser of the evils, because if you don't do anything R will just block so those events would have to wait until you're done anyway.
Interesting - I'm glad Luke also suggested a C-level onexit back then - it is something I was thinking about before ...

Cheers,
Simon
#
Hi,
I've been thinking about how to handle c++ threads that were started via
Rcpp calls to some of my c++ libraries from R. My main obstacle is trying to
make sure that users don't try to process files that are being generated by
a thread before the thread finishes. One thing I am considering is having my
threaded code return a class to R that contains a pointer that it remembers.
Then maybe I could just change the value at that pointer when my thread
finishes. Does that seem like a reasonable approach? I'm not completely sure
if this is related to your issue or not, but it might be similar enough to
be worth asking...
Thanks,
Sean
On 4/26/11 9:21 AM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:

            
#
Sean,
On Apr 26, 2011, at 5:06 PM, Sean Robert McGuffee wrote:

            
It depends. For a simple flag it's actually much simpler than that - you can create a boolean vector (make sure you preserve it) and just update its value when it's done - you don't even need an external pointer for that (if you're careful).

But the slight problem with that approach is rather that you don't have a way to tell R about the status change, so essentially you can only poll on the R side. A more proper way to deal with this is to use the event loop signaling to signal in R that the flag has changed. I'm working on a "threads" package that should help with that, but it's not complete yet (you can spawn threads from R and you can actually even synchronize them with R [so if the result is all you want it's there], but semaphores are not implemented yet  --- your inquiry should shift it further up on my todo stack ;)).

Cheers,
Simon
#
Hi Simon,
That makes a lot of sense to me. I'll start reading about R's event loop
signaling. I'm not sure what the best method will be for me to flag the
completeness of a threaded process in my case. In abstract it seems that I
could get R's event loop to look for any type of flag. I think key for me in
this case will be identifying whether a particular file has been completely
produced or not. In principle I could put that type of info into the file
itself, but I think I could also make a temp file somewhere with its full
path and flag info about it. Then the event loop could look for a particular
pattern of temp file names. On the other hand, if I pass in that info when I
start the event loop, that might work too. Regarding the external pointer
idea, I was thinking about passing an object to R as a return value after
launching the thread, and then I might be able to access a pointer inside
that object to reference it from my thread. That could be a binary vector or
any type of object if I can figure out how to get to it from my thread.
Honestly, I don't know much about dynamic referencing of objects from
separate threads, but in principle memory is shared in this case. I'll let
you know if I come up with anything generic... Please keep me posted on your
package. Are any versions of it available yet? It didn't happen to come up
on my list of R packages. I haven't necessarily been maintaining an
up-to-date version of R though. I don't know if that influences the package
list it shows me.
Sean
On 4/26/11 8:51 PM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:

            
#
Sean,
On Apr 27, 2011, at 3:21 PM, Sean Robert McGuffee wrote:

            
Usually, the easiest on unix is to register a file handle as input handler (addInputHandler) - in practice a pipe - one end is owned by the thread and the other is owned by R. Then all you need is to write anything on the thread's end and it will wake up R's event loop and let you handle the read on that end so you can do anything. You could even have multiple threads share this one pipe since you could distinguish by payload which thread is calling. One example of this is the integrated HTTP server in R - see Rhttpd sources (it has also a variant that works on Windows using synchronization via OS event loop).
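
The pipe half of that idea can be sketched on its own, for unix only. WakeupPipe is a hypothetical class, and the registration of read_fd() with R via addInputHandler is deliberately left out here. Each worker writes a one-byte id as payload, and the R side drains whatever has arrived once the descriptor becomes readable:

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <stdexcept>
#include <vector>

// Hypothetical wrapper around a pipe used to wake up the main thread.
class WakeupPipe {
public:
    WakeupPipe() {
        if (pipe(fds_) != 0) throw std::runtime_error("pipe() failed");
        // Make the read end non-blocking so drain() never stalls.
        const int flags = fcntl(fds_[0], F_GETFL, 0);
        if (flags == -1 || fcntl(fds_[0], F_SETFL, flags | O_NONBLOCK) == -1) {
            close(fds_[0]);
            close(fds_[1]);
            throw std::runtime_error("fcntl() failed");
        }
    }
    ~WakeupPipe() {
        close(fds_[0]);
        close(fds_[1]);
    }
    WakeupPipe(const WakeupPipe&) = delete;
    WakeupPipe& operator=(const WakeupPipe&) = delete;

    // The descriptor that would be handed to R as an input handler.
    int read_fd() const { return fds_[0]; }

    // Called on a worker thread: the payload says which worker is calling.
    bool notify(unsigned char worker_id) const {
        return write(fds_[1], &worker_id, 1) == 1;
    }

    // Called on the R side when read_fd() is readable: returns the ids
    // of all workers that have signalled since the last call.
    std::vector<unsigned char> drain() const {
        std::vector<unsigned char> ids;
        unsigned char buf[64];
        ssize_t n;
        while ((n = read(fds_[0], buf, sizeof buf)) > 0) {
            ids.insert(ids.end(), buf, buf + n);
        }
        return ids;
    }

private:
    int fds_[2];
};
```

Single-byte writes to a pipe are atomic on POSIX systems, which is what lets several threads share the one write end without extra locking.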
Yes, it is not released yet since it's not quite complete, but here we go, at your own risk ;):

http://rforge.net/threads

It will work on all platforms, eventually, but currently only unix is supported. The idea is sort of taking the multicore paradigm (parallel + collect) but using threads (threadEval + yield). The documentation is currently non-existent, but I plan to write a vignette for it ... maybe later this week ...

Cheers,
Simon
#
Peter,
On 25/04/11 10:22, schattenpflanze at arcor.de wrote:
Sorry not to have seen this thread sooner.

You may like to give CXXR a try 
(http://www.cs.kent.ac.uk/projects/cxxr/).  In CXXR the R interpreter is 
written in C++, and a user interrupt is handled by throwing a C++ 
exception, so the stack is unwound in an orderly fashion, destructors 
are invoked, etc.

However, it's fair to say that in using CXXR with a multi-threaded 
program you'll be on the bleeding edge...

Andrew
#
Andrew,
Thank you for this suggestion. CXXR is a very interesting project!

For my current project, however, I aim at distributing the program to
other R users on pre-installed cluster nodes. Thus, I have no choice
with respect to the underlying R interpreter.

Best regards,
Peter