Hello,

I am writing an R interface for one of my C++ programs. The computations in C++ are very time consuming (several hours), so the user needs to be able to interrupt them. Currently, the only way I found to do so is calling R_CheckUserInterrupt() frequently. Unfortunately, there are several problems with that:

1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.

2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruptions. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.

Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?

I would appreciate your advice on this topic.

Best regards,
Peter
Interrupting C++ code execution
15 messages · schattenpflanze at arcor.de, Simon Urbanek, Thomas Friedrichsmeier +3 more
On Apr 25, 2011, at 5:22 AM, schattenpflanze at arcor.de wrote:
Hello, I am writing an R interface for one of my C++ programs. The computations in C++ are very time consuming (several hours), so the user needs to be able to interrupt them. Currently, the only way I found to do so is calling R_CheckUserInterrupt() frequently. Unfortunately, there are several problems with that: 1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.
In general, you're responsible for the cleanup. See R-devel archives for discussion on the interactions of C++ and R error handling. Generally, you should not use local objects and you should use on.exit to make sure you clean up.
2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruptions. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.
As you know R is not thread-safe so you cannot call any R API from a thread - including OMP threads - so obviously you can't call R_CheckUserInterrupt(). Since you're using threads the safe way is to perform your computations on a separate thread and let R handle events so that you can abort your computation thread as part of on.exit.
Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?
Checking for interrupts may involve running the OS event loop (to allow the user to interact with R) and thus is not guaranteed to return. There is no general solution - if you're worried only about your own local code, then on Unix, for example, you could use custom signal handlers to set a flag and co-operatively interrupt your program. On Windows there is the UserBreak flag which can be set by a separate thread and thus you may check on it. That said, all this is very much platform-specific. Cheers, Simon
Thank you for your response, Simon.
1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.
In general, you're responsible for the cleanup. See R-devel archives for discussion on the interactions of C++ and R error handling. Generally, you should not use local objects and you should use on.exit to make sure you clean up.
I am using Rcpp (Rcpp-modules, to be precise). This means that I do not actually write any R code. Moreover, the C++ code does not use the R API. My C++ functions are 'exposed' to R via Rcpp, which creates suitable S4 classes. Rcpp does the exception handling. In particular, there is no obvious possibility for me to add an 'on.exit' statement to a particular exposed C++ method.
Generally, you should not use local objects
We are talking about large amounts of code, dozens of nested function calls, and even external libraries. So "not using local objects" is definitely not an option.
2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruptions. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.
As you know R is not thread-safe so you cannot call any R API from a thread - including OMP threads - so obviously you can't call R_CheckUserInterrupt().
That is very interesting. Not being thread safe does not necessarily imply that a function cannot be called from within a thread (as long as it is not done concurrently from several threads). In particular, the main program itself is also a thread, isn't it? Since no cleanup is done, however, it is now clear that calling R_CheckUserInterrupt() _anywhere_ in my program, parallel section or not, is a bad idea.
Since you're using threads the safe way is to perform your computations on a separate thread and let R handle events so that you can abort your computation thread as part of on.exit.
Starting the computations in a separate thread is a nice idea. I could then call R_CheckUserInterrupt() every x milliseconds in the function which dispatches the worker thread. Unfortunately, I see no obvious way of adding an "on.exit" statement to an Rcpp module method. So I would probably have to call an R function from C++ (e.g., using RInside) which contains the on.exit statement, which in turn calls again a C++ function setting a global 'abort' flag and waits for the threads to be terminated. Hmmm. How does on.exit work? Could I mimic that behaviour directly in C++?
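The worker-thread idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: `longComputation`, `runInterruptibly` and the `shouldStop` callback are hypothetical names, and `shouldStop` is assumed to return normally (a function that longjmps, as a plain R_CheckUserInterrupt() call can, would skip the cleanup shown here).

```cpp
#include <atomic>
#include <chrono>
#include <future>

// Hypothetical stand-in for the real computation: it counts up and checks the
// shared abort flag on every step so that it can stop early.
long long longComputation(const std::atomic<bool>& abortFlag, long long steps) {
    long long done = 0;
    while (done < steps && !abortFlag.load(std::memory_order_relaxed)) {
        ++done;
    }
    return done;
}

// Runs the computation on a worker thread. The calling thread wakes up every
// `pollEvery` and asks `shouldStop` (assumption: returns normally) whether the
// worker should be aborted. Returns the number of completed steps.
long long runInterruptibly(long long steps, bool (*shouldStop)(),
                           std::chrono::milliseconds pollEvery) {
    std::atomic<bool> abortFlag(false);
    std::future<long long> result = std::async(
        std::launch::async,
        [&abortFlag, steps] { return longComputation(abortFlag, steps); });

    // Poll until the worker has finished, either normally or after an abort.
    while (result.wait_for(pollEvery) != std::future_status::ready) {
        if (shouldStop()) {
            abortFlag.store(true, std::memory_order_relaxed);
        }
    }
    return result.get();  // the worker has finished at this point
}
```

The worker never touches the R API; only the calling thread would, through whatever `shouldStop` wraps.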
Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?
Checking for interrupts may involve running the OS event loop (to allow the user to interact with R) and thus is not guaranteed to return.
I see.
There is no general solution - if you're worried only about your own local code, then on Unix, for example, you could use custom signal handlers to set a flag and co-operatively interrupt your program. On Windows there is the UserBreak flag which can be set by a separate thread and thus you may check on it. That said, all this is very much platform-specific.
Being able to set a flag is all I need and would be the perfect solution imho. However, I do not yet see how I could achieve that. How can I write a signal handler within C++ code which does not create a GUI and has no dedicated event dispatching thread? Would it be possible to use, e.g., a Qt keyboard event handler within the C++ code? Would a keyboard event be visible to such an event handler? Is it not intercepted by R / the terminal window / the OS? Does any existing R package contain signal handlers? Best regards, Peter
On Apr 25, 2011, at 11:09 AM, schattenpflanze at arcor.de wrote:
Thank you for your response, Simon.
1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.
In general, you're responsible for the cleanup. See R-devel archives for discussion on the interactions of C++ and R error handling. Generally, you should not use local objects and you should use on.exit to make sure you clean up.
I am using Rcpp (Rcpp-modules, to be precise). This means that I do not actually write any R code. Moreover, the C++ code does not use the R API. My C++ functions are 'exposed' to R via Rcpp, which creates suitable S4 classes. Rcpp does the exception handling. In particular, there is no obvious possibility for me to add an 'on.exit' statement to a particular exposed C++ method.
Generally, you should not use local objects
We are talking about large amounts of code, dozens of nested function calls, and even external libraries. So "not using local objects" is definitely not an option.
But that would imply that the library calls R! Note that we're talking about the stack at the point of the R API call, so you can do what you want until you call the R API. At the moment you touch the R API you should have no local C++ objects on the stack (all the way down) - that's what I meant.
2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruptions. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.
As you know R is not thread-safe so you cannot call any R API from a thread - including OMP threads - so obviously you can't call R_CheckUserInterrupt().
That is very interesting. Not being thread safe does not necessarily imply that a function cannot be called from within a thread (as long as it is not done concurrently from several threads). In particular, the main program itself is also a thread, isn't it?
Yes, but each thread has a separate stack, and you can only enter R with the same stack you left (because the stack will be restored to the state of the calling context).
Since no cleanup is done, however, it is now clear that calling R_CheckUserInterrupt() _anywhere_ in my program, parallel section or not, is a bad idea.
Since you're using threads the safe way is to perform your computations on a separate thread and let R handle events so that you can abort your computation thread as part of on.exit.
Starting the computations in a separate thread is a nice idea. I could then call R_CheckUserInterrupt() every x milliseconds in the function which dispatches the worker thread. Unfortunately, I see no obvious way of adding an "on.exit" statement to an Rcpp module method. So I would probably have to call an R function from C++ (e.g., using RInside) which contains the on.exit statement, which in turn calls again a C++ function setting a global 'abort' flag and waits for the threads to be terminated. Hmmm. How does on.exit work?
It sets the conexit object of the current context structure to the closure to be evaluated when the context is left. endcontext() then simply evaluates that closure when the context is left.
Could I mimic that behaviour directly in C++?
Unfortunately there is no C-level onexit hook and the internal structure of RCNTXT is not revealed to packages. So AFAICS the closest you can get is to use eval to call on.exit(). However, I think it would be useful to have a provision for creating a context with a C-level hook - the question is whether the others feel that this is going to too low a level ...
Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?
Checking for interrupts may involve running the OS event loop (to allow the user to interact with R) and thus is not guaranteed to return.
I see.
There is no general solution - if you're worried only about your own local code, then on Unix, for example, you could use custom signal handlers to set a flag and co-operatively interrupt your program. On Windows there is the UserBreak flag which can be set by a separate thread and thus you may check on it. That said, all this is very much platform-specific.
Being able to set a flag is all I need and would be the perfect solution imho. However, I do not yet see how I could achieve that.
It is GUI-specific, unfortunately. AFAIR the Windows GUI does that because it's running on a separate thread. I think the X11-based GUIs use fds so they are synchronous, and on OS X the GUI runs the OS loop inside the R event loop - so, again, synchronous.
How can I write a signal handler within C++ code which does not create a GUI and has no dedicated event dispatching thread?
That's simple: just use signal() to register your handler.
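A minimal sketch of what registering such a handler could look like, assuming a Unix-like platform and a terminal session. The names `interruptRequested`, `flagInterrupt` and `ScopedSigintFlag` are hypothetical and not part of R or Rcpp. Replacing the SIGINT handler also means R itself does not see the interrupt while the guard is alive, which is one reason to treat this as an experiment rather than something to ship in a package.

```cpp
#include <csignal>

// Flag written by the signal handler; sig_atomic_t is the only object type the
// standard guarantees can be safely written from a handler.
static volatile std::sig_atomic_t interruptRequested = 0;

// The handler does nothing but set the flag.
extern "C" void flagInterrupt(int /*signum*/) {
    interruptRequested = 1;
}

// Hypothetical RAII guard: installs the handler for the duration of a
// computation and restores the previous handler (e.g. R's own) afterwards.
class ScopedSigintFlag {
public:
    ScopedSigintFlag() : previous_(std::signal(SIGINT, flagInterrupt)) {
        interruptRequested = 0;
    }
    ~ScopedSigintFlag() {
        if (previous_ != SIG_ERR) {
            std::signal(SIGINT, previous_);
        }
    }
    ScopedSigintFlag(const ScopedSigintFlag&) = delete;
    ScopedSigintFlag& operator=(const ScopedSigintFlag&) = delete;

    // Cheap enough to be polled inside the computation loop.
    bool interrupted() const { return interruptRequested != 0; }

private:
    void (*previous_)(int);
};
```

The long-running code would create one guard at its entry point and poll `interrupted()` at convenient places, returning or throwing a C++ exception once it becomes true.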
Would it be possible to use, e.g., a Qt keyboard event handler within the C++ code? Would a keyboard event be visible to such an event handler? Is it not intercepted by R / the terminal window / the OS?
Meshing R's loop, GUI loop and your own code will be a nightmare. For example, one problem is that if you are running the GUI loop and it triggers an event that R would otherwise handle (e.g. resizing plot window) you're in trouble since you can't let R do anything...
Does any existing R package contain signal handlers?
I'm not sure - I would definitely not recommend that to be used in packages since it's platform-dependent and changes the semantics of signals defined by R. But you can play with it ;). Cheers, Simon
Actually, it just came to me that there is a hack you could use. The problem with it is that it will eat all errors, even if they were not yours (e.g. those resulting from events triggered by the event loop), so I would not recommend it for general use. But here we go:
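#include <Rinternals.h>   // R_ToplevelExec
#include <R_ext/Utils.h>  // R_CheckUserInterrupt

// may longjmp if an interrupt is pending, so it is only ever run via R_ToplevelExec
static void chkIntFn(void *dummy) {
    R_CheckUserInterrupt();
}

// this will call the above in a top-level context so it won't longjmp-out of your context
bool checkInterrupt() {
    return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
}

// your code somewhere ...
if (checkInterrupt()) {
    // user interrupted ...
}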
You must call it on the main thread and you should be prepared that it may take some time and may interact with the OS...
Cheers,
Simon
Dear Simon, thanks again for your explanations. Your previous e-mail clarified several points for me.
Actually, it just came to me that there is a hack you could use. [...]
That actually looks quite nice. At least when compared to my currently only alternative of not interrupting at all. I will test it, in particular with respect to computational speed. Perhaps I can at least call it once per second. Best regards, Peter
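Limiting the check to roughly once per second could be sketched as below. `ThrottledCheck` is a hypothetical helper, and the check is passed in as a plain function pointer so that the sketch does not depend on R; in practice it would wrap something like the checkInterrupt() function from Simon's message.

```cpp
#include <chrono>

// Hypothetical helper: calls the (possibly expensive) `check` function at most
// once per `interval` and remembers a positive result.
class ThrottledCheck {
public:
    using Clock = std::chrono::steady_clock;

    ThrottledCheck(bool (*check)(), Clock::duration interval)
        : check_(check), interval_(interval), last_(Clock::now()) {}

    // Safe to call in every loop iteration; most calls only read the clock.
    bool poll() {
        if (triggered_) {
            return true;
        }
        const Clock::time_point now = Clock::now();
        if (now - last_ < interval_) {
            return false;  // too early, skip the real check
        }
        last_ = now;
        triggered_ = check_();
        return triggered_;
    }

private:
    bool (*check_)();
    Clock::duration interval_;
    Clock::time_point last_;
    bool triggered_ = false;
};
```

If reading the clock is itself too costly for a tight inner loop, the same idea works with an iteration counter in place of the clock.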
On Monday 25 April 2011, Simon Urbanek wrote:
Actually, it just came to me that there is a hack you could use. The problem with it is that it will eat all errors, even if they were not yours (e.g. those resulting from events triggered the event loop), so I would not recommend it for general use.
Here's another option which is probably not recommendable for general use, since it is not part of the documented API: On Windows you can look at the variable "UserBreak", available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case. BTW, being able to check for a pending interrupt or to schedule an interrupt from a separate thread is something that can come in handy in GUI development as well, and personally, I would appreciate it if there were some slightly more official support for this. Regards, Thomas
I have tested the solutions suggested by Simon and Thomas on a Linux machine. These are my findings:
On Windows you can look at the variable "UserBreak", available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case.
I did not manage to get this to work. Neither R_interrupts_pending nor R_interrupts_suspended seems to change when I press ctrl+c. Perhaps this is due to the fact that I run R in a terminal without any graphical interface?
static void chkIntFn(void *dummy) {
    R_CheckUserInterrupt();
}

// this will call the above in a top-level context so it won't longjmp-out of your context
bool checkInterrupt() {
    return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
}

// your code somewhere ...
if (checkInterrupt()) {
    // user interrupted ...
}
This solution works perfectly! It takes slightly longer to call this function than the plain R_CheckUserInterrupt() call, but in any reasonable scenario, the additional time is absolutely insignificant.

Inside OpenMP parallel for constructs, one has to make sure that only the thread satisfying omp_get_thread_num()==0 makes the call (the 'master' construct cannot be nested inside a loop). I can then set a flag, which is queried by every thread in every loop cycle, causing fast termination of the parallel loop. After the loop, I throw an exception. Thus, my code is terminated gracefully with minimal effort. I can do additional cleanup operations (which usually is not necessary, since I use smart pointers), and report details on the interrupt to the user.

With my limited testing, so far I have not noticed any downsides. Of course, there is the obvious drawback of not being supported officially (and thus maybe being subject to change), the question of portability, and the question of interoperability with other errors.

Moreover, I have found an old thread discussing almost the same topic: http://tolstoy.newcastle.edu.au/R/e4/devel/08/05/1686.html . The thread was created in 2008, so the issue is not really a new one. The solution proposed there is actually the same as the one suggested by Simon, namely using R_ToplevelExec().

An officially supported, portable solution would of course be much appreciated!

Best regards,
Peter
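The OpenMP pattern described above could look roughly like the following sketch. It is not Peter's actual code: `processAll`, `InterruptedError` and the `check` callback are hypothetical, `check` stands in for a function like checkInterrupt() and is only ever called on thread 0, and the doubling of each element is a placeholder for the real work. Using std::atomic inside an OpenMP region is not covered by the OpenMP specification, although it is a common practice.

```cpp
#include <atomic>
#include <stdexcept>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical exception type used to report the interrupt to the caller.
struct InterruptedError : std::runtime_error {
    InterruptedError() : std::runtime_error("computation interrupted by user") {}
};

// Processes all elements in parallel and returns the number processed.
// Throws InterruptedError after the loop if `check` reported an interrupt.
long processAll(std::vector<double>& data, bool (*check)()) {
    std::atomic<bool> abortRequested(false);
    long processed = 0;
    const long n = static_cast<long>(data.size());

    #pragma omp parallel for reduction(+:processed)
    for (long i = 0; i < n; ++i) {
        // Every thread queries the flag in every cycle; once it is set, the
        // remaining iterations are skipped ('break' is not allowed here).
        if (abortRequested.load(std::memory_order_relaxed)) {
            continue;
        }
#ifdef _OPENMP
        const bool isThreadZero = (omp_get_thread_num() == 0);
#else
        const bool isThreadZero = true;  // compiled without OpenMP: one thread
#endif
        // Only thread 0 is allowed to run the interrupt check.
        if (isThreadZero && check()) {
            abortRequested.store(true, std::memory_order_relaxed);
            continue;
        }
        data[i] *= 2.0;  // placeholder for the real work
        ++processed;
    }

    // The parallel region has ended, so throwing is safe here.
    if (abortRequested.load()) {
        throw InterruptedError();
    }
    return processed;
}
```

One caveat of this sketch: with a static schedule, thread 0 may finish its share of the iterations before the other threads do, after which no further checks happen until the loop ends.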
On Apr 26, 2011, at 7:30 AM, schattenpflanze at arcor.de wrote:
I have tested the solutions suggested by Simon and Thomas on a Linux machine. These are my findings:
On Windows you can look at the variable "UserBreak", available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case.
I did not manage to get this to work. Neither R_interrupts_pending nor R_interrupts_suspended seems to change when I press ctrl+c. Perhaps this is due to the fact that I run R in a terminal without any graphical interface?
Thomas' suggestion was not aimed at your problem - it was sort of the inverse (more at your Qt question). If you want to interrupt R you can mess with those flags and them let R run the event loop. It doesn't work in your (original) case.
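    #include <Rinternals.h>    // R_ToplevelExec
    #include <R_ext/Utils.h>   // R_CheckUserInterrupt

    // may longjmp on interrupt, so it must only run inside R_ToplevelExec
    static void chkIntFn(void *dummy) {
        R_CheckUserInterrupt();
    }

    // this will call the above in a top-level context so it won't longjmp-out of your context
    bool checkInterrupt() {
        return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
    }

    // your code somewhere ...
    if (checkInterrupt()) {
        // user interrupted ...
    }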
This solution works perfectly! It takes slightly longer to call this function than the plain R_CheckUserInterrupt() call, but in any reasonable scenario, the additional time is absolutely insignificant. Inside OpenMP parallel for constructs, one has to make sure that only the thread satisfying omp_get_thread_num()==0 makes the call (the 'master' construct cannot be nested inside a loop). I can then set a flag, which is queried by every thread in every loop cycle, causing fast termination of the parallel loop. After the loop, I throw an exception. Thus, my code is terminated gracefully with minimal effort. I can do additional cleanup operations (which usually are not necessary, since I use smart pointers), and report details on the interrupt to the user. With my limited testing, so far I have not noticed any downsides. Of course, there is the obvious drawback of not being supported officially (and thus maybe being subject to change),
Actually, it is in the official API (Rinternals.h) so I don't think that is the issue.
the question of portability, and the question of interoperability with other errors.
It is portable as well, so I'd say the main concern is what happens when events trigger something that is not related to you and you eat those errors. They will act as user-interrupt to you even if it's not what the user intended. One could argue that it's the lesser of the evils, because if you don't do anything R will just block so those events would have to wait until you're done anyway.
Moreover, I have found an old thread discussing almost the same topic: http://tolstoy.newcastle.edu.au/R/e4/devel/08/05/1686.html . The thread was created in 2008, so the issue is not really a new one. The solution proposed there is actually the same as the one suggested by Simon, namely using R_ToplevelExec().
Interesting - I'm glad Luke also suggested C-level onexit back then - it is something I was thinking about before ... Cheers, Simon
An officially supported, portable solution would of course be much appreciated! Best regards, Peter
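A minimal sketch of the OpenMP pattern Peter describes: only thread 0 polls, a shared flag tells the other threads to stop doing work, and the exception is thrown after the parallel loop has ended. To keep it runnable without R, checkInterrupt() here is a hypothetical stand-in that fires after a fixed number of polls; in a package it would be the R_ToplevelExec() wrapper quoted above. The names pollsUntilInterrupt, currentThread and runInterruptibly are illustrative and not from the thread.

```cpp
#include <stdexcept>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for the R_ToplevelExec()-based checkInterrupt():
// reports an interrupt once the poll budget is used up. Only thread 0
// ever calls it, so the plain counter is not accessed concurrently.
static long pollsUntilInterrupt = 50;
static bool checkInterrupt() {
    return --pollsUntilInterrupt <= 0;
}

// Thread id inside a parallel region; 0 when built without OpenMP.
static int currentThread() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Runs n iterations and returns how many did real work.
// Throws after the loop if an interrupt was seen.
long runInterruptibly(long n) {
    int interrupted = 0;  // shared flag, written by thread 0 only
    long done = 0;
    #pragma omp parallel for reduction(+:done)
    for (long i = 0; i < n; ++i) {
        int stop;
        #pragma omp atomic read
        stop = interrupted;
        if (stop) continue;  // an OpenMP loop cannot break, so skip the work
        if (currentThread() == 0 && checkInterrupt()) {
            #pragma omp atomic write
            interrupted = 1;
            continue;
        }
        ++done;  // placeholder for the real per-iteration work
    }
    // Safe to throw here: we are outside the parallel region.
    if (interrupted)
        throw std::runtime_error("computation interrupted by user");
    return done;
}
```

Since each real check goes through R_ToplevelExec(), it may be worth polling only every few hundred iterations rather than on every cycle of thread 0; the flag read by the other threads stays cheap either way.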
Hi, I've been thinking about how to handle c++ threads that were started via Rcpp calls to some of my c++ libraries from R. My main obstacle is trying to make sure that users don't try to process files that are being generated by a thread before the thread finishes. One thing I am considering is having my threaded code return a class to R that contains a pointer that it remembers. Then maybe I could just change the value at that pointer when my thread finishes. Does that seem like a reasonable approach? I'm not completely sure if this is related to your issue or not, but it might be similar enough to be worth asking... Thanks, Sean
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Sean,
On Apr 26, 2011, at 5:06 PM, Sean Robert McGuffee wrote:
I've been thinking about how to handle c++ threads that were started via Rcpp calls to some of my c++ libraries from R. My main obstacle is trying to make sure that users don't try to process files that are being generated by a thread before the thread finishes. One thing I am considering is having my threaded code return a class to R that contains a pointer that it remembers. Then maybe I could just change the value at that pointer when my thread finishes. Does that seem like a reasonable approach? I'm not completely sure if this is related to your issue or not, but it might be similar enough to be worth asking...
It depends. For a simple flag it's actually much simpler than that - you can create a boolean vector (make sure you preserve it) and just update its value when it's done - you don't even need an external pointer for that (if you're careful). But the slight problem with that approach is rather that you don't have a way to tell R about the status change, so essentially you can only poll on the R side. A more proper way to deal with this is to use the event loop signaling to signal in R that the flag has changed. I'm working on a "threads" package that should help with that, but it's not complete yet (you can spawn threads from R and you can actually even synchronize them with R [so if the result is all you want it's there], but semaphores are not implemented yet --- your inquiry should shift it further up on my todo stack ;)). Cheers, Simon
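The polling variant Simon mentions can be sketched in plain C++, assuming a std::atomic<bool> as a stand-in for the preserved logical vector (the R side is left out entirely, and Job, startJob, isFinished and finishJob are hypothetical names). The worker flips the flag once its output is complete; the other side can only ask whether it is done yet, which is exactly the limitation Simon points out.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Stand-in for the preserved flag plus the thread that will set it.
struct Job {
    std::atomic<bool> done{false};
    std::thread worker;
};

// Starts a worker that pretends to produce a file for workMillis
// milliseconds and then marks the job as finished.
void startJob(Job &job, int workMillis) {
    job.worker = std::thread([&job, workMillis] {
        std::this_thread::sleep_for(std::chrono::milliseconds(workMillis));
        job.done.store(true, std::memory_order_release);
    });
}

// Non-blocking poll: true once the worker has set the flag.
bool isFinished(const Job &job) {
    return job.done.load(std::memory_order_acquire);
}

// Waits for the worker to end; must be called before the Job is destroyed.
void finishJob(Job &job) {
    if (job.worker.joinable()) job.worker.join();
}
```

A caller would run startJob(job, 20), poll isFinished(job) whenever convenient, and call finishJob(job) before the Job goes out of scope.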
Hi Simon, That makes a lot of sense to me. I'll start reading about R's event loop signaling. I'm not sure what the best method will be for me to flag the completeness of a threaded process in my case. In the abstract it seems that I could get R's event loop to look for any type of flag. I think the key for me in this case will be identifying whether a particular file has been completely produced or not. In principle I could put that type of info into the file itself, but I think I could also make a temp file somewhere with its full path and flag info about it. Then the event loop could look for a particular pattern of temp file names. On the other hand, if I pass in that info when I start the event loop, that might work too. Regarding the external pointer idea, I was thinking about passing an object to R as a return value after launching the thread, and then I might be able to access a pointer inside that object to reference it from my thread. That could be a binary vector or any type of object if I can figure out how to get to it from my thread. Honestly, I don't know much about dynamic referencing of objects from separate threads, but in principle memory is shared in this case. I'll let you know if I come up with anything generic... Please keep me posted on your package. Are any versions of it available yet? It didn't happen to come up on my list of R packages. I haven't necessarily been maintaining an up-to-date version of R though. I don't know if that influences the package list it shows me. Sean
Sean,
On Apr 27, 2011, at 3:21 PM, Sean Robert McGuffee wrote:
Hi Simon, That makes a lot of sense to me. I'll start reading about R's event loop signaling. I'm not sure what the best method will be for me to flag the completeness of a threaded process in my case. In the abstract it seems that I could get R's event loop to look for any type of flag. I think the key for me in this case will be identifying whether a particular file has been completely produced or not. In principle I could put that type of info into the file itself, but I think I could also make a temp file somewhere with its full path and flag info about it. Then the event loop could look for a particular pattern of temp file names. On the other hand, if I pass in that info when I start the event loop, that might work too.
Usually, the easiest way on unix is to register a file handle as input handler (addInputHandler) - in practice a pipe - one end is owned by the thread and the other is owned by R. Then all you need is to write anything on the thread's end and it will wake up R's event loop and let you handle the read on that end so you can do anything. You could even have multiple threads share this one pipe since you could distinguish by payload which thread is calling. One example of this is the integrated HTTP server in R - see Rhttpd sources (it also has a variant that works on Windows using synchronization via OS event loop).
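The pipe mechanism can be sketched without R, assuming a POSIX system: poll() plays the role of the event loop watching the read end, and the worker wakes it by writing a one-byte payload that identifies the sender. Registering the read end with addInputHandler is deliberately left out, and the function names waitForNotification and runWorkerAndWait are hypothetical.

```cpp
#include <poll.h>
#include <unistd.h>
#include <thread>

// Waits up to timeoutMs for a one-byte message on readFd, much as an event
// loop would watch a registered descriptor. Returns the payload, or -1 on
// timeout or error.
int waitForNotification(int readFd, int timeoutMs) {
    struct pollfd pfd;
    pfd.fd = readFd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, timeoutMs) <= 0) return -1;
    unsigned char id = 0;
    return read(readFd, &id, 1) == 1 ? static_cast<int>(id) : -1;
}

// Starts a worker that announces completion by writing its id to the pipe,
// then waits for that announcement on the other end.
int runWorkerAndWait(unsigned char workerId) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    int writeFd = fds[1];
    std::thread worker([writeFd, workerId] {
        // ... the long-running work would go here ...
        ssize_t written = write(writeFd, &workerId, 1);
        (void)written;  // a failed write shows up as a timeout on the reader
    });
    int payload = waitForNotification(fds[0], 5000);
    worker.join();
    close(fds[0]);
    close(fds[1]);
    return payload;
}
```

Because the payload carries the worker's id, several threads could share the write end of one pipe, as Simon suggests; single-byte writes to a pipe are not interleaved.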
Regarding the external pointer idea, I was thinking about passing an object to R as a return value after launching the thread, and then I might be able to access a pointer inside that object to reference it from my thread. That could be a binary vector or any type of object if I can figure out how to get to it from my thread. Honestly, I don't know much about dynamic referencing of objects from separate threads, but in principle memory is shared in this case. I'll let you know if I come up with anything generic... Please keep me posted on your package. Are any versions of it available yet?
Yes, it is not released yet since it's not quite complete, but here we go, at your own risk ;): http://rforge.net/threads It will work on all platforms, eventually, but currently only unix is supported. The idea is sort of taking the multicore paradigm (parallel + collect) but using threads (threadEval + yield). The documentation is currently non-existent, but I plan to write a vignette for it ... maybe later this week ... Cheers, Simon
Peter,
On 25/04/11 10:22, schattenpflanze at arcor.de wrote:
1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.
Sorry not to have seen this thread sooner. You may like to give CXXR a try (http://www.cs.kent.ac.uk/projects/cxxr/). In CXXR the R interpreter is written in C++, and a user interrupt is handled by throwing a C++ exception, so the stack is unwound in an orderly fashion, destructors are invoked, etc. However, it's fair to say that in using CXXR with a multi-threaded program you'll be on the bleeding edge... Andrew
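The difference Andrew describes comes down to ordinary C++ stack unwinding. The sketch below is generic C++ and not CXXR code: InterruptRequested is a hypothetical exception type, and the counter only exists to make the destructor call observable. When the exception propagates out of compute(), the local Buffer is destroyed, which is what a longjmp out of the same frame would not guarantee.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

static int destructorsRun = 0;  // counts Buffer destructions for the demo

// Owns heap memory through std::vector, like the STL containers in question.
struct Buffer {
    std::vector<double> data;
    explicit Buffer(std::size_t n) : data(n) {}
    ~Buffer() { ++destructorsRun; }
};

// Hypothetical exception standing in for an interrupt signalled by throwing.
struct InterruptRequested : std::runtime_error {
    InterruptRequested() : std::runtime_error("interrupted") {}
};

// Allocates scratch space and optionally simulates an interrupt.
void compute(bool interrupt) {
    Buffer scratch(1024);
    if (interrupt) throw InterruptRequested();  // scratch is destroyed during unwinding
}

// Returns true if compute() finished, false if it was interrupted.
bool runGuarded(bool interrupt) {
    try {
        compute(interrupt);
        return true;
    } catch (const InterruptRequested &) {
        return false;
    }
}
```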
Andrew,
You may like to give CXXR a try (http://www.cs.kent.ac.uk/projects/cxxr/). In CXXR the R interpreter is written in C++, and a user interrupt is handled by throwing a C++ exception, so the stack is unwound in an orderly fashion, destructors are invoked, etc.
Thank you for this suggestion. CXXR is a very interesting project! For my current project, however, I aim at distributing the program to other R users on pre-installed cluster nodes. Thus, I have no choice with respect to the underlying R interpreter. Best regards, Peter