concurrent requests (Rook, but I think the question is more general)

13 messages · Richard Morey, Dan Tenenbaum, Simon Urbanek

#
This question involves Rook, but I think the answer will be general 
enough that it pays to post here. At any rate, I don't know enough to 
know whether this is a Rook only issue or a general R issue.

Here's what I'd like to do (and indeed, have code that should do this):

1. Start R, Rook
2. Start an analysis via an HTTP request to Rook. This analysis uses 
.Call() to some compiled C code, if that matters. The C code calls a 
callback function to update a variable with its progress.
3. While the analysis is happening, use Rook to obtain current status 
with an HTTP request

The problem is that once the analysis starts, Rook does not respond to 
requests. All of the status requests to Rook pile up, and then are 
answered when the analysis (step 2) is done. Here is some example code 
to demonstrate the issue:

##########

library(Rook)
s <- Rhttpd$new()
s$add(
   name="pingpong",
   app=Rook::URLMap$new(
     '/ping' = function(env){
       req <- Rook::Request$new(env)
       res <- Rook::Response$new()
       res$write('This is ping.')
       Sys.sleep(20)
       res$finish()
     },
     '/pong' = function(env){
       req <- Rook::Request$new(env)
       res <- Rook::Response$new()
       res$write("This is pong.")
       res$finish()
     },
     '/?' = function(env){
       req <- Rook::Request$new(env)
       res <- Rook::Response$new()
       res$redirect(req$to_url('/pong'))
       res$finish()
     }
   )
)

s$start(quiet=TRUE)
s$browse('pingpong')

#############################

If you request /ping, R runs Sys.sleep() for 20 seconds. This is where 
my .Call() statement would be. While the .Call() (Sys.sleep()) function 
is doing its thing, I need to get Rook to respond on /pong (which would 
simply respond with the progress), but if you run this code, request 
/ping, then immediately request /pong, you'll see that the /pong request 
will not be answered until the Sys.sleep() is done.

Of course, for a progress report to be useful, the requests have to be 
answered immediately. Is this a Rook issue, or an R issue? Or am I 
asking something unreasonable?
#
On Wed, Oct 24, 2012 at 11:13 AM, Richard D. Morey <r.d.morey at rug.nl> wrote:
One answer would be to start an Rserve instance on your local machine.
When your web app initiates processing, it actually starts the
long-running task on the server with RS.eval(wait=FALSE). See ?RCC
with the RSclient package loaded.
Then when you check for task completion, call RS.collect() with a
short timeout, and if it has something for you it will give it to you.

That doesn't give you a numeric progress report, but perhaps if your
long-running task writes its status somewhere (to a file?) the
progress-checking task could look there as well.

Dan
#
On Oct 24, 2012, at 2:13 PM, Richard D. Morey wrote:

You can't. R doesn't support threading so it's simply not possible to have an asynchronous eval. The R HTTP server works by simply enqueuing an eval to run while R is idle, it can't do that if R is busy. (Note that the HTTP server was *only* designed for the internal help).

What you can do is have your C code start another thread that reports the progress when asked e.g. on a socket, but that thread is not allowed to call any R API so you want that progress to be entirely in your C code.
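
A minimal sketch of the thread-based approach described above, assuming POSIX threads and C11 atomics are available. All names here (progress_set, monitor_start, monitor_ask, monitor_stop) are hypothetical, and monitor_ask() is only a stand-in for a real status request arriving on a socket. The monitor thread touches nothing but plain C state, so it never calls the R API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Progress in percent: written by the computation, read by the monitor. */
static atomic_int progress_pct;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static pthread_t monitor_tid;
static int pending  = 0;   /* a status request is waiting for an answer */
static int stopping = 0;   /* the monitor thread has been asked to exit */
static int answer   = -1;  /* last value handed back to a requester */

/* Called from the long-running computation (hypothetical helper). */
void progress_set(int pct)
{
    assert(pct >= 0 && pct <= 100);
    atomic_store(&progress_pct, pct);
}

/* Monitor thread: sleeps until asked, then answers. No R API calls here. */
static void *monitor_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!pending && !stopping)
            pthread_cond_wait(&wake, &lock);
        if (!pending)
            break;                      /* woken only because of stopping */
        answer = atomic_load(&progress_pct);
        pending = 0;
        pthread_cond_broadcast(&wake);  /* wake the requester */
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start the monitor thread; returns 0 on success, as pthread_create does. */
int monitor_start(void)
{
    pending = 0;
    stopping = 0;
    return pthread_create(&monitor_tid, NULL, monitor_main, NULL);
}

/* Stand-in for a socket request: ask the monitor and wait for its reply. */
int monitor_ask(void)
{
    int value;
    pthread_mutex_lock(&lock);
    pending = 1;
    pthread_cond_broadcast(&wake);
    while (pending)
        pthread_cond_wait(&wake, &lock);
    value = answer;
    pthread_mutex_unlock(&lock);
    return value;
}

/* Ask the monitor thread to exit and wait for it. */
void monitor_stop(void)
{
    pthread_mutex_lock(&lock);
    stopping = 1;
    pthread_cond_broadcast(&wake);
    pthread_mutex_unlock(&lock);
    pthread_join(monitor_tid, NULL);
}
```

In a real package the computation would call monitor_start() before its main loop, progress_set() inside it, and monitor_stop() during cleanup, including on the early-termination path; the socket handling is left out of this sketch.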

Note that if your C code is robust enough, it can call R_CheckUserInterrupt() to allow external events to happen, but a) your C code must in that case be prepared for early termination (clean up memory etc.) and b) I don't remember if the httpd is allowed to run during interrupt check on all platforms - you may want to check that first.

Cheers,
Simon
#
On 24/10/12 8:53 PM, Simon Urbanek wrote:
How can I start a new thread? By running R again from the command line, 
or is there a better way?
I do use R_CheckUserInterrupt(), but if I understand what you're saying 
then given that it doesn't work, httpd must not run during the interrupt 
check. At least, on OSX, which is what I'm testing on.
#
On Oct 24, 2012, at 3:09 PM, Richard D. Morey wrote:

No, you have to use the system thread API like pthreads, NSThread etc. If you have to ask about this, you probably don't want to go there ;) - threads can be quite dangerous if you are not familiar with them.

Another poor man's solution is to simply have your C code write out a file with the progress. Along the same lines you could use a shared object to store the progress (e.g. via bigmemory) ...
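
A minimal sketch of the progress-file idea, under the assumption that the file holds two integers (steps done and total steps). The function names and the file layout are hypothetical. The writer puts the new value in a temporary file and renames it over the old one, so a reader in another process should not see a half-written file on POSIX systems; on Windows, rename() fails if the target already exists, so this would need a different replacement step there.

```c
#include <assert.h>
#include <stdio.h>

/* Write "done total" to path via a temporary file. Returns 0 on success. */
int progress_write(const char *path, int done, int total)
{
    char tmp[4096];
    FILE *f;
    int n, ok;

    assert(path != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                       /* path too long for the buffer */

    f = fopen(tmp, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d %d\n", done, total) > 0;
    if (fclose(f) != 0)
        ok = 0;
    if (!ok) {
        remove(tmp);
        return -1;
    }
    /* Replace the previous progress file with the new one. */
    return rename(tmp, path) == 0 ? 0 : -1;
}

/* Read "done total" back from path. Returns 0 on success. */
int progress_read(const char *path, int *done, int *total)
{
    FILE *f;
    int n;

    assert(path != NULL && done != NULL && total != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fscanf(f, "%d %d", done, total);
    fclose(f);
    return n == 2 ? 0 : -1;
}
```

The computation would call progress_write() every so often from its main loop, and whatever process serves the status page only needs to read the same file.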
Yes, if your code calls R_CheckUserInterrupt() and httpd doesn't respond at that point then it may not be allowed to run. On OS X you can try your luck with R_ProcessEvents() as well.

Cheers,
Simon
#
Richard D. Morey
Assistant Professor
Psychometrics and Statistics
Rijksuniversiteit Groningen / University of Groningen
http://drsmorey.org/research/rdmorey
On 24/10/12 9:23 PM, Simon Urbanek wrote:
I'd be fine with the poor man's solution (maybe with tempfile()?) if I 
can get access to the local file via javascript. But I don't think I 
can, due to the security limitations of the browser. I may have to 
rethink this significantly.
#
On Oct 24, 2012, at 3:47 PM, Richard D. Morey wrote:

That should be no problem, you can have another R instance serving the monitoring page (or you can use Rserve's HTTP server and have just one instance with arbitrarily many connections as needed).

Cheers,
Simon
#
On 24/10/12 10:07 PM, Simon Urbanek wrote:
OK, I'm looking at Rserve. I get the impression that one needs to start 
a separate server, so it would be difficult to make this transparent to 
a user who installs my package and just wants to do an analysis with a 
GUI. It also appears that there is a separate binary install, at least 
on Windows, which would mean anyone using my package would need to 
install and run something separate. Is that accurate?
#
On Oct 24, 2012, at 4:35 PM, Richard D. Morey wrote:

Not quite - it is all in the Rserve package. Many years ago we used to supply Rserve.exe just because it did not require installation of any packages, but we don't do that anymore. Typically Java GUIs use Rserve to run R - it's trivial to start it (e.g., there is the StartRserve.java example), so your GUI can start it and shut it down. I don't know how you are driving your GUI, so this may or may not be a good way - on Unix you get the benefit of a parallel HTTP server, that's really all I was pointing out. BTW: if you are talking web GUI, then Rserve is great for that because the GUI consists simply of R scripts (see FastRWeb) - but I may be thinking bigger than what you have in mind.

The point is that you need a separate monitoring process or threads. That process can be R, Rserve or anything else.

Cheers,
Simon
#
On 24/10/12 10:55 PM, Simon Urbanek wrote:
Thanks for the tips. This is what I'm currently contemplating:

1. Main interface starts in user's R session, and opens up the interface 
(HTML/Javascript using Rook package)
2. When analysis starts, Rserve is started, with its own web server, 
using Rook, for status updates
3. During analysis, main process calls a callback function which uses 
RSassign() to send progress updates to the Rserve server
4. HTML/Javascript interface can connect to the webserver on the Rserve 
server to get status updates
5. When analysis is done, use RSshutdown() and RSclose() to clean up.

Does this seem reasonable?

One problem I'm having is that when I start Rook on the Rserve server, 
the webserver does not respond (although it is started). Does Rserve 
only respond to requests on the port assigned for RSclient commands?

Best,
Richard

Here's an example:

#######################

library(Rserve)

### This works:

stuff = expression({
   library(Rook)
   s <- Rhttpd$new()
   s$add(
     app=system.file('exampleApps/helloworld.R',package='Rook'),
     name='hello'
   )
   s$start(quiet=TRUE)
   s$browse(1)
   print(s$full_url(1))
})

# This will open the browser to the test app, asking for your name
eval(stuff)

### This does not:

Rserve(args="--no-save")
c <- RSconnect()
RSassign(c, stuff)

# This opens the browser to the correct URL, but the webserver doesn't
# respond.
RSeval(c, quote(eval(stuff)))


#####
# cleanup
RSshutdown(c)
RSclose(c)
#
On Thu, Oct 25, 2012 at 8:45 AM, Richard D. Morey <r.d.morey at rug.nl> wrote:
This "works" for me if I omit the quote(). I get an error, but the
webapp seems to work. Also, I didn't eval(stuff) locally, only on the
server:
Loading required package: tools
Loading required package: brew
starting httpd help server ... done
[1] "http://127.0.0.1:15583/custom/hello"
Error in parse(text = paste("{", paste(expr, collapse = "\n"), "}")) :
  <text>:1:8: unexpected '/'
1: { http:/
          ^
3: parse(text = paste("{", paste(expr, collapse = "\n"), "}"))
2: serialize(parse(text = paste("{", paste(expr, collapse = "\n"),
       "}"))[[1]], NULL)
1: RSeval(c, eval(stuff))
R Under development (unstable) (2012-10-23 r61007)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] Rook_1.0-8          brew_1.0-6          Rserve_0.6-8
[4] BiocInstaller_1.9.4

Dan
#
On 25/10/12 7:14 PM, Dan Tenenbaum wrote:
I'll try this in a bit and see what happens.

I forgot my sessionInfo()...

 > sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rook_1.0-8   brew_1.0-6   Rserve_0.6-8
#
Richard,

that is not what I had in mind :). Also Rserve 0.x doesn't have built-in HTTP, only the 1.x series does. Unfortunately I don't have any time today to write example code, but I would suggest using HTTP for both - just have a hook that simply accepts a serialized R object in the body of the HTTP request and assigns it to your result. Then you have a second hook that delivers the result back (in whatever form you want it). It should be really easy to do with Rook in a separate process. In your computation session you simply fire off an HTTP POST request with the current status as a serialized R object.

Cheers,
Simon
On Oct 25, 2012, at 11:45 AM, Richard D. Morey wrote: