Dear Rcpp developers:

Thanks for developing and maintaining the Rcpp package. I wrote a function in Rcpp. It is available throughout the package and works as expected; however, it is not available to parLapply workers. A temporary fix is to call Rcpp::cppFunction inside the function that the parLapply workers call and copy the entire function there. However, this does not seem right for bigger and more complicated functions. I would be grateful if you could let me know whether there is a better long-term solution.

Here is the package and three functions that you might want to take a look at.

Original cpp function: https://github.com/fasrc/CausalGPS/blob/master/src/compute_closest_wgps_helper.cpp
Wrapper function that calls this function + temporary fix: https://github.com/fasrc/CausalGPS/blob/master/R/compute_closest_wgps.R
The function that uses parLapply (please see lines 63-89) to run the C++ code: https://github.com/fasrc/CausalGPS/blob/master/R/compute_closest_wgps.R

Best regards,
Naeem
[Rcpp-devel] Exporting rcpp-based function into parLapply workers in an R package
12 messages · R. Michael Weylandt, Jeff Newmiller, Naeem Khoshnevis +2 more
Hi Naeem,

My (very quick) guess is that this isn't an Rcpp problem per se, but a parLapply problem. You need to explicitly load your package on each worker so that functions from it are available. See, e.g., the brief discussion here: https://stackoverflow.com/questions/18357788/parallel-parlapply-setup#18358875

The "parallel" packages do not exactly replicate your environment on each worker node (to avoid expensive set-up / communication costs), so you need to do a bit more set-up.

Best,
Michael
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Hi Michael,

Thank you so much for your response. That is correct. One method for exporting required variables/functions is the clusterExport function, which does not work for Rcpp-based functions. Another option is clusterEvalQ (as mentioned in the shared post); however, I am not sure whether CRAN likes to see library(package name) inside the codebase. What are your thoughts?

Best regards,
Naeem
clusterExport works just fine if you put your Rcpp code into your own package and make that package available in your worker environment. Given the need for compilation in possibly a variety of computing environments for parallel processing, this is definitely recommended.
Sent from my phone. Please excuse my brevity.
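Jeff's point about clusterExport can be sketched roughly as follows. This is only an illustration under stated assumptions: `tools::file_ext` stands in for a function exported from your own installed package (the base package "tools" is used just because it is installed everywhere), and `pkg_fun` is a hypothetical name. As far as I understand, a function that lives in a package namespace is sent to the workers with a reference to that namespace, so each worker loads the installed package itself; a function built with Rcpp::cppFunction in the master session instead points at compiled code that exists only in that session.

```r
library(parallel)

# Stand-in for a function exported from your own installed package.
# Replace tools::file_ext with a function from your package (assumption).
pkg_fun <- tools::file_ext

cl <- makeCluster(2)

# Copy the object named "pkg_fun" to every worker. The package itself must
# be installed in each worker's library for this to work.
clusterExport(cl, "pkg_fun", envir = environment())

# The workers can now call pkg_fun() by name.
res <- parLapply(cl, c("helper.cpp", "wrapper.R"), function(path) pkg_fun(path))

stopCluster(cl)

print(unlist(res))
```

The package still has to be installed wherever the workers run; clusterExport only ships the R object, not the compiled library.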
Thank you so much, Jeff. The part that I do not understand is the "and make that package available in your worker environment" part. Could you please let me know how I can make the package available to each worker?
You "just" run clusterEvalQ(cl, library(PKG)) on your cluster, which
evaluates the call on each worker. Like Jeff says, this is much easier
for you as a package developer and, I'd argue, easier for your users as
well, since they just have to make sure the package can be installed
once, rather than having compilers ready and behaving for each use.
The CRAN organization / mirror on GitHub (github.com/cran) is very
useful for this sort of thing.
Searching for "library(" and "clusterEvalQ"
(https://github.com/search?q=org%3Acran+library%28+clusterEvalQ&type=code)
in that organization yields the following result (chosen at random):
https://github.com/cran/textmineR/blob/889b400b2ccdc4eac7b9fee5dd7678bd71f0b290/R/other_utilities.R#L51
where you can see how the textmineR package loads itself on each worker.
To your earlier question, I *think* CRAN is OK with calling
"clusterEvalQ(cl, library(PKG))" within one of your functions (as
evidenced by the search of CRAN above), but I've never done it myself,
so I can't confirm.
Michael
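A minimal sketch of the clusterEvalQ approach described above, assuming a local two-worker cluster. The base package "tools" is only a stand-in for your own installed package (PKG), and `file_ext` stands in for one of its exported functions; neither comes from the thread.

```r
library(parallel)

cl <- makeCluster(2)

# Attach the package on every worker. "tools" is a stand-in for PKG here;
# it is not attached on workers by default, so this step is what makes
# file_ext() visible to them.
invisible(clusterEvalQ(cl, library(tools)))

# The worker function can now use the package's functions unqualified.
res <- parLapply(cl, c("helper.cpp", "wrapper.R"), function(path) file_ext(path))

stopCluster(cl)

print(unlist(res))
```

Without the clusterEvalQ() line, the workers would fail with a "could not find function" error, which matches the symptom in the original question.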
Thank you so much, Michael and Jeff. I really appreciate your help. These are invaluable suggestions and recommendations. Thanks, Dirk, for bringing this email list to my attention.

Best regards,
Naeem
Naeem,

The best path for including compiled code in a package is to place it within the `src/` directory instead of using `Rcpp::cppFunction()` to compile. The reasons for this are stated succinctly here: https://stackoverflow.com/a/6074391/1345455

From there, the C++ can easily be exported across parallel workers just by loading the package. Consider looking at how this example package using doParallel is structured: https://github.com/r-pkg-examples/rcpp-and-doparallel

Best,
JJB

This is great, James. Thank you so much for sharing.

Best regards,
Naeem
On 14 May 2021 at 14:07, Michael Weylandt wrote:
| The CRAN organization / mirror on GitHub (github.com/cran) is very
| useful for this sort of thing.
|
| Searching for "library(" and "clusterEvalQ"
| (https://github.com/search?q=org%3Acran+library%28+clusterEvalQ&type=code)
| in that organization yields the following result (chosen at random):
Yes! I actually do these types of searches all the time myself (and am old
enough to bemoan the disappearance of the Google code search tool that preceded
it ages ago).
Dirk
https://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
For code searches, consider using the {searcher} package: https://github.com/r-assist/searcher
In particular, the search_github() function handles the query formatting. As an example, try:
searcher::search_github("clusterEvalQ ")
This opens a web browser with:
https://github.com/search?q=clusterEvalQ%20%20language:r%20type:issue&type=Issues
Lastly, there is an R-specific Google search engine available at: https://rseek.org/ It's not quite google code search, but it's useful! Plus, there is a {searcher} function for that as well, e.g. searcher::search_rseek().
(Thanks Alex Rossell Hayes for that contribution.)
Best,
JJB
This is getting off topic, but as James saw fit to advertise his package (as he should, it is clearly helpful to some, himself included), here are my $0.02 on why it is not for me:
On 14 May 2021 at 20:54, Balamuta, James Joseph wrote:
| For code searches, consider using the {searcher} package: https://github.com/r-assist/searcher
|
| In particular, the search_github() function handles the query formatting. As an example, try:
I keep the R prompt(s) (in Emacs, generally) for data work, and do more work
like this in the shell, where this is less useful (though I sometimes wrap R
commands in littler scripts). Here I particularly dislike
|
| searcher::search_github("clusterEvalQ ")
|
| This opens a web browser with:
|
| https://github.com/search?q=clusterEvalQ%20%20language:r%20type:issue&type=Issues
the shell-to-browser pivot. I have some "permanent tabs" dedicated to GH, and I
prefer to search therein. Also, why default to issues when the query didn't
ask for them? Anyway ...
| Lastly, there is an R-specific Google search engine available at: https://rseek.org/ It's not quite google code search, but it's useful! Plus, there is a {searcher} function for that as well, e.g. searcher::search_rseek().
| (Thanks Alex Rossell Hayes for that contribution.)
Yes, that's been around for a while, but codesearch.google.com was still
better and is missed. One of the other code aggregators (which AFAIK is also
gone now) had a different search engine too.
This whole thread is highly off-topic and likely answered by some SO answers,
as has been pointed out, as well as possibly some discussions on the r-sig-hpc
list (which is mostly dormant these days).
Dirk
https://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org