multicore with functions calling an exe/.sh file
Thanks Steve for your answer.
Stephen Weston wrote:
> If you're executing an external program from R using mclapply, and that
> program is creating output files, then you need to arrange for that
> program to create the output files with different names, or to create
> them in different directories.
Yes, I'm running an external program with mclapply, and I thought there might be some package that manages this automatically, by creating a copy of the working process (and all its files) in local memory in some way, but probably that was just too much imagination :)

> Usually that would be done with a command-line argument to the program.
> If it doesn't have such an argument, the best thing to do is add one.
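For what it's worth, this is roughly how I picture that suggestion, assuming (hypothetically) that the model executable accepted a '-o' flag for the output file name; the program name, flag, and file names below are all invented for illustration:

```r
## Sketch only: 'model.exe', its '-p'/'-o' flags and the file names are
## hypothetical -- the real model does not (yet) take these arguments.
library(multicore)

run.one <- function(i, param) {
  out.file <- sprintf("output_%03d.txt", i)            # unique name per task
  cmd <- sprintf("./model.exe -p %f -o %s", param, out.file)
  system(cmd)                                          # run the external model
  out.file                                             # return this run's file
}

params    <- c(0.1, 0.5, 0.9, 1.3)                     # example parameter values
out.files <- mclapply(seq_along(params),
                      function(i) run.one(i, params[i]),
                      mc.cores = 4)
```

Because every forked task writes to its own output file, the "Sharing violation" collisions on the output side would disappear.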
I think this could be the only way of solving my problem. However, I also have the additional issue that each process needs to modify the same position of the same input files at the same time, so I should also add an argument for the location of the input files.

> If you can't do that, you could try executing them from different
> directories by executing setwd in the parallel function before executing
> the external program. Unfortunately, that could cause other problems,
> especially if the program is reading data files from the current working
> directory. But hopefully, the program supports an option to change the
> output file name or directory.
The external program I'm running now doesn't have an option for setting the location of the input/output files. So probably the fastest solution would be to create as many copies of the input files as the number of cores I want to use, and then to redirect (in some way) each process to a different directory, each one storing all the necessary input files. I wanted to avoid that solution, but I think it is the only way forward...
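A minimal sketch of that "one directory per core" idea, combining the copies with the setwd suggestion (all directory names and the round-robin task-to-directory mapping are my assumptions; the mapping matches mclapply's default prescheduling, where core k runs tasks k, k+n, ...):

```r
## Sketch, not a tested implementation: copy the whole template directory
## (inputs + exe/.sh) once per core, then setwd() inside the forked task so
## each copy of the model only ever touches its own files.
library(multicore)

n.cores  <- 4
template <- "/home/zambrhe/S090-test"      # directory holding inputs + exe/.sh
params   <- c(0.1, 0.5, 0.9, 1.3)          # example parameter values

## one working copy of the template directory per core
work.dirs <- sapply(seq_len(n.cores), function(i) {
  d <- sprintf("%s-core%d", template, i)
  dir.create(d, showWarnings = FALSE)
  file.copy(list.files(template, full.names = TRUE), d, overwrite = TRUE)
  d
})

run.model <- function(i, param) {
  old <- setwd(work.dirs[((i - 1) %% n.cores) + 1])   # jump to 'our' copy
  on.exit(setwd(old))                                 # restore wd afterwards
  ## ... modify this directory's copy of the input files for 'param' ...
  system("./model.sh")                                # run the local copy
  ## ... read and return this run's results ...
}

results <- mclapply(seq_along(params),
                    function(i) run.model(i, params[i]),
                    mc.cores = n.cores)
```

Since each task is a forked process, the setwd() call only affects that task, and two tasks prescheduled onto the same core run sequentially, so they can safely share one directory copy.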
> - Steve
>
> P.S. Why is a program running on a Linux machine using a file with the
> path "Z:\home\zambrhe\S090-test\file.cio"?
Good question. The directory that holds the input/output/exe/.sh files is: model.drty <- "/home/zambrhe/S090-test" and that is the argument I pass to R for executing my script. However, my '/home/zambrhe/' directory is on a network drive, and probably R sees it as "Z:\home\zambrhe\S090-test\file.cio" for some reason...

Thank you very much again.

Mauricio
=======================================================
FLOODS Action
Land Management and Natural Hazards Unit
Institute for Environment and Sustainability (IES)
European Commission, Joint Research Centre (JRC)
=======================================================
DISCLAIMER: "The views expressed are purely those of the
writer and may not in any circumstances be regarded as
stating an official position of the European Commission."
=======================================================
Linux user #454569 -- Ubuntu user #17469
=======================================================
"Don't wish for less problems, wish for more skills.
Don't wish it were easier, wish you were better."
(Jim Rohn)

> On Fri, Jun 24, 2011 at 5:12 AM, Mauricio Zambrano-Bigiarini
> <mauricio.zambrano at jrc.ec.europa.eu> wrote:
>> Dear List,
>>
>> I'm just doing my first trials with HPC, and I would like to ask your
>> opinion regarding the following issue.
>>
>> In R 2.13.0, I have an optimization algorithm for hydrological models,
>> which internally runs the .exe/.sh file of the model, and then computes
>> and writes into a file the results of different parameter values.
>>
>> So far this algorithm runs only in a sequential mode, i.e., trying
>> different parameter values one after another, and I would like to
>> parallelize it.
>>
>> My first attempt was using the multicore library, and changing the
>> existing 'lapply' loop for an 'mclapply' one. However, when I run the
>> optimization algorithm, I got several error messages:
>>
>> forrtl: Sharing violation
>> forrtl: Sharing violation
>> forrtl: Sharing violation
>>
>> forrtl: severe (30): open failure, unit 1, file
>> Z:\home\zambrhe\S090-test\file.cio
>>
>> which are due to the fact that all the 4 processes are trying to access
>> the same .exe/.sh file and to modify the same input files at the same
>> time.
>>
>> Is there any way to use multicore for parallelizing this type of
>> optimization function, or should I move to some master/slave option?
>>
>> Thanks in advance for any help.
>>
>> Mauricio Zambrano-Bigiarini
>>
>> PS,
>> sessionInfo():
>> R version 2.13.0 (2011-04-13)
>> Platform: i386-redhat-linux-gnu (32-bit)
>>
>> --
>> =======================================================
>> FLOODS Action
>> Land Management and Natural Hazards Unit
>> Institute for Environment and Sustainability (IES)
>> European Commission, Joint Research Centre (JRC)
>> TP 261, Via Enrico Fermi 2749, 21027 Ispra (VA), Italy
>> webinfo    : http://floods.jrc.ec.europa.eu/
>> work-phone : (+39)-(0332)-789588
>> work-fax   : (+39)-(0332)-786653
>> =======================================================
>> DISCLAIMER:\ "The views expressed are purely those of th...{{dropped:11}}
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc