Hi,

I wonder what is a better way to organize a lot of R source files. I have written a lot of utility functions and store them in several source files (e.g. util1.R, util2.R, ..., utilN.R). I also have a master file in which the source command is used to load all the util.R files. When I need to use the utility functions in a new project, I create a new R file (e.g. main.R) in which I "source" the master file.

The problem with this approach is that any time a single utility function is modified, I need to rerun the source command in main.R to load the master file, which loads all the utility R files via a loop over each file. Sometimes I have to wait for 10 seconds to get them all loaded. Sometimes I forget to run the source command.

Is there a way in R to
1) only reload the files that changed (like a make utility) when I run source on all utility files, and/or even better
2) reload the changed utility files when I run a command that uses one of those utility functions, without the need for me to source those files?

Not sure if packaging solves this issue, because the library command has to be used every time a utility function is modified, and in addition the package has to be rebuilt. I am not worried about sharing the source files at the moment, as I am the only user of those utility files.

This may be a common issue many R users face. I wonder how other R users solve this issue.

thanks
Jeff
how to organize a lot of R source files
5 messages · Jim Lemon, Henrik Bengtsson, Hao Cen
Hi Jeff,
Your request makes a lot of sense. I often modify files in the packages
I maintain, typically by loading the package, then working on a copy of
the function, continually "sourcing" the new code until it works
correctly, and then checking and building the package. Apart from the
official packages I maintain, I keep a few local packages with odd
functions that I don't think are worth uploading to an already loaded
CRAN. This shell script can be used to automate the building of a package.
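#!/bin/sh
# Usage: Rpackage <modified .R file> <package source dir> <built tarball>
# Copy the modified file into the package's R directory.
cp "$1" "$2/R" || exit 1
# Build and install only if the check passes.
if R CMD check "$2"; then
  R CMD build "$2" && R CMD INSTALL "$3"
else
  echo "Problem with R check of $2" >&2
fi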
If I had modified the "clinsig.R" file in the clinsig package, I could
call this script like this:
Rpackage /home/jim/R/clinsig.R /home/jim/R/clinsig clinsig_1.0-1.tar.gz
and it would rebuild the package with the new function. Because I
usually keep the files I am modifying in /home/jim/R I could simplify
the command line a bit. This may seem like a lot of work, but when I
worked out a way to get a function to check the timestamp of its source
file and compare it against the timestamp of the latest package:
if (max(file.info(system("find /home/jim/R -name 'clinsig.R' -type f",
                         intern = TRUE))$mtime) >
    max(file.info(system("find /home/jim/R -name 'clinsig_*' -type f",
                         intern = TRUE))$mtime))
  source("/home/jim/R/clinsig.R")
I found that a lot of hard coding of file locations ends up in your
function file.
Jim
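The timestamp comparison in Jim's message can also be written without shelling out to find. The following is only a sketch in base R: the helper name source_is_newer and its arguments are made up for illustration, and it assumes the source file and the built tarballs live under the same directory, as in the message. Note that list.files() takes regular expressions rather than shell globs.

```r
# Hypothetical helper (not from the thread): TRUE when any source file
# matching `src_pattern` is newer than the newest built package file
# matching `pkg_pattern`, searching recursively under `dir`.
source_is_newer <- function(dir, src_pattern, pkg_pattern) {
  src <- list.files(dir, pattern = src_pattern, recursive = TRUE,
                    full.names = TRUE)
  pkg <- list.files(dir, pattern = pkg_pattern, recursive = TRUE,
                    full.names = TRUE)
  if (length(src) == 0) return(FALSE)  # nothing to source
  if (length(pkg) == 0) return(TRUE)   # no build yet, so the source wins
  max(file.mtime(src)) > max(file.mtime(pkg))
}

# Usage with the paths from the message above:
# if (source_is_newer("/home/jim/R", "^clinsig\\.R$", "^clinsig_.*")) {
#   source("/home/jim/R/clinsig.R")
# }
```

Passing the directory and patterns as arguments keeps the file locations out of the function body, which is the hard-coding problem Jim mentions.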
library("R.utils");
sourceDirectory("myRFiles/", modifiedOnly=TRUE);
See ?sourceDirectory (regardless of what the Rd help says, any '...'
argument is passed to sourceTo()).
/Henrik
On Fri, Jan 8, 2010 at 7:38 AM, Hao Cen <hcen at andrew.cmu.edu> wrote:
Hi, I wonder what is a better way to organize a lot of R source files. I have a lot of utility functions written and store them in several source files (e.g util1.R, util2.R,..utilN.R). I also have a master file in which the source command is used to load all the util.R files. When I need to use the utility functions in a new project, I create a new R file (e.g main.R) in which I "source" the master file. The problem with this approach is that anytime a single utility function is modified, I need to rerun the source command in main.R to load the master file, which loads all the utility R files via a loop over each file. Sometimes I have to wait for 10 seconds to get them all loaded. Sometimes I forget to run the source command. Is there a way in R to 1) only reload the file changed (like a make utility) when I run source on all utility files and/or even better 2) reload the changed utility files, when I run a command that use one of those utility functions, without the need for me to source those files. Not sure if packaging solves this issue because the library command has be used every time a utility function is modified and in addition the package has to be rebuilt. I don't worry about sharing the source files at this moment as I am the only user of those utility files. This may be a common issue many R users face. I wonder how other R users solve this issue. thanks Jeff
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
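For readers curious about the idea behind a make-like reload, here is a minimal base-R sketch. It is not how R.utils implements modifiedOnly=TRUE; the names .source_cache and source_changed are hypothetical. It assumes that comparing each file's modification time with the time recorded at its last load is a good enough test for "changed".

```r
# Hypothetical helper, not part of R.utils: source only the files whose
# modification time is newer than the one recorded in this R session.
.source_cache <- new.env()

source_changed <- function(dir, envir = globalenv()) {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  reloaded <- character(0)
  for (f in files) {
    key <- normalizePath(f)
    mtime <- as.numeric(file.mtime(f))
    last <- .source_cache[[key]]       # NULL if never sourced before
    if (is.null(last) || mtime > last) {
      source(f, local = envir)         # evaluate the file in `envir`
      assign(key, mtime, envir = .source_cache)
      reloaded <- c(reloaded, f)
    }
  }
  invisible(reloaded)                  # paths that were (re)sourced
}

# Usage: the first call sources everything, later calls only what changed.
# source_changed("~/fun")
```

Because the cache lives in the R session, every file counts as changed on the first call after R starts.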
1 day later
Hi Henrik,
Thanks for your suggestion. I created a directory with 10 R files and
tried the following and measured its time
system.time(sourceDirectory("~/fun", modifiedOnly = F))
system.time(sourceDirectory("~/fun", modifiedOnly = T))
But the second line seems to take as much time as the first line. I
thought the second line would be faster, since no modification was made.
Also, the first line reports the following warning:
In readLines(con = fh) :
incomplete final line found on "~/fun/util1.R"
I don't see such a warning when I use source.
Maybe the two issues are related. Please advise.
thanks
Jeff
On Fri, January 8, 2010 7:56 pm, Henrik Bengtsson wrote:
library("R.utils"); sourceDirectory("myRFiles/", modifiedOnly=TRUE);
See ?sourceDirectory (regardless of what the Rd help says, any '...'
argument is passed to sourceTo()).
/Henrik
Hi.
On Sat, Jan 9, 2010 at 6:16 PM, Hao Cen <hcen at andrew.cmu.edu> wrote:
Hi Henrik,
Thanks for your suggestion. I created a directory with 10 R files and
tried the following and measured its time
system.time(sourceDirectory("~/fun", modifiedOnly = F))
system.time(sourceDirectory("~/fun", modifiedOnly = T))
But the second line seems to spend as much time as the first line, I
thought the second line would be faster since no modification is made.
Use modifiedOnly=TRUE the first time too, and you'll see that it works the second time. When you call it the first time, R considers every file as "modified" (since last time), because it has never seen the code before (in that R session). If you add some verbose output to your scripts, you'll definitely see when the scripts get sourced. I guess you could say it should work the way you did it, but as it is currently designed/implemented, it does not record the last "source" time unless you use modifiedOnly=TRUE. The next release will also support what you did.
Also the first line reports a warning as follows
In readLines(con = fh) :
incomplete final line found on "~/fun/util1.R"
I don't see such a warning when I use source.
Unrelated. Nothing to worry about. I've added readLines(con=fh, warn=FALSE) for the next release to get rid of such warnings.
/Henrik
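The warning itself comes from base R's readLines() when a file's last line has no terminating newline, and the warn argument mentioned above suppresses it. A small sketch, using a temporary file invented for the illustration:

```r
# Write a file whose last line has no terminating newline.
f <- tempfile(fileext = ".R")
cat("x <- 1", file = f)  # cat() does not append a newline

# With the default warn = TRUE the incomplete final line triggers a
# warning; tryCatch() captures its message here instead of printing it.
lines_warn <- tryCatch(readLines(f),
                       warning = function(w) conditionMessage(w))

# With warn = FALSE the same content is read silently.
lines_quiet <- readLines(f, warn = FALSE)
```

Either way the line is read in full, so the warning can be ignored, as Henrik says; ending each file with a newline also avoids it.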