[R-pkg-devel] environment scoping

6 messages · Duncan Murdoch, Jennifer Bryan, William Dunlap +2 more

#
All,

I would like to have some inputs available to many functions. For example, I have coefficients from a term structure fit which I would like to make available to total return functions, thereby eliminating the need to fit the same term structure over and over again. However, I am still reading about and researching scoping. My understanding is that I could fit the term structure and keep the coefficients in the global environment, which would then make those coefficients available to all functions requiring a term structure object as input. Am I correct in my understanding? What is the downside of using the global environment, if any?

Best Regards,
Glenn
#
On 27/10/2016 10:05 AM, Glenn Schultz wrote:
You're writing in R-package-devel, so I'll assume you're thinking of 
doing this in a package that you will release to others.

In that case, the downside is huge.  The global environment doesn't 
belong to you, it belongs to the user.  If you choose to write to it, 
you could clobber something there that the user doesn't want clobbered.  
You'll hopefully get warnings from "R CMD check" about doing this, and 
CRAN will not accept your package.

There are a couple of ways to do what you want.  The simplest is to have 
the function that creates the common data return it in some sort of 
structure, and you pass that structure to other functions that need to 
work with it.  You can use S3 or S4 or other class systems to mark the 
structure for what it is.
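
A minimal sketch of this first approach, assuming a straight-line fit as a stand-in for a real term structure model. The names fit_term_structure(), discount_factor(), and the class "term_structure" are hypothetical and chosen only for illustration:

```r
# Hypothetical names throughout; the straight-line fit is only a stand-in
# for a real term structure model.

# Fit once and return the result as a classed (S3) list.
fit_term_structure <- function(maturities, yields) {
  fit <- stats::lm(yields ~ maturities)
  structure(
    list(coefficients = unname(stats::coef(fit))),
    class = "term_structure"
  )
}

# Downstream functions take the fitted object as an explicit argument.
discount_factor <- function(ts, maturity) {
  stopifnot(inherits(ts, "term_structure"))
  rate <- ts$coefficients[1] + ts$coefficients[2] * maturity
  exp(-rate * maturity)
}

ts <- fit_term_structure(c(1, 2, 5, 10), c(0.010, 0.015, 0.020, 0.025))
discount_factor(ts, maturity = 5)
```

The fit happens once, and nothing is written outside the objects the user holds.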

Another approach that works sometimes is to have the function that 
creates the common data also create functions to work on it, and return 
them.  Since functions created in another function can see its local 
variables (even after it has returned!), those functions will have 
access to the common data, and nobody else will (without going through 
some contortions).  Some of the newer object systems support this 
approach, but you don't need to use them, you can just return a function 
(or a list of functions) and make calls to it/them.
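
A sketch of this closure-based approach, again using a straight-line fit as a stand-in model. make_term_structure() and the functions it returns are hypothetical names:

```r
# Hypothetical factory; the straight-line fit is only a stand-in model.
make_term_structure <- function(maturities, yields) {
  # Local variable, computed once when the factory is called.
  coefs <- unname(stats::coef(stats::lm(yields ~ maturities)))

  # These functions are created here, so they keep access to `coefs`
  # after make_term_structure() has returned.
  spot_rate <- function(maturity) coefs[1] + coefs[2] * maturity
  discount_factor <- function(maturity) exp(-spot_rate(maturity) * maturity)

  list(spot_rate = spot_rate, discount_factor = discount_factor)
}

ts_fns <- make_term_structure(c(1, 2, 5, 10), c(0.010, 0.015, 0.020, 0.025))
ts_fns$spot_rate(5)
ts_fns$discount_factor(5)
```

The coefficients live only in the environment of the returned functions, so callers reach them through those functions rather than through a shared variable.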

I hope this helps.

Duncan Murdoch
#
Hi Glenn,

It sounds like you should create an environment for your package. If you store these objects in the global environment, they are exposed to the user and, more importantly, the user could modify or delete them. If you use an environment specific to your package, you can be sure you are the only one messing with them.

You create it like so:

.package_env <- new.env(parent = emptyenv())

And read from / write into it like so (very much like a list):

.package_env$foo <- ...
.package_env$foo

or with assign() and get():

assign("foo", bar, envir = .package_env)  # the name must be a character string
get("foo", envir = .package_env)

This blog post by Jeff Allen is a nice write-up of what you're trying to do:

http://trestletech.com/2013/04/package-wide-variablescache-in-r-package/

-- Jenny
#
If I were writing a package for factoring integers I might store a vector
of known primes in an environment in my package and have my factoring
functions append to the list when they find some more primes.  This works
because there is only one set of primes (given we stick with ordinary
integers).
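
A sketch of such a prime cache, assuming it sits at the top level of a package's R code. .prime_cache, known_primes(), and remember_primes() are hypothetical names:

```r
# Hypothetical package-internal cache of known primes.
.prime_cache <- new.env(parent = emptyenv())
.prime_cache$primes <- c(2L, 3L, 5L, 7L)

# Read the current set of known primes.
known_primes <- function() .prime_cache$primes

# Append newly found primes, keeping the vector sorted and free of duplicates.
remember_primes <- function(new_primes) {
  .prime_cache$primes <- sort(unique(c(.prime_cache$primes, as.integer(new_primes))))
  invisible(.prime_cache$primes)
}

remember_primes(c(11, 13, 7))
known_primes()
```

Environments are modified by reference, so the update made inside remember_primes() is visible to every later call to known_primes().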

However the term structure (of interest rates, I assume) depends on the
date, the country, the type of bond, the method used to estimate it, etc.  I
think it would be better to store it in some sort of object (function,
list, or environment) that the user would be expected to know about.  The
user would be expected to pass that data into each function that needs that
information (via the argument list, by using a function made in the
environment containing the term structure, etc.).


Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Oct 27, 2016 at 7:55 AM, Jenny Bryan <jenny at stat.ubc.ca> wrote:
#
All,

Thank you. I have built the package such that it passes a TermStructure object to those functions which require it as an input.  The package is here:

https://github.com/glennmschultz/BondLab

The question is largely motivated by the source files MortgageScenario.R and MortgageOAS.R.  I have decided to default to the Diebold-Li model in termstrc.

 In the case of MortgageScenario.R:

Currently, a shift in the curve is driven by a shift in the coupon curve, and the term structure model is refit.  Naturally, one could use the coefficients from the original fit to shift the curve, making the analysis much faster and more intuitive than the current paradigm of shifting the coupon curve.  However, some portfolio managers prefer to be able to use either approach, so I need to make both available.

In the case of MortgageOAS.R:

This is currently a single-factor CIR OAS model.  It will remain for teaching purposes, as a single-factor model is the gateway to multi-factor models.  Nevertheless, one may choose to run OAS as a batch job across, say, 500 MBS, in which case it makes little sense to replicate the paths 500 times: simulate the paths once and pass them to the function.  Obviously, I am looking at parallel processing of a portfolio of mortgage- and asset-backed securities.  Thank you for the time that you have taken to consider and reply to my question.  This has been very helpful.

Best,
Glenn

#
It sounds like memoization could help here. Package examples for this:
memoise and R.cache
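
To illustrate the idea only, here is a hand-rolled memoization sketch in base R. memoize_by_key() and slow_fit() are hypothetical names, and this is not how memoise or R.cache are implemented; with memoise, the wrapper would instead be created with memoise::memoise(slow_fit). The key is built by deparsing the arguments, which is only reasonable for small inputs:

```r
# Hypothetical helper: cache results of f() keyed by its deparsed arguments.
memoize_by_key <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(...) {
    key <- paste(deparse(list(...)), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(...), envir = cache)  # compute and store on first use
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

n_fits <- 0
# Stand-in for an expensive term structure fit (straight-line model).
slow_fit <- function(maturities, yields) {
  n_fits <<- n_fits + 1  # count how often the real fit runs
  unname(stats::coef(stats::lm(yields ~ maturities)))
}
fit <- memoize_by_key(slow_fit)

a <- fit(c(1, 2, 5, 10), c(0.010, 0.015, 0.020, 0.025))
b <- fit(c(1, 2, 5, 10), c(0.010, 0.015, 0.020, 0.025))
n_fits  # the second call is served from the cache
```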

Henrik
On Oct 27, 2016 11:19, "Glenn Schultz" <glennmschultz at me.com> wrote: