Restricted Simulation from GPD & Normal Distributions

Hi R-users,
(fixing some typos in my previous mail)

I have data on one equity-related variable X, denoted by
x1, x2, x3, ..., x1000, which has been ordered as x1 < x2 < ... < x1000. I have
identified the lower and upper 5 percentiles, i.e. x50 and x950
respectively. Based on some analysis, I have inferred that three different
density functions fit the three parts of the data decently well:

   - f1 fits the data well for all x <= x50 ------ 50 observations
   - f2 fits the data well for all x50 < x <= x950 ------ 900 observations
   - f3 fits the data well for all x > x950 ------ 50 observations

The idea is to simulate 50 new observations from f1 *restricted to
(-infinity, x50]*, 50 new observations from f3 *restricted to (x950, infinity)*
and 900 new observations from f2 *restricted to (x50, x950]*. So the total
number of observations in the simulated data is 1000, as before.
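
One common way to draw from a distribution restricted to an interval is
inverse-CDF sampling: draw a uniform value between the CDF at the lower bound
and the CDF at the upper bound, then push it through the quantile function.
The sketch below is only an illustration of that idea in base R; the helper
name rtrunc and the example bounds (-1.5, 1.5] are assumptions, not part of
the problem above.

```r
# Draw n values from a distribution restricted to (lower, upper],
# given its CDF (pfun) and quantile function (qfun).
# Extra arguments in ... (e.g. mean, sd) are passed on to both functions.
rtrunc <- function(n, pfun, qfun, lower = -Inf, upper = Inf, ...) {
  p_lo <- pfun(lower, ...)   # probability mass at or below the lower bound
  p_hi <- pfun(upper, ...)   # probability mass at or below the upper bound
  stopifnot(p_hi > p_lo)     # the interval must carry positive probability
  u <- runif(n, min = p_lo, max = p_hi)
  qfun(u, ...)               # map the uniforms back to the data scale
}

set.seed(1)
# Illustration: 900 draws from N(0, 1) restricted to (-1.5, 1.5]
x_mid <- rtrunc(900, pnorm, qnorm, lower = -1.5, upper = 1.5)
range(x_mid)
```

The same helper works for any distribution for which a CDF and a quantile
function are available, which is why it takes them as arguments rather than
hard-coding the Normal.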

For the example I am working with, f1 and f3 are GPD (Generalized Pareto
Distribution) while f2 is Normal with some parameters.

I want to write a function which will take as inputs

   - the entire data (of size 1000)
   - the cut-off points x50 and x950
   - the 3 distributions (along with their parameters)
   - the number of data points from each of the 3 segments (50, 900, 50 in
   this example)
   - note that f1, f2 and f3 need to be properly restricted to the
   corresponding intervals (marked in bold in the description above)

and will output the simulated data with original sample size (here, 1000).
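
A minimal sketch of such a function in base R could look like the following.
Everything named here is hypothetical: simulate_piecewise, gpd_cdf and
gpd_quantile are illustrative helpers, the GPD parameters (scale 0.4,
shape 0.1) are made up, and rnorm() data stands in for the real series. It
also assumes the upper tail is a GPD with location x950 and the lower tail is
a reflected GPD (x50 minus a GPD excess), which may not match how f1 and f3
were actually fitted.

```r
# Hand-written GPD CDF and quantile function (shape != 0 assumed),
# so that no extra package is needed.
gpd_cdf <- function(q, loc, scale, shape) {
  z <- pmax(1 + shape * (q - loc) / scale, 0)
  ifelse(q < loc, 0, 1 - z^(-1 / shape))
}
gpd_quantile <- function(p, loc, scale, shape) {
  loc + scale * ((1 - p)^(-shape) - 1) / shape
}

# dists: list of 3 entries (lower, middle, upper), each a list with
# a `cdf` and a `quantile` function. n_seg: draws per segment.
simulate_piecewise <- function(x, cut_lo, cut_hi, dists,
                               n_seg = c(50, 900, 50)) {
  stopifnot(length(x) == sum(n_seg), cut_lo < cut_hi, length(dists) == 3)
  bounds <- list(c(-Inf, cut_lo), c(cut_lo, cut_hi), c(cut_hi, Inf))
  parts <- Map(function(d, b, n) {
    p <- d$cdf(b)                     # CDF at both ends of the interval
    stopifnot(p[2] > p[1])            # interval must have positive mass
    d$quantile(runif(n, p[1], p[2]))  # inverse-CDF draw inside the interval
  }, dists, bounds, n_seg)
  sort(unlist(parts, use.names = FALSE))
}

set.seed(42)
x <- sort(rnorm(1000))                # stand-in for the ordered data
cut_lo <- x[50]
cut_hi <- x[950]
dists <- list(
  lower = list(                       # reflected GPD below cut_lo
    cdf      = function(q) 1 - gpd_cdf(cut_lo - q, 0, 0.4, 0.1),
    quantile = function(p) cut_lo - gpd_quantile(1 - p, 0, 0.4, 0.1)),
  middle = list(                      # Normal fitted to the whole sample
    cdf      = function(q) pnorm(q, mean(x), sd(x)),
    quantile = function(p) qnorm(p, mean(x), sd(x))),
  upper = list(                       # GPD with location cut_hi
    cdf      = function(q) gpd_cdf(q, cut_hi, 0.4, 0.1),
    quantile = function(p) gpd_quantile(p, cut_hi, 0.4, 0.1))
)
sim <- simulate_piecewise(x, cut_lo, cut_hi, dists)
length(sim)
```

In this sketch the data vector is only used to check that the segment counts
add up to the original sample size; the cut-offs and distributions are passed
in separately, as in the list of inputs above.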

I'll really appreciate any help writing this function. If anything else is
required, please let me know.
On Tue, Dec 27, 2016 at 3:46 PM, Preetam Pal <lordpreetam at gmail.com> wrote:

Message-ID: <CAHVFrXEovrb=zU2Ue4DRs0EP8DpOtz1SMRY3daP0_cnrBSAUAA@mail.gmail.com>
In-Reply-To: <CAHVFrXHM4u4ZS0=Ld-XpJftuzwf0X6Rdqf8h_ny1Kf8deK=JzA@mail.gmail.com>