help

Hello,
I am trying to do block resampling to rearrange my data, but I have not succeeded in doing the random permutation and assignment.
I would like to divide the original time series into subsamples and then rearrange these subsamples randomly.

The function tsboot works only if I need to compute a statistic; I am interested in just rearranging the data while keeping its structure.

The problem is defined as follows.
1. I define the length of a block, b.
2. Divide the original time series of length n into blocks of length b, giving k=n/b subsamples.
3. I need to generate a random vector of integers from 1 to k.
4. Let Z*(j), for j=1,...,k, be the j-th row of a matrix with number of rows equal to the number of blocks and number of columns equal to the number of simulations.
5. Assign to each Z*(j) the blocks according to the generated random vector (each column of the matrix is a different order of permutation).
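The five steps above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the toy series `x`, the block length `b = 3`, the number of simulations `n_sim`, and the names `blocks`, `Z` and `x_star` are made up for the example, and n is assumed to be an exact multiple of b.

```r
set.seed(1)                      # only to make the sketch reproducible
x     <- 1:12                    # hypothetical stand-in for the original series
b     <- 3                       # step 1: block length
n     <- length(x)
k     <- n %/% b                 # step 2: number of blocks (n assumed divisible by b)
n_sim <- 5                       # hypothetical number of simulations

# One block per column: rows are positions within a block, columns are blocks 1..k
blocks <- matrix(x[1:(k * b)], nrow = b, ncol = k)

# Steps 3-4: a k x n_sim matrix; each column is a random order of 1..k
Z <- replicate(n_sim, sample(k))

# Step 5: rebuild one rearranged series per column of Z
x_star <- apply(Z, 2, function(ord) as.vector(blocks[, ord]))
```

Each column of `x_star` is one rearranged series. Because `sample(k)` draws without replacement, every block is used exactly once (a permutation); using `sample(k, k, replace = TRUE)` instead would turn this into a non-overlapping block bootstrap.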
For future reference, please provide reproducible code as per the posting guidelines.  It makes it easier for others to help you.  Also, please use a descriptive subject, as we all get quite a lot of mail.

Your procedure appears incorrect. 

Your steps 3-5 look like a homework assignment, so I'm going to ignore those and focus on the block bootstrap, which has some applicability to other members of this list in financial time series analysis.

I suspect that you simply misunderstood the "statistic" parameter of tsboot().  I expect that you do indeed intend to use the bootstrapped data to calculate one or more statistics; this is what the statistic parameter is for.

Block bootstrapping works by randomly sampling blocks of length l from your original series.  The tsboot function also applies one or more statistics to the bootstrapped data, and uses the multiple samples to calculate the bias and standard error for those statistics, providing you with a sensitivity analysis for those statistics on your data.
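As a rough illustration of where the bias and standard error come from, the sketch below recomputes them by hand from the replicates stored in the object returned by tsboot(). It assumes the built-in `lh` series and a block length of 6 purely for illustration; the names `out`, `t0`, `bias` and `se` are made up for the example.

```r
library(boot)                    # provides tsboot()

set.seed(42)                     # only for reproducibility of the sketch
out <- tsboot(tseries = lh, statistic = sd, R = 200, l = 6, sim = "fixed")

t0   <- out$t0                   # statistic computed on the original series
bias <- mean(out$t[, 1]) - t0    # mean of the bootstrap replicates minus the original
se   <- sd(out$t[, 1])           # standard deviation of the bootstrap replicates

c(original = t0, bias = bias, std.error = se)
```

The three numbers should correspond to the "original", "bias" and "std. error" columns that printing `out` shows.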

Using the data series "acme" included with the boot package (which ships with R), you would do something like:

library(boot)
library(PerformanceAnalytics)
data(acme)

#calculate the sensitivity of standard deviation on the data:
tsboot(tseries=acme[,2],statistic=sd,R=1000,l=12,sim="fixed",endcorr=FALSE,n.sim=1000)
# use blocks of length 12 (one year) to 
# create 1000 bootstrapped time series
# each of length 1000 observations

#Returns:
#Bootstrap Statistics :
#      original       bias    std. error
#t1* 0.05362889 0.0001614213 0.001925484

# calculate sensitivity of VaR:
tsboot(tseries=acme[,2],statistic=VaR.CornishFisher,R=1000,sim="fixed",l=12,endcorr=FALSE,n.sim=1000)

#Returns:
#Bootstrap Statistics :
#    original      bias    std. error
#t1* 0.227064 0.009412978 0.007284343

Normally, this is what you want.  The random bootstrapped series itself is not useful to you, except to calculate a statistic or statistics of interest, and understand their sensitivity.  If you want the bootstrapped series returned, you can modify the code of the tsboot function to do what you want.  
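If only the resampled series are needed, one possible shortcut that avoids editing the tsboot source is to pass a function that returns its input unchanged as the statistic. The sketch below assumes that tsboot() then stores each resampled series as one row of the returned matrix `t`; the names `res` and `boot_series` and the choice of R = 10 are made up for the example.

```r
library(boot)
data(acme)

set.seed(123)                          # only for reproducibility
y <- acme[, 2]                         # same series as in the examples above
n <- length(y)

# Assumption: with the identity as `statistic`, each row of res$t
# holds one complete block-resampled series of length n.
res <- tsboot(tseries = y, statistic = function(z) z,
              R = 10, l = 12, sim = "fixed", endcorr = FALSE)

boot_series <- t(res$t)                # n x R matrix: one resampled series per column
dim(boot_series)
```

Any further processing can then be applied column by column to `boot_series`.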

If you want to apply your steps 3-5 to the bootstrapped data, see the documentation of tsboot() for an example of defining a function to use as the statistic parameter.

Regards,

  - Brian
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock
Brian G. Peterson wrote (in reply to Yana Roth):
> Your steps 3-5 look like a homework assignment, so I'm going to ignore
> those and focus on the block bootstrap, which has some applicability
> to other members of this list in financial time series analysis.

Thanks Brian for these examples!

Actually, even if it is homework, I would be really interested in the 
answer ;-) This is a question I have always wanted to find out about, so 
maybe it is the right time to ask. I looked at the source code of tsboot() 
but got lost.

Does anyone have an idea how to do block resampling with the function 
sample(), with both overlapping and non-overlapping blocks? That is 
(example taken from Maddala and Li 1998, "Bootstrapping cointegrating 
relationships", Journal of Econometrics 80(2), also in their book Unit Roots, 
Cointegration, and Structural Change, page 328), you pick blocks:

if series is {3, 6, 7, 2, 1, 5}
-non-overlapping:  {(3,6,7), (2,1,5)}
-overlapping:  {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}

and then sample those blocks with replacement. I don't have a clear idea 
of how to do that in R... Thanks!

a<-1:100
boot1<-sample(a, replace=TRUE) # block length 1 (ordinary bootstrap)

Hello.

Some time ago I wrote a seminar paper about regressing the oil price on the CDAX. There, I used a nonparametric block bootstrap approach by hand, because I needed to resample pairs of blocks. I worked with sample(). I think there is some need for further optimization, but the code should give the idea of block sampling:

The example:
if series is       {3, 6, 7, 2, 1, 5}
- non-overlapping: {(3,6,7), (2,1,5)}
- overlapping:     {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}

For the overlapping case you have 4 blocks of length 3. In terms of time indices, the blocks have the following structure:
1:3
2:4
3:5
4:6

So, one has to resample the starting time indices 1:4 and take each starting index together with the following two indices to grab the right data:

x <- c(3,6,7,2,1,5)

x_sample  <- numeric(4*3) # 4 blocks, each of length 3
mean_boot <- numeric(10000)

for (i in 1:10000) {
  for (j in 0:3) {
    idx <- sample(1:4,1,replace=TRUE) # the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
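The same idea can be written for an arbitrary series and block length without the inner loop. The following is only a sketch: `mbb_resample` is a hypothetical helper name, and, unlike the loop above (which glues 4 blocks into 12 values), it draws just enough blocks to reproduce the original length n.

```r
# Moving-block (overlapping) resample of x with block length b.
# `mbb_resample` is a hypothetical helper, not part of any package.
mbb_resample <- function(x, b) {
  n        <- length(x)
  n_blocks <- ceiling(n / b)                    # blocks needed to cover n values
  starts   <- sample(n - b + 1, n_blocks, replace = TRUE)
  # Expand every start s into the indices s, s+1, ..., s+b-1
  idx <- as.vector(outer(0:(b - 1), starts, "+"))
  x[idx[1:n]]                                   # trim to the original length
}

set.seed(1)                                     # only for reproducibility
x <- c(3, 6, 7, 2, 1, 5)
mean_boot <- replicate(10000, mean(mbb_resample(x, b = 3)))
```

`mean_boot` then holds the bootstrap distribution of the mean, as in the loop version.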

Next, the non-overlapping example with 2 blocks. Here we have the following time structure:
1:3
4:6

So one has to resample the starting indices 1 and 4 and take each start together with the following two indices to grab the data. If the series is longer, one recognizes that the starting indices follow the sequence 3*t+1 for t = 0, 1, ..., so one only has to draw t with replacement from (0:1) (or draw from (1:2) and subtract 1):

x <- c(3,6,7,2,1,5)

x_sample  <- numeric(2*3) # 2 blocks of length 3
mean_boot <- numeric(10000)

for (i in 1:10000) {
  for (j in 0:1) {
    idx <- sample(0:1,1,replace=TRUE)
    idx <- 3*idx+1 # the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
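A loop-free variant for a general block length might look like the sketch below. `nbb_resample` is a hypothetical helper name, and observations beyond the last complete block are simply ignored, which is an assumption of this example rather than part of the method described above.

```r
# Non-overlapping block resample of x with block length b.
# `nbb_resample` is a hypothetical helper, not part of any package.
nbb_resample <- function(x, b) {
  k      <- length(x) %/% b                     # number of complete blocks
  # Draw k block numbers with replacement and map them to starts 1, b+1, 2b+1, ...
  starts <- b * (sample(k, k, replace = TRUE) - 1) + 1
  idx    <- as.vector(outer(0:(b - 1), starts, "+"))
  x[idx]
}

set.seed(1)                                     # only for reproducibility
x <- c(3, 6, 7, 2, 1, 5)
mean_boot <- replicate(10000, mean(nbb_resample(x, b = 3)))
```

With only two possible blocks, as here, the resampled series can take just four distinct forms, so longer series are needed for the result to be informative.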

And if you ask yourself what block length is the right one, Politis and White (2004) have the answer: http://econ.ucsd.edu/~mbacci/white/pub_files/hwcv-093.pdf
Also pay attention to the corrections of the algorithm: http://www.economics.ox.ac.uk/members/andrew.patton/SBblockCORRECTION_jan08.pdf

But most importantly, pay attention to the R implementation:
http://www.math.ucsd.edu/~politis/SOFT/PPW/ppw.R

Hope it works... It has been some time since I played around with it, but maybe it is some food for thought :)

Matthias.

--- Matthieu Stigler <matthieu.stigler at gmail.com> wrote on Tue, 17 Mar 2009:
From: Matthieu Stigler <matthieu.stigler at gmail.com>
Subject: Re: [R-SIG-Finance] help (regarding block bootstrap)
To: r-sig-finance at stat.math.ethz.ch
CC: "Yana Roth" <yana.roth at yahoo.com>
Date: Tuesday, 17 March 2009, 14:35
_______________________________________________
R-SIG-Finance at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only.
-- If you want to post, subscribe first.