help
4 messages · Yana Roth, Brian G. Peterson, Matthieu Stigler +1 more
Yana Roth wrote:
Hello, I am trying to do block resampling to rearrange my data, but I have not succeeded in doing the random permutation and assignment. I would like to divide the original time series into subsamples and then rearrange these subsamples randomly. The function tsboot works only if I need to check a statistic; I am interested in just rearranging the data while keeping its structure. The problem is defined as follows:
1. I define the block length, b.
2. I divide the original time series of length n into blocks of length b, which gives k = n/b subsamples.
3. I need to generate a random vector of integers from 1 to k.
4. Let Z*(j), for j = 1, ..., k, be the j-th row of a matrix whose number of rows equals the number of blocks and whose number of columns equals the number of simulations.
5. Assign to each Z*(j) the blocks according to the generated random vector (each column of the matrix is a different order of permutation).
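The five steps above can be sketched in a few lines of R using only sample(). This is a minimal illustration under stated assumptions, not a definitive implementation: the names permute_blocks, n_sim, and perm are hypothetical, n is assumed to be an exact multiple of b, and the blocks are drawn without replacement so that each simulation is a pure rearrangement of the original blocks.

```r
# Sketch: randomly rearrange non-overlapping blocks of a series.
# permute_blocks is a hypothetical helper; it assumes length(x) %% b == 0.
permute_blocks <- function(x, b, n_sim) {
  n <- length(x)
  stopifnot(n %% b == 0)
  k <- n / b                                 # step 2: number of blocks
  blocks <- matrix(x, nrow = b, ncol = k)    # column j holds block j
  # steps 3-4: k x n_sim matrix, each column is one random ordering of 1..k
  perm <- matrix(replicate(n_sim, sample(k)), nrow = k)
  # step 5: concatenate the blocks in the order given by each column
  series <- apply(perm, 2, function(p) as.vector(blocks[, p]))
  list(perm = perm, series = series)         # series is n x n_sim
}

set.seed(1)
x <- c(3, 6, 7, 2, 1, 5)
res <- permute_blocks(x, b = 3, n_sim = 4)
res$perm     # block orderings, one per column
res$series   # rearranged series, one per column
```

Replacing sample(k) with sample(k, replace = TRUE) would turn the permutation into a non-overlapping block bootstrap, which is closer to what tsboot() does.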
For future reference, please provide reproducible code as per the posting guidelines. It makes it easier for others to help you. Also, please use a descriptive subject, as we all get quite a lot of mail.

Your procedure appears incorrect. Your steps 3-5 look like a homework assignment, so I'm going to ignore those and focus on the block bootstrap, which has some applicability to other members of this list in financial time series analysis.

I suspect that you simply misunderstood the "statistic" parameter of tsboot(). I expect that you do indeed intend to use the bootstrapped data to calculate one or more statistics; this is what the statistic parameter is for.

Block bootstrapping works by randomly sampling blocks of length l from your original series. The tsboot function also applies one or more statistics to the bootstrapped data, and uses the multiple samples to calculate the bias and standard error for those statistics, providing you with a sensitivity analysis for those statistics on your data.

Using the data series "acme" included with R (in the boot package), you would do something like:

library(boot)
library(PerformanceAnalytics)
data(acme)
# calculate the sensitivity of standard deviation on the data:
tsboot(tseries=acme[,2],statistic=sd,R=1000,l=12,sim="fixed",endcorr=FALSE,n.sim=1000)
# use blocks of length 12 (one year) to
# create 1000 bootstrapped time series
# each of length 1000 observations
#Returns:
#Bootstrap Statistics :
#      original         bias    std. error
#t1* 0.05362889 0.0001614213 0.001925484

# calculate sensitivity of VaR:
tsboot(tseries=acme[,2],statistic=VaR.CornishFisher,R=1000,sim="fixed",l=12,endcorr=FALSE,n.sim=1000)
#Returns:
#Bootstrap Statistics :
#     original        bias    std. error
#t1* 0.227064 0.009412978 0.007284343

Normally, this is what you want. The random bootstrapped series itself is not useful to you, except to calculate a statistic or statistics of interest, and understand their sensitivity.

If you want the bootstrapped series returned, you can modify the code of the tsboot function to do what you want. If you want to apply your steps 3-5 to the bootstrapped data, see the documentation of tsboot() for an example of defining a function to use as the statistic parameter.

Regards,
- Brian
Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
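Brian's suggestion of supplying your own function as the statistic parameter can also be used to get the resampled series back without modifying tsboot(). The sketch below assumes that an identity function is an acceptable "statistic"; in that case each row of the returned t component should hold one bootstrapped series. The names market, bs, and boot_series are illustrative only.

```r
library(boot)   # provides tsboot() and the acme data set

data(acme)
market <- acme[, 2]   # same column as in the tsboot() calls above

set.seed(42)
# Assumption: the identity function makes tsboot() store each resampled
# series itself rather than a summary of it.
bs <- tsboot(tseries = market, statistic = function(x) x,
             R = 5, l = 12, sim = "fixed", endcorr = FALSE)

dim(bs$t)                 # R rows, one resampled series per row
boot_series <- t(bs$t)    # transpose: one series per column
```

With the default n.sim, each resampled series has the same length as the original, and steps 3-5 from the question could then be applied to the columns of boot_series.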
Matthieu Stigler wrote, in reply to Brian G. Peterson:
Thanks, Brian, for these examples!
Even if it is homework, I would be really interested in the
answer ;-) This is a question I have always wanted to ask, so maybe
now is the right time. I looked at the source code of tsboot() but got lost.
Does anyone have an idea how to generate block resamples with the
function sample(), with both overlapping and non-overlapping blocks? That
is (example taken from Maddala and Li 1998, bootstrapping
cointegrating relationships, Journal of Econometrics 80(2), also in
their book Unit Roots, Cointegration, and Structural Change, page 328), you pick blocks:
if the series is {3, 6, 7, 2, 1, 5}
- non-overlapping: {(3,6,7), (2,1,5)}
- overlapping: {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}
and then sample those blocks with replacement. I don't have a clear idea
of how to do that in R... Thanks!
a <- 1:100
boot1 <- sample(a, replace=TRUE) # block length 1
Hello.
Some time ago I wrote a seminar paper about regressing the oil price on the CDAX. There, I implemented a nonparametric block bootstrap by hand, because I needed to resample pairs of blocks. I worked with sample(). The code could be optimized further, but it should give the idea of block sampling:
The example:
if the series is {3, 6, 7, 2, 1, 5}
- non-overlapping: {(3,6,7), (2,1,5)}
- overlapping: {(3,6,7), (6,7,2), (7,2,1), (2,1,5)}
For the overlapping case you have 4 blocks of length 3. Expressed in time indices, the blocks have the following structure:
1:3
2:4
3:5
4:6
So one has to resample the starting indices 1:4 and, for each drawn start, take the indices from start to start+2 to grab the right data:
x <- c(3,6,7,2,1,5)
x_sample <- numeric(4*3)   # 4 blocks, each of length 3
mean_boot <- numeric(10000)
for (i in 1:10000)
{
  for (j in 0:3)
  {
    idx <- sample(1:4,1,replace=TRUE)   # the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
Next, the non-overlapping example with 2 blocks. Here we have the following time structure:
1:3
4:6
So one has to resample the starting indices 1 and 4 and take the indices from start to start+2 to grab the data. For a longer series, one would notice that the starting indices follow the sequence 3*t+1 for t = 0, 1, ..., so one only has to draw t with replacement from 0:1 (or, equivalently, draw from 1:2 and subtract 1):
x <- c(3,6,7,2,1,5)
x_sample <- numeric(2*3)   # 2 blocks of length 3
mean_boot <- numeric(10000)
for (i in 1:10000)
{
  for (j in 0:1)
  {
    idx <- sample(0:1,1,replace=TRUE)   # the block number t
    idx <- 3*idx+1                      # the starting index
    x_sample[(3*j+1):(3*j+3)] <- x[(idx):(idx+2)]
  }
  mean_boot[i] <- mean(x_sample)
}
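The two loops above differ only in which starting indices may be drawn, so they can be folded into one helper. The following is a rough sketch, not the code from the seminar paper: block_resample is a hypothetical name, and it draws just enough blocks to cover the original length n and then trims, whereas the overlapping loop above builds a series of 4 blocks (12 observations).

```r
# Sketch: draw one block-bootstrap replicate of x using blocks of length b.
# block_resample is a hypothetical helper, not part of any package.
block_resample <- function(x, b, overlapping = TRUE) {
  n <- length(x)
  stopifnot(b >= 1, b <= n)
  # admissible block starting indices
  starts <- if (overlapping) seq_len(n - b + 1) else seq(1, n - b + 1, by = b)
  n_blocks <- ceiling(n / b)   # enough blocks to cover n observations
  # sample.int() on positions avoids surprises when only one start exists
  chosen <- starts[sample.int(length(starts), n_blocks, replace = TRUE)]
  # expand each start s into the indices s, s+1, ..., s+b-1
  idx <- as.vector(outer(0:(b - 1), chosen, `+`))
  x[idx[seq_len(n)]]           # trim to the original length
}

set.seed(123)
x <- c(3, 6, 7, 2, 1, 5)
r_overlap <- block_resample(x, b = 3, overlapping = TRUE)
r_nonoverlap <- block_resample(x, b = 3, overlapping = FALSE)

# bootstrap distribution of the mean, as in the loops above
mean_boot <- replicate(10000, mean(block_resample(x, b = 3)))
```

Drawing the starts in one vectorized call and expanding them with outer() avoids the inner loop over blocks; the outer loop over replications becomes replicate().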
And if you ask yourself what block length would be the right one: Politis and White (2004) have the answer: http://econ.ucsd.edu/~mbacci/white/pub_files/hwcv-093.pdf
Also pay attention to the corrections to the algorithm: http://www.economics.ox.ac.uk/members/andrew.patton/SBblockCORRECTION_jan08.pdf
But most importantly, pay attention to the R implementation:
http://www.math.ucsd.edu/~politis/SOFT/PPW/ppw.R
Hope it works... It has been some time since I played around with it, but maybe it is some food for thought :)
Matthias.
--- Matthieu Stigler <matthieu.stigler at gmail.com> wrote on Tue, 17 Mar 2009:
From: Matthieu Stigler <matthieu.stigler at gmail.com>
Subject: Re: [R-SIG-Finance] help (regarding block bootstrap)
To: r-sig-finance at stat.math.ethz.ch
CC: "Yana Roth" <yana.roth at yahoo.com>
Date: Tuesday, 17 March 2009, 14:35
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. -- If you want to post, subscribe first.