Hasbroucks Information Share in R

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20120711/27502f33/attachment.pl>
I am a recent R converter so please bare with me if I ask any stupid
questions. I am open to both constructive criticism and blatant mockery. I
have been working on converting the Joel Hasbrouck Information share code
originally for SAS and Implementing it into R. I have a working code but
for some reason I am about 10% of the actual values, if there is anyone
more experienced in this area I would much appreciate any advice availlable.
Drew,

Thanks for taking the time to provide code and describe your problem, 
but what you've provided isn't reproducible.

You'll need to provide a small data set and expected results (presumably 
from the SAS code) before anyone will be able to examine your code and 
sort out why they differ.

You say 'about 10%', so I assume that this doesn't mean 'precisely 10%' 
which would likely be a simple multiplier change.

Have you looked at all the intermediary steps in the original SAS code 
and compared these to the R code?  obviously this won't be possible at 
every step. but I'd assume it should be at most.

etc etc etc.

much easier to help if people had data to compare to.

Regards,

    - Brian
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock
Hi Drew,

In addition to Brian's comments, a few stylistic / performance notes
below. I can't check content, but hopefully you'll find the resulting
code a little clearer and more idiomatic.
Hi all,

I am a recent R converter so please bare with me if I ask any stupid
questions. I am open to both constructive criticism and blatant mockery. I
have been working on converting the Joel Hasbrouck Information share code
originally for SAS and Implementing it into R. I have a working code but
for some reason I am about 10% of the actual values, if there is anyone
more experienced in this area I would much appreciate any advice availlable.

I know there is a better method in a previous post by Niddhi Aggarwaul with
a very coherent code for calculating a 2 dimensional case but I really need
an n dimensional case which should be represented by my code being ideally
the same as Hasbrouk's. My code is below, x refers to a matrix of parallel
midpoint stock quotes of n markets.

#find the optimal lag length
lags <- VARselect(x, lag.max=100)$selection[1]

#Johansens Cointegration test is applied and then estimates the VECM with
the estimated Beta
cointest <- ca.jo(x, K=lags, type="eigen", ecdet="const",
spec="transitory")
#summary(cointest)
    vecm <- cajorls(cointest)

#Breaks the VECM down into VAR by levels as in Hasbrouck
    var <- vec2var(cointest)
#covariance of forecast error response
FER <- fevd(var, n.ahead=ImpulseResponse,cumulative=TRUE, boot=FALSE)

#covariance matrix of innovations
d <- dim(FER[[1]])
for (i in 1:n) {
errvar <- FER[[i]]
if (i == 1) {CovIn <- errvar[(d[1]),]}
if (i != 1) {CovIn <- rbind(CovIn, errvar[(d[1]),])}
}
It's generally going to give you worlds better performance to make a
list of objects and then rbind() them all at once. (Consider the
equivalent C level operations of malloc()ing and free()ing on each
iteration, with all the work that entails) You can also avoid an
(explicit) for loop by doing something like:

do.call(rbind, lapply(FER, function(x) x[d[1],]))
#long run cumulative response functions
CRF <- irf(var, n.ahead=ImpulseResponse, ortho=TRUE, cumulative=FALSE,
boot=FALSE)
#matrix of long run cumulative response functions
d <- dim(CRF$irf[[1]])
for (i in 1:n) {
lrcrf <- CRF$irf[[i]]
if (i == 1) {LRCoeffs <- lrcrf[(d[1]),]}
if (i != 1) {LRCoeffs <- rbind(LRCoeffs, lrcrf[(d[1]),])}
}
As before, something like

do.call(rbind, lapply(CRF, function(x) x[d[1],]))
 Cdiag  <- diag(CovIn)
sqrtd <- sqrt(Cdiag)
sd  <- diag(sqrt(sqrtd)
Are you sure you mean to sqrt() twice in here?
cor   <- sd %*% CovIn %*% sd
cd  <- t(chol(cor))
VarContrib     <- LRCoeffs %*% cd
Probably a little easier to use the tcrossprod() function:

VarContrib <- tcrossprod(LRCoeffs, chol(cor))
VarTotal       <- colSums(VarContrib)
VarTotalMatrix <- VarTotal
 for (i in 1:(VectorCount-1)) {
See below, but this sort of syntax is sometimes fraught with peril
because it gives a loop of length 2, not 0 if VectorCount = 1;
specifically, it loops for i = 1 and then i = 0.
VarTotalMatrix <- rbind(VarTotalMatrix,VarTotal)}
PC <- VarContrib / (VarTotalMatrix)
 if (j == 1) {
PropCont <- colMeans(t(PC))}
if (j != 1) {
PropCont <- rbind(PropCont, colMeans(t(PC)))}
}
I'll let you modify this one like above, but note there's also a
rowMeans() function which will be faster than colMeans(t(...))
rm(InformationShares)
colnames(PropCont) <- InputVectors
InformationShares  <-colMeans(PropCont)
I'd recommend you wrap this all up in a function: then you don't have
to worry about freeing memory with rm() [It just happens when the
function call ends] and your code will be more reusable. If you do so,
it's not totally necessary, but it'll be easier to read if you add
return(InformationShares) at the end. [Some prefer just
InformationShares without a return statement but I find that
marginally less obvious: you actually wouldn't need either here
because of some subtleties about <- being a function as well, but
don't worry about that here.]

Disclaimer: I did the do.call(rbind, ...) bits real quick: they might
not be quite perfect.

Best,
Michael
        [[alternative HTML version deleted]]

_______________________________________________
R-SIG-Finance at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.