Hello, my apologies if I do something wrong; this is my first posting.

I am trying to apply PCA to the daily history of a set of forward curves and have reached the limits of my knowledge. I would appreciate some help.

My aim is to use PCA for risk control, i.e. estimate the eigenvectors and eigenvalues and build the principal components at some confidence level, e.g. 95%. If, for example, we were looking at the first 3 components only, I would estimate PC1up, PC1dn, PC2up, PC2dn, PC3up and PC3dn. Let's assume that
- PC1up is worse for my position than PC1dn,
- PC2up is worse than PC2dn, and
- PC3dn is worse than PC3up.
I would then 'add' these worse-for-me components (PC1up, PC2up and PC3dn) and run my position through them to get a measure of risk at that confidence level.

To do the PCA, I first found the log returns; let's call them Returns. I then do:

    pcdat <- princomp(Returns, cor=TRUE)

and calculate the principal components like this (this is where I am very foggy...):

    PC <- exp(someQuantile*t(pcdat[[2]])*sqrt(pcdat[[1]])*sd(Returns))  # someQuantile = 1.64 for a 95% CL

As much as I looked around, people discuss the benefits of PCA but not how to recombine the principal components at some confidence interval to get a shocked curve. Could anyone help?

Thank you,
Benji
PCA in Risk Control with R
7 messages · Benji Famel, Brian G. Peterson, Sarbo +1 more
Why don't you disguise a subset of your data and provide a working example? Both you and the list will get more out of it if we can all work on something that is actually executable in R, per the posting guide. Your problem is interesting and relevant, so put a little more effort into it, and I'm sure you'll get collaborators in working through it. Regards, - Brian
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. If you want to post, subscribe first. -- Also note that this is not the r-help list where general R questions should go.
Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
I like the idea, and I have attached a sample of data. I think this
should work as the file is not a binary one. The data represents
daily NYMEX data for Natural gas.
The columns are flat prices (not returns) and represent:
1. Date
2. Prompt contract
3. Back contract (2nd month out)
4. Far contract (3rd month out)
5. etc.
After transferring the data to R through RExcel as MktData, I execute
the following code:

MktReturns.d <- MktData
# ----------- Data Preparation ------------------
for (i in 1:ncol(MktData)) {
  MktReturns.d[,i] <- Fin.Calcs.logreturns(x=MktData[,i], deltaT=1, pad=TRUE)
}
MktReturns.d <- na.omit(MktReturns.d)

# PERFORM PCA ON DAILIES (good for 1-day risk; if I wanted the
# weekly risk, I would work with weekly returns)
# cor=TRUE uses the correlation matrix, so there is NO need to scale
pcdat.d <- princomp(MktReturns.d, cor=TRUE)
# standard deviation and proportion of variance for each component
the.summary.d <- summary(pcdat.d)
# how much each variable contributes to each component
the.loadings.d <- loadings(pcdat.d)
# scores of each observation on each component
the.scores.d <- pcdat.d$scores

whichQuantile <- quantile(rnorm(1000000), probs=c(0.95))
PC <- exp(whichQuantile*t(pcdat.d[[2]])*sqrt(pcdat.d[[1]])*sd(MktReturns.d))
# note that if I wanted to work with daily returns but calculate the
# 1-week risk, I would multiply the above by sqrt(5)
Hope this helps.
Benji
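
[Editorial sketch] The helper Fin.Calcs.logreturns used above was not posted, so the data-preparation step cannot be run as written. The following is a minimal sketch under the assumption that the helper computes plain one-day log returns. The price matrix is simulated and purely illustrative (it is not the attached NYMEX file), and the names prices, log_returns and z95 are hypothetical. It also shows that qnorm(0.95) gives the exact normal quantile that the rnorm simulation only approximates.

```r
# Hypothetical stand-in data: simulated prices for 4 contracts over 250 days.
# (The real NYMEX file from the post is not reproduced here.)
set.seed(1)
n_days <- 250
n_contracts <- 4
shocks <- matrix(rnorm(n_days * n_contracts, sd = 0.02), n_days, n_contracts)
prices <- 5 * exp(apply(shocks, 2, cumsum))
colnames(prices) <- paste0("M", 1:n_contracts)

# One-day log returns, column by column: log(P_t) - log(P_{t-1}).
# Assumption: this is what Fin.Calcs.logreturns(deltaT = 1) is meant to do.
log_returns <- diff(log(prices))

# PCA on the correlation matrix, as in the post.
pc <- princomp(log_returns, cor = TRUE)

# Exact one-sided 95% standard normal quantile (about 1.645).
z95 <- qnorm(0.95)
```

With cor = TRUE the component variances (pc$sdev^2) sum to the number of contracts, which is a quick sanity check on the fit.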
-------------- next part -------------- A non-text attachment was scrubbed... Name: NYMEX curves.csv Type: application/octet-stream Size: 376188 bytes Desc: not available URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20100216/33a4c680/attachment.obj>
Sarbo,

Thank you. This answers how you rebuild the curve. Now this may be a silly question, but how do you shock the resulting curve to a given confidence level?

Thanks again,
Benji
On Feb 16, 2010, at 8:02 PM, Sarbo <cmdr_rogue at hotmail.com> wrote:
Hi Benji- you're in luck. I've done exactly this sort of thing in the
past. Here is the code that I wrote to do the job:
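BuildPCACurves <- function(data, ncomps = 3){
  # Coerce to a matrix if needed (is.matrix is safe where class() has length > 1)
  if (!is.matrix(data)) {A <- as.matrix(data)} else A <- data
  # Cross-product matrix; note that A is not centered here
  M <- t(A) %*% A
  eigens <- eigen(M)$vectors
  # Flip the sign of the first eigenvector (a sign convention only)
  eigens[,1] <- -eigens[,1]
  # Diagonal 0/1 matrix that keeps only the first ncomps components
  selectors <- matrix(0, nrow = ncol(A), ncol = ncol(A))
  diagentries <- c(rep(1, ncomps), rep(0, ncol(A) - ncomps))
  diag(selectors) <- diagentries
  # Project the data onto the eigenvectors, then rebuild from the kept components
  coefficients <- A %*% eigens
  newcurves <- coefficients %*% selectors %*% t(eigens)
  # Mean and standard deviation of the row-to-row changes in each coefficient
  coef2 <- apply(coefficients, 2, diff)
  means <- apply(coef2, 2, mean)
  stdevs <- apply(coef2, 2, sd)
  # Reconstruction error summary, computed as in the original post
  RMSE <- sqrt(sum((apply(newcurves - A, 2, sum)) ^ 2))
  output = list(original = A, rebuilt = newcurves, means = means[1:ncomps],
                stds = stdevs[1:ncomps], RMSE = RMSE)
  return(output)
}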
This doesn't use the actual "princomp" function in R, but it relies on
the same underlying matrix theory behind PCA itself. (Unlike princomp,
it does not center the data before the eigendecomposition.)
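
[Editorial sketch] To make the link between the eigendecomposition and R's built-in PCA concrete, here is a small check on simulated data. It is not part of Sarbo's code: the data and the names X, Xc, S, e, p, k and rebuilt are hypothetical. It assumes the columns are centered first, in which case the eigenvalues of the sample covariance matrix match the component variances from prcomp, and the eigenvectors match its rotation up to sign.

```r
set.seed(42)
# Hypothetical data: 200 observations of 3 correlated variables.
mix <- matrix(c(1, 0.5, 0.2,
                0, 1,   0.3,
                0, 0,   1), 3, 3)
X <- matrix(rnorm(200 * 3), 200, 3) %*% mix

# Center each column, then eigendecompose the sample covariance matrix.
Xc <- scale(X, center = TRUE, scale = FALSE)
S <- crossprod(Xc) / (nrow(Xc) - 1)   # same as cov(X)
e <- eigen(S, symmetric = TRUE)

# prcomp centers the data and works via a singular value decomposition.
p <- prcomp(X, center = TRUE, scale. = FALSE)

# Eigenvalues are the component variances.
max_var_diff <- max(abs(e$values - p$sdev^2))
# Eigenvectors agree with the rotation matrix up to sign.
max_vec_diff <- max(abs(abs(e$vectors) - abs(unclass(p$rotation))))

# Rebuild the data from the first k components and add the mean back.
k <- 2
scores <- Xc %*% e$vectors[, 1:k]
rebuilt <- sweep(scores %*% t(e$vectors[, 1:k]), 2, colMeans(X), "+")
```

prcomp is used for the comparison because princomp divides by N rather than N - 1 when forming the covariance matrix, so its variances differ from eigen(cov(X)) by a factor of (N - 1)/N.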
Hello Julien,

Thank you for your response. I get point 1.

On point 2, stating your comment in another way (to make sure I understand it): since we do not have a distribution of the normal modes, no scale factor exists by which to shock them.

Here is my thinking:
1. Work with returns.
2. Calculate the sigma of these returns (assuming a normal distribution for ease).
3. Scale by sigma (which explains why it appears as a multiplicative factor in step 5).
4. Calculate the normal modes.
5. Return to the original coordinate system by the following (this might be somewhat incorrect but, if not, then it provides a way to stress at a given confidence level):

    PCA[i,m] = exp( FI(0.995) * N[i,m] * S[m] * sqrt(L[i]) )

    i      = i-th principal component
    m      = forward month
    FI     = inverse normal at a confidence level
    N[i,m] = eigenvectors
    S[m]   = std dev
    L[i]   = eigenvalue

So my way of stressing is scaling up that S by the "2.57" (the FI(0.995)). Is this wrong? Am I missing something?

Benji

On Wed, Feb 17, 2010 at 3:38 AM, julien cuisinier <j_cuisinier at hotmail.com> wrote:
Hi Benji,

Welcome to the list & good luck with the posting guide ;-)

I am not a PCA expert, but in my opinion there are 2 items in your question:

1. Run PCA on the original data set, choose the relevant factors/components, do something with these factors, then go back to the "original" data set. Check http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf; there is a small description of getting back to the old axis system.

2. Do something to the factors (risk focus): it seems you do factor-push stress testing, i.e. pushing each factor right/left by an arbitrary amount, picking the side hurting the portfolio the most, repeating this for each factor, and applying each factor's "worst" move to the portfolio. I therefore do not see where/why the last computation comes in (trying to get a risk measure at a certain confidence level). The component moves are arbitrary, and the resulting risk measure at the portfolio level will be too; no probabilities are associated, hence no confidence level.

One can decide to model each component with a parametric distribution and then derive whatever risk measure is relevant; the big advantage of PCA is that you do not have to care about component co-movement, as long as the assumption of linearity is acceptable. But this is simply another approach, not something on top of the factor push you are trying.

Hope this helps.

Rgds,
Julien

PS: the "parametric approach" used in your first post on top of the factor-push stuff seems to use a simple normal-distribution VaR. If you go that route, I would probably start by trying to model volatility clustering (GARCH) instead of relying on simple historical estimates... I hope I understood your post well; I do not mean to lecture you, of course.
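
[Editorial sketch] The factor push discussed in this exchange can be sketched in R as follows. This is one possible reading of Benji's formula, not a confirmed method from the thread. The returns are simulated, and position, curve, pnl, shock_up and shock_dn are hypothetical names introduced for illustration. Note that princomp()$sdev already holds the square roots of the eigenvalues, so no further sqrt() is applied to it here.

```r
set.seed(7)
# Hypothetical data: simulated daily log returns for 5 forward months.
n <- 500
m <- 5
common <- rnorm(n, sd = 0.02)   # shared driver across the curve
returns <- sapply(1:m, function(j) common * (1 - 0.1 * j) + rnorm(n, sd = 0.005))
colnames(returns) <- paste0("M", 1:m)

pc <- princomp(returns, cor = TRUE)
N <- unclass(pc$loadings)        # eigenvectors, one column per component
sqrtL <- pc$sdev                 # already sqrt(eigenvalue)
S <- apply(returns, 2, sd)       # per-contract return volatility
z <- qnorm(0.95)                 # assumed one-sided 95% normal quantile

# Multiplicative price shock per contract for a +z or -z move in component i.
shock_up <- function(i) exp( z * N[, i] * S * sqrtL[i])
shock_dn <- function(i) exp(-z * N[, i] * S * sqrtL[i])

# Hypothetical position (contracts held per month) and current forward curve.
position <- c(10, -5, 3, 0, -2)
curve <- c(5.0, 5.1, 5.2, 5.3, 5.4)

# P&L of the position when the curve is multiplied by the shock.
pnl <- function(mult) sum(position * curve * (mult - 1))

# For each of the first 3 components, keep the direction that hurts more.
worst <- sapply(1:3, function(i) min(pnl(shock_up(i)), pnl(shock_dn(i))))
factor_push_loss <- sum(worst)
```

As Julien points out, adding the per-component worst cases gives a stress figure and not a quantile of the portfolio P&L distribution, so factor_push_loss should not be read as a 95% VaR.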