I had been using Minitab in my MBA program at NYU, but my professor of regression and data analysis Jeff Simonoff suggested that ambitious students might try R. I didn't use it for his course but gradually taught myself to perform regression analysis and diagnostics with R. I'm definitely not from a mathematical/statistical background. I've been driven to statistics by a desire to test assumptions I form about fundamental company research and equity analysis. Most equity research people tend to not check diagnostics very well, and thus violate basic assumptions I think, because they'd rather be overconfident about their assumptions. That's an over-generalization though.
I've been doing valuation studies within industries, looking at how some valuation measures relate to company characteristics and external factors over time. My professor suggested using mixed effects models for such longitudinal data studies, and I don't have a budget for this sort of thing, so using R and the NLME package was a natural choice. I think I'm doing some things with it I haven't seen other analysts do.
I notice that most financial users of R and S-Plus tend to be focused on derivatives and time-series applications. I haven't seen many approaching time series data from a multi-factor relationship perspective. Anyone else seen good applications, publications on that kind of analysis?
Regards,
Andrew West
Using R in equity research
4 messages · Andrew West, Dirk Eddelbuettel, Ajay Shah
Andrew,
On Sun, Jun 06, 2004 at 05:00:09PM -0700, Andrew West wrote:
> I had been using Minitab in my MBA program at NYU, but my professor of regression and data analysis Jeff Simonoff suggested that ambitious students might try R. I didn't use it for his course but gradually taught myself to perform regression analysis and diagnostics with R. I'm definitely not from a mathematical/statistical background. I've been driven to statistics by a desire to test assumptions I form about fundamental company research and equity analysis. Most equity research people tend to not check diagnostics very well, and thus violate basic
I think that is generally true 'on the street'.
> assumptions I think, because they'd rather be overconfident about their assumptions. That's an over-generalization though. I've been doing valuation studies within industries, looking at how some valuation measures relate to company characteristics and external factors over time. My professor suggested using mixed effects models for such longitudinal data studies, and I don't have a budget for this sort of thing, so using R and the NLME package was a natural choice. I think I'm doing some things with it I haven't seen other analysts do.
Sounds interesting. Do you have any write-ups?
> I notice that most financial users of R and S-Plus tend to be focused on derivatives and time-series applications. I haven't seen many approaching time series data from a multi-factor relationship perspective. Anyone else seen good applications, publications on that kind of analysis?
From what I know, it is used fairly extensively in portfolio management (for equity as well as other portfolios), but I don't have a quick reference I could point you to for more. Suggestions, anyone?
Best regards,
Dirk
> I've been doing valuation studies within industries, looking at how some valuation measures relate to company characteristics and external factors over time. My professor suggested using mixed effects models for such longitudinal data studies, and I don't have a budget for this sort of thing, so using R and the NLME package was a natural choice. I think I'm doing some things with it I haven't seen other analysts do.
I'm not sure I understand what you are doing, but I've often thought about doing the following: Suppose you estimate regression models _within_ a homogeneous industry, where you put P/E or P/B on the l.h.s. and you use a bunch of firm characteristics as explanatory variables. Would the outliers be places to take a good look for a profit opportunity? (Is this what you have in mind?)
The problem with this (AFAICT) is that the cross section of accounting ratios / data tends to be pretty nasty in terms of distributions. You'll always have a few weird observations which drive the result. R might be particularly good at this, by virtue of bringing a variety of statistical and graphical tools to bear on weird observations, non-normal distributions, etc.
All this is just guesswork, I haven't actually done it. If you have, do show us examples?
Ajay Shah
Consultant, Department of Economic Affairs, Ministry of Finance, New Delhi
ajayshah@mayin.org
http://www.mayin.org/ajayshah
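A minimal sketch of the within-industry regression idea in the message above; it is not code from the thread. It regresses a log valuation multiple on two firm characteristics and lists the observations that standard diagnostics single out. The data are simulated, and the variable names (`log_pe`, `growth`, `beta`), the planted odd observation, and the cutoffs are all assumptions made for illustration.

```r
# Simulated cross-section for one hypothetical industry (not real data).
set.seed(1)
n <- 30
firms <- data.frame(
  firm   = paste0("F", seq_len(n)),
  growth = rnorm(n, mean = 0.08, sd = 0.03),  # assumed expected growth
  beta   = rnorm(n, mean = 1.0,  sd = 0.2)    # assumed market beta
)

# Log P/E generated from an assumed linear relation plus noise.
firms$log_pe <- 2.5 + 4 * firms$growth - 0.3 * firms$beta +
  rnorm(n, sd = 0.15)

# Plant one deliberately odd observation so there is something to find.
firms$log_pe[5] <- firms$log_pe[5] + 1.5

# Cross-sectional regression within the industry.
fit <- lm(log_pe ~ growth + beta, data = firms)

# Externally studentized residuals and Cook's distances per firm.
firms$rstud <- rstudent(fit)
firms$cook  <- cooks.distance(fit)

# Rule-of-thumb cutoffs (assumptions, not fixed rules):
# |studentized residual| > 3 or Cook's distance > 4 / n.
flagged <- firms[abs(firms$rstud) > 3 | firms$cook > 4 / n, ]
print(flagged[order(-abs(flagged$rstud)),
              c("firm", "log_pe", "rstud", "cook")])
```

A flagged firm is only a candidate for a closer look: the same diagnostics pick up data errors and one-off accounting items as readily as mispricing, which is the "weird observations" problem raised above. `plot(fit)` gives the usual graphical versions of these checks.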
Ajay,
Yes, that's part of what I've been working on lately. I cover companies in the industrial and materials sector. I can't do research just trying to apply the major academic studies done on large universes, by increasing my loadings on SMB, HML, earnings surprises, and putting negative loadings on accruals (though it's good to keep in mind). That kind of strategy doesn't work so well within a universe of 30 companies and a one-year measurement horizon.
I mostly work on fundamental analysis and DCF models, but wanted to supplement that with value-added relative value analysis. The typical sell-side "based on average p/e blah blah" is quite weak, so I looked at what Aswath Damodaran suggested in his valuation book. He had some basic ideas of performing a regression on one industry at one point in time, regressing something like p/e on expected growth and beta, for example, or ev/ebitda on some other factors. But then he showed how the results vary widely year to year, performing separate regressions each year, and kind of just threw up his hands.
Fortunately, I've got the Compustat database at work, and can pull industry data like that for many points in time, so I can create a longitudinal set of data for an industry. I first tried piling multiple years and companies into one big pile and performing a regression on it, but I knew that would be very bad. I asked my former NYU professor of regression for advice on how to tackle such an analysis. He suggested using mixed-effects models, such as nlme in R, and after researching it, buying the Bates/Pinheiro book, and performing some analyses, it definitely makes more sense using this approach. Then I tie the relationships that I find into my relative value models using my fundamental forecasts (stuff like growth, margins, expected leverage, etc.) as predicting factors.
I don't expect a tight fit; the aim is just to help make better-informed, more objective valuation estimates, which tend to be required of a sell-side analyst. I definitely have to transform variables, do variable weightings, look for outliers, check for autocorrelations, etc. The Rcmdr and nlme packages make these things reasonably easy to do, and the best thing about R is that I have the code for my study after it's done, so if I have second thoughts, I can go back in fairly easily.
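For readers who want to try the approach described above, here is a rough sketch using `nlme`; it is not the code or the specification actually used in this study. It simulates a firm-by-year panel, fits the pooled regression for comparison, then fits a mixed-effects model with a random intercept per firm, and finally adds an AR(1) error structure within each firm as one way to check for autocorrelation. The variable names (`log_ev_ebitda`, `growth`, `margin`), the coefficients, and the panel size are assumptions.

```r
library(nlme)

# Simulated firm-by-year panel for one hypothetical industry (not real data).
set.seed(42)
n_firms <- 30
n_years <- 10
panel <- expand.grid(year = seq_len(n_years),
                     firm = factor(seq_len(n_firms)))

# Persistent firm-level shifts in the valuation multiple (assumed).
firm_effect <- rnorm(n_firms, sd = 0.3)

panel$growth <- rnorm(nrow(panel), mean = 0.08, sd = 0.03)
panel$margin <- rnorm(nrow(panel), mean = 0.12, sd = 0.04)
panel$log_ev_ebitda <- 2.0 + 3 * panel$growth + 2 * panel$margin +
  firm_effect[as.integer(panel$firm)] + rnorm(nrow(panel), sd = 0.1)

# Pooled regression: every firm-year treated as an independent observation.
pooled <- lm(log_ev_ebitda ~ growth + margin, data = panel)

# Mixed-effects model: common slopes, random intercept for each firm.
mixed <- lme(log_ev_ebitda ~ growth + margin,
             random = ~ 1 | firm, data = panel)

# Same model with AR(1) errors within firm, ordered by year.
mixed_ar1 <- lme(log_ev_ebitda ~ growth + margin,
                 random = ~ 1 | firm,
                 correlation = corAR1(form = ~ year | firm),
                 data = panel)

print(coef(pooled))             # pooled estimates
print(fixef(mixed))             # fixed-effect estimates
print(VarCorr(mixed))           # between-firm vs residual variation
print(anova(mixed, mixed_ar1))  # does the AR(1) term improve the fit?
```

The random intercept absorbs firm-specific levels that the pooled fit pushes into its error term, which is one reason the "one big pile" regression is suspect. Random slopes (for example `random = ~ growth | firm`) or variance weights via the `weights` argument are natural next steps if the diagnostics call for them.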
_______________________________________________ R-sig-finance@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-sig-finance