party for prediction [REPOST]
On Oct 12, 2012, at 1:37 AM, Ed wrote:
> Apologies for re-posting, my original message seems to have been overlooked by the moderators.
No. Your original post _was_ forwarded to the list. On my machine it appeared at October 11, 2012 11:03:08 AM PDT. No one responded. It seems possible that its lack of data or code is the reason for that state of affairs.
David.

> ---------- Forwarded message ----------
> From: Ed <icelus2k5 at gmail.com>
> Date: 11 October 2012 19:03
> Subject: party for prediction
> To: R-help at r-project.org
>
> Hi there
>
> I'm experiencing some problems using the party package (specifically
> mob) for prediction. I have a real scalar y I want to predict from a
> real valued vector x and an integral vector z. mob seemed the ideal
> choice from the documentation.
>
> The first problem I had was at some nodes in a partitioning tree, the
> components of x may be extremely highly correlated or effectively
> constant (that is x are not independent for all choices of components
> of z). When the resulting fit is fed into predict() the result is NA -
> this is not the same behaviour as models returned by say lm which
> ignore missing coefficients. I have fixed this by defining my own
> statsModel (myLinearModel - imaginative) which also ignores such
> coefficients when predicting.
>
> The second problem I have is that I get "Cholesky not positive
> definite" errors at some nodes. I guess this is because of numerical
> error and degeneracy in the covariance matrix? Any thoughts on how to
> avoid having this happen would be welcome; it is ignorable though for
> now.
>
> The third and really big problem I have is that when I apply mob to
> large datasets (say hundreds of thousands of elements) I get a
> "logical subscript too long" error inside mob_fit_fluctests. It's
> caught in a try(), and mob just gives up and treats the node as
> terminal. This is really hurting me though; with 1% of my data I can
> get a good fit and a worthwhile tree, but with the whole dataset I get
> a very stunted tree with a pretty useless prediction ability.
>
> I guess what I really want to know is:
> (a) has anyone else had this problem, and if so how did they overcome it?
> (b) is there any way to get a line or stack trace out of a try()
> without source modification?
> (c) failing all of that, does anyone know of an alternative to mob
> that does the same thing; for better or worse I'm now committed to
> recursive partitioning over linear models, as per mob?
> (d) failing all of this, does anyone have a link to a way to rebuild,
> or locally modify, an R package (preferably windows, but anything
> would do)?
>
> Sorry for the length of this post. If I should RTFM, please point me
> at any relevant manual by all means. I've spent a few days on this as
> you can maybe tell, but I'm far from being an R expert.
>
> Thanks for any help you can give.
>
> Best wishes,
>
> Ed

David Winsemius, MD
Alameda, CA, USA
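
On question (b) in the quoted post, here is a minimal sketch of how a call stack can be recorded at the point of an error in base R, using withCallingHandlers() inside tryCatch(). The functions inner(), outer() and with_trace() are hypothetical stand-ins and are not part of party; the error message is only borrowed from the post. One caveat: the calling handler has to sit inside whatever try() catches the error, so this illustrates the technique on code you control and would not by itself reach into a try() buried in mob.

```r
# Hypothetical stand-in for a function that fails deep inside a call chain.
inner <- function(idx, x) {
  if (length(idx) > length(x)) stop("logical subscript too long")
  x[idx]
}

# Hypothetical caller that passes an index one element too long.
outer <- function(x) inner(rep(TRUE, length(x) + 1L), x)

# Evaluate expr; if it signals an error, record the call stack as it was
# when the error was raised, then return the error object instead of failing.
with_trace <- function(expr) {
  calls <- NULL
  result <- tryCatch(
    withCallingHandlers(
      expr,
      # Calling handler: runs before the stack is unwound, so sys.calls()
      # still contains the frames that led to the error.
      error = function(e) {
        calls <<- sys.calls()
      }
    ),
    # Exiting handler: turns the error into an ordinary return value.
    error = function(e) e
  )
  list(result = result, calls = calls)
}

res <- with_trace(outer(1:3))
print(conditionMessage(res$result))
print(res$calls)
```

The printed list of calls includes the frames for outer() and inner(), which is roughly what traceback() would have shown had the error not been caught.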