
Recursive partitioning with multicollinear variables

On Mon, 9 Feb 2004 11:24:39 +0100
"Jean-Noel" <jean-noel.candau at avignon.inra.fr> wrote:

            
A more accurate and stable result would be obtained by performing a data
reduction procedure that ignores the response variable.  Combining
collinear variables into an index is often better than arbitrarily
choosing between them.  Then use the indexes in a regression model rather
than in recursive partitioning, unless you have tens of thousands of
observations or are using bagging of trees or a related procedure to
cancel out the instability of the tree-growing process [which
unfortunately will often result in an average of trees that looks more
complex than a regression model].
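
A minimal sketch of that workflow, for illustration only.  It assumes the
first principal component of the standardized collinear variables is an
acceptable index (one of several possible data reduction choices), and it
uses simulated data.  The function name first_pc_index and the variables
latent, X and y are hypothetical, not from the original question.  The
point is that the index is built without ever looking at the response.

```python
import numpy as np

def first_pc_index(X):
    """Combine the columns of X into one index: the first principal
    component of the standardized columns. The response is never used."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    w = Vt[0]
    # The sign of a principal direction is arbitrary; orient it so the
    # loadings sum to a positive number.
    if w.sum() < 0:
        w = -w
    return Z @ w

# Simulated example (assumption): three strongly collinear predictors
# driven by a single latent factor, plus a noisy response.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.1 * rng.normal(size=n) for _ in range(3)])
y = 2.0 * latent + rng.normal(size=n)

index = first_pc_index(X)                      # data reduction, blind to y
A = np.column_stack([np.ones(n), index])       # intercept + index
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
print("intercept, slope:", coef)
```

The regression then has a single stable coefficient for the index instead
of three unstable ones competing for the same signal.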

Frank
---
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University