Difference between "tree" and "rpart"

2 messages · Dr Carbon, Brian Ripley

#
In the help for rpart it says, "This differs from the tree function
mainly in its handling of surrogate variables." And it says that an
rpart object is a superset of a tree object. Both cite Breiman et al.
1984. Both call external code which looks like Martian poetry to me.

I've seen posts in the archives where BDR, and other knowledgeable
folks, have said that rpart() is to be preferred over tree().

Is there a simple reason why? They use the same fundamental algorithm.
Are there differences in processing time? Bells and whistles?

TIA, DRC
#
rpart does much more at the C level, including pruning and
cross-validation, so it can be much faster.
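
As a rough illustration of that point, here is a minimal sketch, assuming
the rpart package and its bundled kyphosis data set are available. The
choice of 10 folds and of the lowest-xerror rule for picking cp are
assumptions made for the example, not recommendations from this thread.

```r
library(rpart)

# Fit a classification tree. xval = 10 asks for 10-fold cross-validation,
# which rpart carries out as part of the fit itself.
set.seed(1)  # the cross-validation folds are random
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", control = rpart.control(xval = 10))

# The complexity table holds the cross-validated error (xerror)
# for each candidate subtree.
printcp(fit)

# Prune back to the subtree with the lowest cross-validated error.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned <- prune(fit, cp = best_cp)
```

With tree, the comparable steps are separate calls after the fit
(cv.tree() and prune.tree()).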

It is also user-extensible.

tree was actually written to track down bugs in the then S implementation, 
and so is much closer to the functionality in S.  It is not where I would 
have started from.  It is really only available for R to support MASS and 
PRNN (my books).
On Wed, 4 May 2005, Dr Carbon wrote: