
User defined split function in Rpart

1 message · Terry Therneau

The question is about the direction vector in rpart.
  
  There are (at least) two preferred ways to lay out a tree, with respect to 
which observations are sent left and which right.

    1. Send the smaller y values to the left.  In the final tree,  there will be 
a graphical ordering with smaller y's to the left and larger ones to the right.  
One has a "left bad, right good" orientation when traversing the tree.  I find 
that medical researchers often like this.
 
    2. Send observations with x < cutpoint to the left.  Setting all elements of 
the direction vector returned by the split function to -1 will give this 
behavior.  
    
    I happen to slightly prefer option 1, which of course means that it became 
the default behavior in rpart.  (For a categorical y with many levels, however, 
rpart orders on the percent of observations in category 1, which may not be 
particularly useful.)
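
    As a minimal sketch of the two options for a continuous predictor, the 
function below follows the layout of a user-written split function as I 
understand it from the rpart vignette: arguments (y, wt, x, parms, continuous), 
with y and wt already ordered by x, and a returned list holding goodness and 
direction, each of length n-1.  The name split_sketch and the parms$by_x 
switch are hypothetical, made up for this illustration, and the anova-style 
goodness measure is just one possible choice.

```r
# Hypothetical illustration, not rpart's internal code.
# Assumes y and wt arrive sorted by x, as in the user-split interface.
split_sketch <- function(y, wt, x, parms, continuous) {
    stopifnot(continuous)                  # categorical x is not handled here
    n <- length(y)
    y <- y - sum(y * wt) / sum(wt)         # center y at its weighted mean

    csum     <- cumsum(y * wt)[-n]         # weighted sum of y left of each cut
    left.wt  <- cumsum(wt)[-n]
    right.wt <- sum(wt) - left.wt
    lmean    <- csum / left.wt             # mean of centered y, left group
    rmean    <- -csum / right.wt           # mean of centered y, right group

    # Between-group sum of squares as a fraction of the total
    goodness <- (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * y^2)

    direction <- if (isTRUE(parms$by_x)) {
        rep(-1, n - 1)                     # option 2: x < cutpoint goes left
    } else {
        ifelse(lmean <= 0, -1, 1)          # option 1: smaller mean y goes left
    }
    list(goodness = goodness, direction = direction)
}

# y decreases with x, so the two options disagree at every cutpoint
y  <- c(5, 4, 3, 2, 1)
x  <- 1:5
wt <- rep(1, 5)
opt1 <- split_sketch(y, wt, x, parms = list(by_x = FALSE), continuous = TRUE)
opt2 <- split_sketch(y, wt, x, parms = list(by_x = TRUE),  continuous = TRUE)
```

    With option 1 the low-x group has the larger mean at every cutpoint in 
this toy data, so each direction element is +1 and that group is sent right; 
with option 2 every element is -1 regardless of y.  A complete user-defined 
method also needs init and eval functions, which are omitted here.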
    
    
    	Terry Therneau