
rpart puzzle

Two problems here:

1)  rpart is supposed to follow the Breiman et al. (1984) monograph, which 
examines all n*v potential split values (n = cases; v = variables) 
and then splits at a midpoint, using a rule of the form:

x7 <= 37
x7 >  37

2)  The result is a tree that is useless for new observations in which 
x7 happens to equal 37.
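
To make the midpoint rule in (1) concrete, here is a minimal sketch in Python rather than R. It is not rpart's code, and it assumes that candidate thresholds are the midpoints between consecutive distinct values of a variable. The function names and the x7 values are made up for illustration.

```python
def midpoint_candidates(values):
    """Candidate thresholds: midpoints between consecutive distinct sorted values."""
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


def route(value, threshold):
    """Send a non-missing value to exactly one side of the split."""
    return "left" if value <= threshold else "right"


# Made-up training values for x7; none of them equals 37.
x7 = [30, 36, 38, 45]
cuts = midpoint_candidates(x7)

# A new observation with x7 == 37 still lands on one side under "<=" / ">".
sides = [route(v, 37) for v in (36, 37, 38)]
```

With a "<=" / ">" pair the two branches cover every non-missing value, so a new case with x7 equal to 37 goes left and needs no special handling.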

I only noticed this because I ran into exactly this circumstance: rpart 
moved to a surrogate variable when x7 = 37.  There was no need to use a 
surrogate, since x7 was not missing and took a value well within its 
range.
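
The behaviour I expected can be sketched as follows, again in Python and not as rpart's actual logic: the surrogate is consulted only when the primary variable is missing. The (variable, threshold) layout, the surrogate variable x3 and its threshold are hypothetical. Real surrogate splits also carry a direction, which is left out here.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def route_with_surrogate(obs, primary, surrogate):
    """Try the primary split first; fall back to the surrogate only if missing.

    primary and surrogate are (variable_name, threshold) pairs (assumed layout).
    Returns (side, variable_used), or (None, None) if both values are missing.
    """
    for name, threshold in (primary, surrogate):
        value = obs.get(name)
        if not is_missing(value):
            side = "left" if value <= threshold else "right"
            return side, name
    return None, None


primary = ("x7", 37)
surrogate = ("x3", 1.5)  # hypothetical surrogate variable and threshold

present = route_with_surrogate({"x7": 37, "x3": 2.0}, primary, surrogate)
absent = route_with_surrogate({"x7": None, "x3": 2.0}, primary, surrogate)
```

Under this sketch, a case with x7 = 37 is routed by x7 itself, and x3 is used only when x7 is absent.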
At 11:22 AM 7/12/2001 -0700, White.Denis at epamail.epa.gov wrote:

            
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._