Exhaustive CHAID package

4 messages · Michael Grant, Achim Zeileis

#
Dear R-Help:
Michael Grant
Professor
University of Colorado Boulder
#
On Tue, 21 Apr 2015, Michael Grant wrote:

            
I searched a bit on the web for "exhaustive CHAID" and didn't find any 
convincing evidence that this method is "most commonly" the "most useful". 
I doubt that such evidence exists because the methods are applicable to so 
many different situations that uniformly better results are essentially 
never obtained. Nevertheless, if you have references to comparison 
studies, I would still be interested. Possibly these provide insight into 
the situations in which exhaustive CHAID performs particularly well.

I wouldn't know of any such plans. But if you want to adapt/extend the 
code from the CHAID package, this is freely available.

I wouldn't be concerned about disloyalty. If you feel that exhaustive 
CHAID is the most appropriate tool for your problem and you have access to 
it in SPSS, why not use it? Possibly you can also export it from SPSS and 
import it into R using PMML. The "partykit" package has an example with an 
imported QUEST tree from SPSS.
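
A minimal sketch of the PMML route mentioned above, assuming the "partykit" and "XML" packages are installed. It reads the QUEST tree that ships with partykit as "ttnc.xml" via pmmlTreeModel(). Whether a PMML file exported from an exhaustive CHAID fit in SPSS parses equally well is an assumption that would need to be checked on the actual export.

```r
# Sketch only: import a tree from a PMML file into R.
# The example file "ttnc.xml" ships with partykit; a file exported
# from SPSS would be passed to pmmlTreeModel() in the same way.
quest_tree <- NULL
if (requireNamespace("partykit", quietly = TRUE) &&
    requireNamespace("XML", quietly = TRUE)) {
  pmml_file <- system.file("pmml", "ttnc.xml", package = "partykit")
  quest_tree <- partykit::pmmlTreeModel(pmml_file)
  print(quest_tree)   # text representation of the imported tree
  # plot(quest_tree)  # optional: graphical display
}
```

The result is a "party" object, so the usual print, plot, and predict methods from partykit should apply to it.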
#
Many thanks for your response, sir.

Here are two of the references I mentioned.  I've also personally explored several data sets in which the outcomes are 'known' and have seen high variability in the topology of the trees being produced. Typically, though, Exhaustive CHAID predictions match the 'known' results better than those of any of the other methods, using default settings.

http://www.hindawi.com/journals/jam/2014/929768/
http://interstat.statjournals.net/YEAR/2010/articles/1007001.pdf

By inference, many research papers are choosing Exhaustive CHAID.

My concern is not that these procedures produce mildly different trees, but that the trees differ dramatically, without even the same set of variables included.

Is CHAID available for use as an R package?  I thought R-Forge was solely for developers.

Again, many thanks.

MCG

-----Original Message-----
From: Achim Zeileis [mailto:Achim.Zeileis at uibk.ac.at] 
Sent: Wednesday, April 22, 2015 3:30 AM
To: Michael Grant
Cc: r-help at R-project.org
Subject: Re: [R] Exhaustive CHAID package
#
On Wed, 22 Apr 2015, Michael Grant wrote:

            
Thanks for the references, I wasn't aware of these. Both of these appear 
to use the SPSS implementations of CHAID, exhaustive CHAID, CART, and 
QUEST with default settings (as you did above). As I have never used these 
myself in SPSS, I cannot say how the implementations compare, but it's 
quite possible that they differ from other implementations. E.g., for 
CART the pruning rule may make a difference (with or without 
cross-validation; with the 1-SE or 0-SE rule, etc.). Similarly, for QUEST I 
think that Loh's own implementation uses somewhat different default 
settings.

So it may be advisable to go beyond defaults.
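
To illustrate how the pruning rule can matter for CART, here is a minimal sketch using the "rpart" package and its bundled kyphosis data. Both the data set and the control settings are chosen only for illustration; they are not taken from the papers cited. The sketch grows a large tree and prunes it once at the cross-validated minimum (0-SE rule) and once with the 1-SE rule.

```r
library("rpart")

set.seed(1)  # cross-validation folds are random
# Grow a deliberately large tree (cp = 0) with 10-fold cross-validation.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, xval = 10))

cp_tab <- fit$cptable  # rows are ordered from smallest to largest tree
i_min  <- which.min(cp_tab[, "xerror"])

# 0-SE rule: take the tree with minimal cross-validated error.
cp_0se <- cp_tab[i_min, "CP"]

# 1-SE rule: take the smallest tree whose cross-validated error is
# within one standard error of the minimum.
thresh <- cp_tab[i_min, "xerror"] + cp_tab[i_min, "xstd"]
i_1se  <- min(which(cp_tab[, "xerror"] <= thresh))
cp_1se <- cp_tab[i_1se, "CP"]

tree_0se <- prune(fit, cp = cp_0se)
tree_1se <- prune(fit, cp = cp_1se)

# Compare tree sizes (number of terminal nodes) under the two rules.
n_leaves <- function(tree) sum(tree$frame$var == "<leaf>")
c(leaves_0se = n_leaves(tree_0se), leaves_1se = n_leaves(tree_1se))
```

The 1-SE tree is never larger than the 0-SE tree, so software that defaults to one rule or the other can report trees of different size from the same data.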
My experience is that the choice of method is determined to a good degree 
by the software available and what others in the same literature use.

Yes, instability of the tree structure is one of the drawbacks of 
tree-based procedures. Of course, the tree structure can be very different 
while producing very similar predictions.
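
One way to look at this on your own data is sketched below, again with "rpart" and its bundled kyphosis data, both chosen here purely for illustration. Two trees are fitted to two bootstrap samples. The sketch then compares which variables each tree splits on and how often the two trees predict the same class for the original observations.

```r
library("rpart")

set.seed(42)
n <- nrow(kyphosis)

# Two bootstrap resamples of the same data.
boot1 <- kyphosis[sample(n, replace = TRUE), ]
boot2 <- kyphosis[sample(n, replace = TRUE), ]

tree1 <- rpart(Kyphosis ~ Age + Number + Start, data = boot1, method = "class")
tree2 <- rpart(Kyphosis ~ Age + Number + Start, data = boot2, method = "class")

# Variables actually used for splitting in each tree.
split_vars <- function(tree) {
  setdiff(unique(as.character(tree$frame$var)), "<leaf>")
}
vars1 <- split_vars(tree1)
vars2 <- split_vars(tree2)

# Share of original observations on which the two trees agree.
pred1 <- predict(tree1, newdata = kyphosis, type = "class")
pred2 <- predict(tree2, newdata = kyphosis, type = "class")
agreement <- mean(pred1 == pred2)

list(vars_tree1 = vars1, vars_tree2 = vars2, agreement = agreement)
```

If the split variables differ while the agreement stays high, the structural instability matters less for prediction than for interpretation.
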
Yes.
See: https://R-Forge.R-project.org/R/?group_id=343

You can easily install the package from R-Forge and also check out the 
entire source code anonymously.
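
A minimal sketch of installing and trying the package, assuming R-Forge currently provides a build for your R version (otherwise installing from source may be needed). The example uses the USvote data that ships with CHAID and settings in the style of the package's own help page. Note that this fits ordinary CHAID; exhaustive CHAID would require adapting the code as discussed earlier in the thread.

```r
# One-time installation from R-Forge (commented out to avoid network access):
# install.packages("CHAID", repos = "https://R-Forge.R-project.org")

chaid_fit <- NULL
if (requireNamespace("CHAID", quietly = TRUE)) {
  library("CHAID")

  # chaid() expects categorical (factor) variables throughout.
  set.seed(290875)
  us_sub <- USvote[sample(nrow(USvote), 1000), ]

  # Control settings chosen for illustration only.
  ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
  chaid_fit <- chaid(vote3 ~ ., data = us_sub, control = ctrl)

  print(chaid_fit)
  # plot(chaid_fit)  # optional: graphical display
}
```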