Many thanks for your response, sir.
Two of the references I mentioned are listed below. I've also personally
explored several data sets in which the outcomes are 'known' and have
seen high variability in the topology of the trees produced. Typically,
though, Exhaustive CHAID predictions match the 'known' results better than
those of any of the other methods, using default settings.
http://www.hindawi.com/journals/jam/2014/929768/
http://interstat.statjournals.net/YEAR/2010/articles/1007001.pdf
I infer from these that many research papers are choosing Exhaustive CHAID.
My concern is not that these procedures produce mildly different trees but
that the trees differ dramatically, with not even the same set of variables
included.
Is CHAID available for use as an R package? I thought R-Forge was solely for
developers.
Again, many thanks.
MCG
-----Original Message-----
From: Achim Zeileis [mailto:Achim.Zeileis at uibk.ac.at]
Sent: Wednesday, April 22, 2015 3:30 AM
To: Michael Grant
Cc: r-help at R-project.org
Subject: Re: [R] Exhaustive CHAID package
On Tue, 21 Apr 2015, Michael Grant wrote:
> Dear R-Help:
>
> From multiple sources comparing methods of tree classification and
> tree regressions on various data sets, it seems that Exhaustive CHAID
> (distinct from CHAID) most commonly generates the most useful tree
> results and, in particular, is more effective than ctree or rpart,
> which are implemented in R.
I searched a bit on the web for "exhaustive CHAID" and didn't find any
convincing evidence that this method is "most commonly" the "most useful".
I doubt that such evidence exists because the methods are applicable to so many
different situations that uniformly better results are essentially never
obtained. Nevertheless, if you have references to comparison studies, I would
still be interested. Possibly these provide insight into the situations in
which exhaustive CHAID performs particularly well.
> I see that CHAID, but not Exhaustive CHAID, is on R-Forge, and I
> write to ask if there are plans to create a package which employs the
> Exhaustive CHAID strategy.
I don't know of any such plans. But if you want to adapt/extend the code
from the CHAID package, it is freely available.
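
For reference, a minimal sketch of installing the CHAID package from R-Forge
and fitting a tree. It assumes the R-Forge repository URL below, that the
package builds for your R version, and that the USvote example data ship with
the package. Note that this fits standard CHAID, not the exhaustive variant.

```r
# Install CHAID from R-Forge (assumption: a build exists for your R version;
# otherwise add type = "source").
install.packages("CHAID", repos = "http://R-Forge.R-project.org")

library("CHAID")
data("USvote", package = "CHAID")

# Fit a standard (not exhaustive) CHAID tree on a random subsample.
set.seed(290875)
USvoteS <- USvote[sample(seq_len(nrow(USvote)), 1000), ]
ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

print(chaidUS)
plot(chaidUS)
```

The fitted object is a "party" object, so the printing, plotting, and
prediction tools from partykit apply to it.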
> Right now the best source I can find is in SPSS-IBM and I feel a bit
> disloyal to R using it.
I wouldn't be concerned about disloyalty. If you feel that exhaustive CHAID
is the most appropriate tool for your problem and you have access to it in SPSS,
why not use it? Possibly you can also export the fitted tree from SPSS and
import it into R using PMML. The "partykit" package has an example with an
imported QUEST tree from SPSS.
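
A minimal sketch of that import route, assuming the QUEST example file
"ttnc.pmml" that ships in partykit's "pmml" directory and that the XML package
is installed. Whether an exhaustive CHAID tree exported from SPSS reads in the
same way is not something this sketch confirms.

```r
library("partykit")  # reading PMML additionally requires the XML package

# PMML file shipped with partykit: a QUEST tree exported from SPSS.
ttnc_pmml <- file.path(system.file("pmml", package = "partykit"), "ttnc.pmml")

# Convert the PMML tree model into a party object.
ttnc_quest <- pmmlTreeModel(ttnc_pmml)

print(ttnc_quest)
plot(ttnc_quest)
```

For your own tree, replace ttnc_pmml with the path to the file exported from
SPSS.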
> Michael Grant
> Professor
> University of Colorado Boulder