On Wed, 31 Oct 2012, aelmore wrote:
> I'm hoping that folks out there with expertise in working with the party
> package can help me out here. My team is trying to convert party tree
> output into a text file format that can be read by our image processing
> software. We are running into difficulties because the way the two
> different programs identify their nodes is different.
>
> R numbers its nodes 1, 2, 3, 4, 5, 6 etc. down the "yes" answers in the
> tree until a terminal node is reached, at which point it backs up to the
> parent of that terminal node, gives its "no" child the next number
> available, follows any additional yes children down from there until it
> hits a terminal node, and repeats as necessary.
And that's essentially also how the visualization works. A suitably sized
grid is set up first and then grid viewports are recursively created and
traversed.
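
To make the numbering scheme concrete, here is a minimal, library-free sketch in Python (not R, and not code from "party"). The Node class and number_preorder() are hypothetical names invented for illustration; the sketch only assumes binary splits with the "yes" child visited before the "no" child, as you describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node; a node without children is terminal."""
    yes: Optional["Node"] = None  # child followed on a "yes" answer
    no: Optional["Node"] = None   # child followed on a "no" answer
    id: int = 0                   # filled in by number_preorder()


def number_preorder(root: Node) -> int:
    """Assign ids 1, 2, 3, ... depth-first, "yes" branch before "no" branch.

    Returns the total number of nodes visited.
    """
    counter = 0

    def visit(node: Node) -> None:
        nonlocal counter
        counter += 1
        node.id = counter
        for child in (node.yes, node.no):
            if child is not None:
                visit(child)

    visit(root)
    return counter


# Made-up example: the root's "yes" child splits again,
# while the root's "no" child is terminal.
tree = Node(yes=Node(yes=Node(), no=Node()), no=Node())
total = number_preorder(tree)
ids = (tree.id, tree.yes.id, tree.yes.yes.id, tree.yes.no.id, tree.no.id)
print(total, ids)
```

In this example the root's "no" child only receives its id after the whole "yes" subtree has been numbered.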
> Our image processing system uses a coordinate system that labels each
> node according to its location in the tree (e.g. row 4, column 43).
> For instance, the first split in the tree is at location (1,1), the two
> daughter nodes are (2,1) and (2,2), and their daughters are (3,1),
> (3,2), (3,3), (3,4), etc. etc. on down the tree. It assigns these
> location coordinates referencing all possible node locations. In other
> words, if node (2,1) in the above scenario happened to be a terminal
> node, the daughters of node 2,2 would still be labeled (3,3) and (3,4),
> even though nodes (3,1) and (3,2) would not actually exist. They are
> still, from a labeling/ID perspective, retained, as it were, as phantom
> place holders.
>
> Given that the plot function in the party package is able to produce
> output that shows these column and row relationships in graphical
> format, it seems to me that there ought to be a way to extract this
> positional information from the package, but I haven't found a way to do
> it yet.
I think we haven't got anything in the package that delivers this out of
the box. (Also note that the tree is often sparser than all binary splits
that could occur.)
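
If you end up computing the positions yourself, one possible approach is to carry (row, column) along during the same depth-first walk. The following is only a sketch in Python, not something "party" provides: grid_positions() and the nested-tuple tree encoding are made up for illustration, and it assumes strictly binary splits with the "yes" child on the left. Under that assumption the children of the node at (r, c) sit at (r + 1, 2c - 1) and (r + 1, 2c), which reproduces your phantom place holders.

```python
def grid_positions(tree):
    """Map depth-first node ids to (row, column) grid coordinates.

    Hypothetical encoding: an inner node is a tuple (yes_child, no_child),
    a terminal node is None. Columns are counted over all possible node
    locations in a row, so missing nodes still reserve their slots.
    """
    positions = {}
    next_id = 0

    def visit(node, row, col):
        nonlocal next_id
        next_id += 1
        positions[next_id] = (row, col)
        if node is not None:
            yes, no = node
            visit(yes, row + 1, 2 * col - 1)  # "yes" child takes the left slot
            visit(no, row + 1, 2 * col)       # "no" child takes the right slot

    visit(tree, 1, 1)
    return positions


# Made-up example: the root's "yes" child is terminal and its "no" child
# splits once more, so the slots (3,1) and (3,2) stay empty.
example = (None, (None, None))
positions = grid_positions(example)
print(positions)
# {1: (1, 1), 2: (2, 1), 3: (2, 2), 4: (3, 3), 5: (3, 4)}
```

The keys are the depth-first ids ("yes" before "no"), so the resulting table can be joined against whatever node numbering the tree object reports, provided that numbering follows the same order.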
> The plot.tree info in the manual says it creates "an (invisible) list
> with components x and y giving the coordinates of the tree nodes," and
> I'm wondering if it's talking about "tree coordinates," as used by our
> image processor, or if it's referring to the x and y position on the
> printed plot. Either way, I don't know how to access those x/y numbers.
That's not from the "party" package anyway. plot.tree is from the "tree"
package, whose code base is unrelated to that of "party".
> At any rate, I'm pretty stumped. It seems to me that since the
> positional relationships are produced in the plots, there ought to be a
> way to get at them, programmatically.
>
> Any suggestions out there?
I would suggest that you take a look at the "partykit" package. It
contains a reimplementation of ctree() based on a cleaner design of the
recursive tree structure. It also offers more functions for user
interaction. It comes with a vignette that is still somewhat rough but
should give you a good idea of how things work.
hth,
Z