Not sure if anyone has posted on this problem ... I want to use rpart to
build a binary tree on a relatively large dataset with ~1400 data points
and 15 predictors. But I've noticed that rpart fails almost immediately
in the call to C_s_to_rp, as that code returns nonsense. Looking at the
code itself isn't terribly helpful, and there don't seem to be any hard
limits coded anywhere. Does anyone have a suggestion for what might be
going on?
 
Thanks in advance for your help
Andrew Zachary
----
Wetherby Partners LLC believes the information provided herein is reliable.
While every care has been taken to ensure accuracy, the information is furnished
to the recipients with no warranty as to the completeness and accuracy of its
contents and on condition that any errors or omissions shall not be made the
basis for any claim, demand or cause for action.
The information in this email is intended only for the named...{{dropped}}
On Tue, 19 Sep 2006, Andrew Zachary wrote:
> Not sure if anyone has posted on this problem ... I want to use rpart to
> build a binary tree on a relatively large dataset with ~1400 data points
> and 15 predictors. But I've noticed that rpart fails almost immediately
> in the call to C_s_to_rp, as that code returns nonsense. Looking at the
> code itself isn't terribly helpful, and there don't seem to be any hard
> limits coded anywhere. Does anyone have a suggestion for what might be
> going on?
>
Andrew,
you need to give an _executable_ example illustrating your problem. What
means `nonsense'?
Best,
Torsten
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Andrew,
Not sure what your problem is based on your email. But data volume is not
a problem if there are only 1400 obs and 15 predictors.
On 9/19/06, Andrew Zachary <Andrew.Zachary at wetherbypartnersllc.com> wrote:
> Not sure if anyone has posted on this problem ... I want to use rpart to
> build a binary tree on a relatively large dataset with ~1400 data points
> and 15 predictors. But I've noticed that rpart fails almost immediately
> in the call to C_s_to_rp, as that code returns nonsense. Looking at the
> code itself isn't terribly helpful, and there don't seem to be any hard
> limits coded anywhere. Does anyone have a suggestion for what might be
> going on?
>
> Thanks in advance for you help
> Andrew Zachary
--
WenSui Liu (http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center
Here is an example (though the data are too large to send). The dataset
is 6530 x 15. The predictors are continuous N(0,1). I am trying to build a
regression tree.
 fit <- rpart( y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 +
x11 + x12 + x13 + x14, data=my.data.set, weights=wts )
And the output:
 summary( fit )
Call:
rpart(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 +
x11 + x12 + x13 + x14, data =my.data.set, weights = wts)
  n= 6530 
  CP nsplit rel error
1 NA      0        NA
Node number NA: NA observationsError in if (ff$complexity[i] < cp ||
is.leaf[i]) cat("\n") else cat(",    complexity param=",  : 
        missing value where TRUE/FALSE needed
If I run this using a subset of 900 points, everything is fine.
However, if I run it using 1100 points, it dies in the same way. There are
no missing values in the dataset. Is this simply a case where I should
decrease cp?
Regards,
Andrew
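Since the real data cannot be posted, a self-contained stand-in along the following lines may help narrow things down. It is only a sketch: the simulated data frame `sim`, the response model, the weights drawn with runif(), and the helper check_inputs() are all assumptions made for illustration and are not taken from the original data. It simulates data of the shape described above, counts non-finite values and non-positive weights (one thing worth ruling out, not a diagnosis of the failure), and fits the same kind of weighted regression tree.

```r
library(rpart)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 6530

# Simulated stand-in for my.data.set (assumption): 14 N(0,1) predictors
sim <- as.data.frame(matrix(rnorm(n * 14), nrow = n,
                            dimnames = list(NULL, paste0("x", 1:14))))
# Hypothetical response and case weights, chosen only for illustration
sim$y <- sim$x1 - 0.5 * sim$x2 + rnorm(n)
wts <- runif(n, 0.5, 1.5)

# Hypothetical helper: count suspicious values before calling rpart().
# Assumes every column of `data` is numeric, as in the thread.
check_inputs <- function(data, weights) {
  c(nonfinite_data = sum(!is.finite(as.matrix(data))),
    nonfinite_wts  = sum(!is.finite(weights)),
    negative_wts   = sum(weights < 0, na.rm = TRUE),
    zero_wts       = sum(weights == 0, na.rm = TRUE))
}

problems <- check_inputs(sim, wts)
print(problems)

# Same kind of call as in the message above, on the simulated data
fit <- rpart(y ~ ., data = sim, weights = wts)
print(fit$cptable)
```

Running check_inputs(my.data.set, wts) on the real data and on the 900-point and 1100-point subsets would show whether anything non-finite enters with the additional rows. If the simulated fit works but the real one does not, the difference between the two inputs is the place to look.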
-----Original Message-----
From: Torsten Hothorn [mailto:Torsten.Hothorn at rzmail.uni-erlangen.de] 
Sent: Tuesday, September 19, 2006 4:45 PM
To: Andrew Zachary
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Problem with rpart
On Tue, 19 Sep 2006, Andrew Zachary wrote:
> Not sure if anyone has posted on this problem ... I want to use rpart 
> to build a binary tree on a relatively large dataset with ~1400 data 
> points and 15 predictors. But I've noticed that rpart fails almost 
> immediately in the call to C_s_to_rp, as that code returns nonsense. 
> Looking at the code itself isn't terribly helpful, and there don't 
> seem to be any hard limits coded anywhere. Does anyone have a 
> suggestion for what might be going on?
>
Andrew,
you need to give an _executable_ example illustrating your problem. What
means `nonsense'?
Best,
Torsten