Displaying 20 results from an estimated 10000 matches similar to: "Estimating error rate for a classification tree"
2008 Oct 01
0
xpred.rpart() in library(mvpart)
Hi! R-users.
http://finzi.psych.upenn.edu/R/library/mvpart/html/xpred.rpart.html
says:
data(car.test.frame)
fit <- rpart(Mileage ~ Weight, car.test.frame)
xmat <- xpred.rpart(fit)
xerr <- (xmat - car.test.frame$Mileage)^2
apply(xerr, 2, sum) # cross-validated error estimate
# approx same result as rel. error from printcp(fit)
apply(xerr, 2, sum)/var(car.test.frame$Mileage) # relative error
2001 Aug 12
2
rpart 3.1.0 bug?
I just updated rpart to the latest version (3.1.0). There are a number of
changes between this and previous versions, and some of the code I've been
using with earlier versions (e.g. 3.0.2) no longer works.
Here is a simple illustration of a problem I'm having with xpred.rpart.
iris.test.rpart <- rpart(iris$Species ~ ., data = iris[, 1:4],
                         parms = list(prior = c(0.5, 0.25, 0.25)))
2009 May 26
0
cross-validation in rpart
Dear R users,
I know cross-validation does not work in rpart with user defined split
functions. As Terry Therneau suggested, one can use the xpred.rpart function
and then summarize the matrix of the predicted values into a single
"goodness" value.
I need only a confirmation: setting, for example, xval=10, if I correctly
understood, a single column of the matrix obtained by xpred.rpart gives
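For a classification tree, one way to collapse that matrix into a single "goodness" value per complexity parameter is a cross-validated misclassification rate. A minimal sketch on the built-in kyphosis data (it assumes the matrix holds predicted class indices for method = "class"; check this against your own output, especially with user-defined split functions):

library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

set.seed(1)
xmat <- xpred.rpart(fit, xval = 10)   # one column of out-of-fold predictions per cp value

# cross-validated misclassification rate for each cp value
cv.err <- apply(xmat, 2, function(p) mean(p != as.numeric(kyphosis$Kyphosis)))
cv.err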
2009 Jun 09
3
rpart - the xval argument in rpart.control and in xpred.rpart
Dear R users,
I'm working with the rpart package and want to evaluate the performance of
user defined split functions.
I have some problems understanding the meaning of the xval argument in
the two functions rpart.control and xpred.rpart. In the former it is defined
as the number of cross-validations, while in the latter it is defined as the
number of cross-validation groups. If I am
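In both places the argument is, in effect, the number of folds the data are divided into; a short sketch contrasting the two calls (kyphosis is used purely for illustration):

library(rpart)

# xval in rpart.control: number of cross-validations run while the tree is grown;
# it feeds the xerror column of printcp(fit)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             control = rpart.control(xval = 10))
printcp(fit)

# xval in xpred.rpart: number of cross-validation groups used to produce
# out-of-fold predictions after the fact
xmat <- xpred.rpart(fit, xval = 10)
dim(xmat)   # one row per observation, one column per cp value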
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members,
When building a CART model (specifically a classification tree) using rpart,
it is sometimes obvious that there are variables (X's) that are meaningful
for predicting some of the outcome (y) variables, while other predictors
are relevant only for other outcome variables (y's).
How can it be estimated which explanatory variable is "used" for which of
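Since rpart fits one outcome at a time, a common starting point is to grow one tree per y variable and then look at which predictors each tree actually relies on, for example through the variable importance scores. A minimal sketch on built-in data (kyphosis stands in for one of the outcomes):

library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# importance of each predictor, summed over primary and surrogate splits
# (the variable.importance component is available in current versions of rpart)
fit$variable.importance

# summary(fit) also prints, node by node, which variable each split uses
summary(fit)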
2007 Jan 29
3
comparing random forests and classification trees
Hi,
I have done an analysis using 'rpart' to construct a classification tree. I
want to retain the output in tree form so that it is easily
interpretable. However, I want to compare the 'accuracy' of the tree
to a random forest, to estimate how much predictive ability is lost by using
one simple tree. My understanding is that the error automatically displayed
by the two
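One rough way to put the two on a comparable footing is to compare rpart's cross-validated error with the forest's out-of-bag error, both expressed as misclassification rates; the rescaling below assumes the usual convention that xerror is reported relative to the root-node error. A sketch on the kyphosis data (illustrative only, not the poster's dataset):

library(rpart)
library(randomForest)

fit.tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# rpart: xerror is relative to the root-node error, so rescale it
root.err <- mean(kyphosis$Kyphosis != "absent")            # baseline misclassification rate
tree.cv.err <- min(fit.tree$cptable[, "xerror"]) * root.err

fit.rf <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis)
rf.oob.err <- fit.rf$err.rate[fit.rf$ntree, "OOB"]         # out-of-bag error of the forest

c(tree = tree.cv.err, forest = rf.oob.err)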
2005 Mar 18
1
How to show which variables include in plot of classification tree
Dear all
For my research, I am learning classification now.
I have been trying some examples with classification tree packages, such as
tree and rpart. For instance, the Pima.te dataset has 8 variables (including
the class variable, type):
library(rpart)
library(MASS)   # Pima.te is in the MASS package, not datasets
pima.rpart <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                    data = Pima.te, method = 'class')
plot(pima.rpart, uniform=TRUE)
text(pima.rpart)
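To make the plot itself show which variables the fitted tree uses, the labels can be made fuller; a small sketch (the extra arguments are just illustrative choices):

# split variables on the branches, class counts at the nodes
plot(pima.rpart, uniform = TRUE, margin = 0.1)
text(pima.rpart, use.n = TRUE, all = TRUE, cex = 0.8)

# the printed tree also lists every split variable that was used
print(pima.rpart)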
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts,
I am new to R and am using a decision tree model for getting segmentation rules.
A) Using behavioural data (attributes defining customer behaviour, for example
balances, number of accounts, etc.):
1. Clustering: cluster the behavioural data into a suitable number of clusters.
2. Decision tree: use an rpart classification tree to generate segmentation
rules, with the cluster number (cluster id) as the target
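A minimal sketch of that two-step workflow; the data frame customers and its columns balance, n_accounts and tenure are made-up placeholders for the behavioural attributes:

library(rpart)

# step 1: cluster the behavioural attributes (4 clusters chosen arbitrarily here)
behav <- customers[, c("balance", "n_accounts", "tenure")]
set.seed(42)
km <- kmeans(scale(behav), centers = 4)

# step 2: grow a classification tree with the cluster id as the target,
# then read the segmentation rules off the printed tree
customers$cluster <- factor(km$cluster)
seg.tree <- rpart(cluster ~ balance + n_accounts + tenure,
                  data = customers, method = "class")
print(seg.tree)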
2008 Jan 29
2
rpart error when constructing a classification tree
I am trying to make a decision tree using rpart. The function runs very
quickly considering the size of the data (1742 rows, 163 columns). When I call
the summary command I get this:
> summary(bookings.cart)
Call:
rpart(formula = totalRev ~ ., data = bookings, method = "class")
n=1741 (1 observation deleted due to missingness)
   CP nsplit rel error
1   0      0         1
Error in yval[, 1] :
2007 Feb 26
2
survival analysis using rpart
Hello,
I use rpart to predict survival time and have a problem interpreting the
output of "estimated rate". Here is an example of what I do:
stagec <- read.table("http://www.stanford.edu/class/stats202/DATA/stagec.data",
                     col.names = c("pgtime", "pgstat", "age", "eet", "g2", "grade", "gleason",
2006 Aug 24
0
Classification tree with a random variable
Hi,
I am planning on using classification trees to build a predictive model for data which includes a random variable. I intend to use the R functions 'rpart' (and potentially also 'randomForest' and 'bagging').
I have a data set with 390 data points. The response variable is binary. There are a large number of variables (>20, both categorical and continuous). The
2006 Jul 18
1
Classification error rate increased by bagging - any ideas?
Hi,
I'm analysing some anthropometric data on fifty-odd skull bases. We know the
gender of each skull, and we are trying to develop a predictor to identify the
sex of unknown skulls.
Rpart with cross-validation produces two models - one of which predicts gender
well for males and poorly for females, and the other does the opposite (females
well, males poorly). In both cases the error
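For reference, a bagged version of an rpart classifier with an out-of-bag error estimate can be obtained from the ipred package; a sketch on built-in data, since the skull measurements themselves are not shown in the post:

library(rpart)
library(ipred)   # bagging() grows rpart trees underneath

set.seed(1)
bag <- bagging(Kyphosis ~ Age + Number + Start, data = kyphosis,
               nbagg = 50, coob = TRUE)
print(bag)   # includes the out-of-bag estimate of the misclassification error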
2008 Jun 17
0
Rpart description of tree groups
I'm making a few functions to generate LaTeX files describing
rpart objects that are then \input-ed into a larger document. So
far, the functions I have generate paragraphs containing
enumerations of the predictors in pruned trees and the number of
groups formed.
It's easy enough to recover these. For instance,
R> print(tree)
n= 878
node), split, n, loss, yval, (yprob)
      * denotes terminal node
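Both pieces of information can be pulled straight from the frame component of the rpart object instead of being parsed out of the printed text; a small sketch, keeping the object name tree from the post:

# predictors that survive in the (pruned) tree: split variables, minus the leaf marker
predictors.used <- setdiff(unique(as.character(tree$frame$var)), "<leaf>")

# number of groups formed = number of terminal nodes
n.groups <- sum(tree$frame$var == "<leaf>")

# a sentence ready to drop into the generated LaTeX file
sprintf("The pruned tree uses %s and forms %d groups.",
        paste(predictors.used, collapse = ", "), n.groups)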
2003 Dec 19
1
Question re labels in r-part (continuation of a thread from a while back)
Hello again
I have modeled a tree using rpart, with the DV being a log
transformation of the variable I am really interested in (I transformed
the DV due to extreme skewness). By default, text.rpart labels the
nodes with the value of yval, which in this case is not what I want; I'd
like the labels to be on the original metric, but label in text.rpart
requires a "column name of
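One workaround that is sometimes suggested is to back-transform the fitted node values in the frame of the object before labelling, since text.rpart reads its labels from there. A sketch under the assumption that the DV was modelled as log(y + 1); the data frame dat and its columns are made up:

library(rpart)

fit <- rpart(log(y + 1) ~ x1 + x2, data = dat, method = "anova")

# copy the object and put the node values back on the original metric
# (note: this back-transforms the mean of log(y + 1), not the mean of y itself)
fit.orig <- fit
fit.orig$frame$yval <- exp(fit$frame$yval) - 1

plot(fit.orig, uniform = TRUE)
text(fit.orig, use.n = TRUE)   # labels now show the back-transformed node values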
2017 Jun 13
2
Classification and Regression Tree for Survival Analysis
I am trying to use CART in a survival analysis. I have three variables of interest (all three ordinal - x, y and z, each with 5 categories) from which I want to make smaller groups (for example, the 1st category of X together with the 2nd and 3rd categories of Y, and the 2nd, 3rd and 4th categories of Z, etc.) based on their, let's say, association with mortality.
Now
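rpart can grow a survival tree directly when the response is a Surv object, in which case it uses its exponential (Poisson-based) splitting method; the terminal nodes then define the groups. A sketch with made-up names mydata, time, status, x, y and z:

library(rpart)
library(survival)

surv.tree <- rpart(Surv(time, status) ~ x + y + z, data = mydata)

print(surv.tree)           # each leaf is one group, with its estimated event rate
group <- surv.tree$where   # leaf membership for every observation
table(group)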
2012 Sep 04
1
predict rpart newdata - introduce only values variables used in the tree
Dear community,
I have a tree that at first included 23 variables. I have since pruned this
tree, and only 8 variables are involved.
I'd like to predict while supplying in newdata only the values of these 8
variables. However, as the tree was built with all 23, it asks me
for the other 15 values, even though it doesn't need them.
Is there a way to supply only these 8 values?
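A workaround that often comes up (not confirmed in this thread) is to keep all 23 columns in newdata but fill the 15 unused ones with NA: predict.rpart uses na.action = na.pass by default, and variables that never appear in a split are never consulted. A sketch with placeholder names pruned.fit, new8 (a data frame holding only the 8 used columns) and vars.all (the 23 original predictor names):

library(rpart)

newdata <- new8
newdata[setdiff(vars.all, names(new8))] <- NA   # dummy NA columns for the unused predictors

pred <- predict(pruned.fit, newdata = newdata)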
2003 Jul 21
0
Changing the labels on a regression tree (repeat post - with added clarity)
Hello
I posted a very similar question last week, but the responses I
received indicated that my post was unclear....
I have a regression tree created in rpart with
tr.logypsx <- rpart(log(YPSX + 1) ~ AGE + drugfact + sexfact + as.numeric(OBSX) +
                      WINDLE + EABUSED + PABAU + positive.par + control.par + lenient.par,
                    xval = 10, method = 'anova', cp = 0.0001, data = duhray2)
and then
2012 Dec 19
0
Fitting a predefined classification tree
Hi,
I've searched R-help and haven't found an answer. I have a set of data from which I can create a classification tree using rpart.
However, what I'd like to do is predefine the blank structure of the binary tree (i.e., which nodes to include) and then use a package like rpart to fit the optimal splitting criteria at each of the predefined nodes.
Does such a package exist?
2009 Mar 11
2
Couple of Questions about Classification trees
So I have 2 sets of data - a training data set and a test data set. I've been
doing the analysis on the training data set and then using predict to feed
the test data through the fitted model. There are 114 rows in the training data
and 117 in the test data, and 1024 columns in both. It's actually the same
set of data split into two. The rows are made up of 5 different numbers. They
do represent
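For a workflow like that, the test-set error rate is usually read off a confusion matrix of predicted versus true classes; a sketch with placeholder objects train, test and a class column y:

library(rpart)

fit <- rpart(y ~ ., data = train, method = "class")

pred <- predict(fit, newdata = test, type = "class")
tab <- table(predicted = pred, actual = test$y)   # confusion matrix
tab

1 - sum(diag(tab)) / sum(tab)   # test-set misclassification rate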
2012 Mar 05
1
decision/classification trees with fewer than 20 objects
Hi!
I'm trying to construct and plot a decision tree to classify a set of only 8 objects and tried to use the rpart and tree functions, but get an error message both times:
rpart: fit is not a tree, just a root
tree: cannot plot singlenode tree
I read in the post 'question regression trees' that rpart doesn't split a set of fewer than 20 objects...so I guess the same holds true for
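The 20-object limit is rpart's default minsplit, and it can be lowered through rpart.control so that even a very small data set gets split (whether a tree grown on 8 objects is statistically meaningful is another question). A sketch, with smalldata and its outcome cls standing in for the 8-object data set:

library(rpart)

fit <- rpart(cls ~ ., data = smalldata, method = "class",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = 0))

plot(fit, uniform = TRUE, margin = 0.1)
text(fit, use.n = TRUE)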