similar to: Estimating error rate for a classification tree

Displaying 20 results from an estimated 10000 matches similar to: "Estimating error rate for a classification tree"

2008 Oct 01
0
xpred.rpart() in library(mvpart)
Hi R-users. http://finzi.psych.upenn.edu/R/library/mvpart/html/xpred.rpart.html says: data(car.test.frame); fit <- rpart(Mileage ~ Weight, car.test.frame); xmat <- xpred.rpart(fit); xerr <- (xmat - car.test.frame$Mileage)^2; apply(xerr, 2, sum) # cross-validated error estimate, approximately the same result as the relative error from printcp(fit); apply(xerr, 2,
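A minimal runnable sketch of the calculation the linked help page quotes, assuming the current rpart package (which also ships car.test.frame and xpred.rpart): each column of the xpred.rpart output holds cross-validated predictions for one complexity value, so summing the squared errors per column gives a cross-validated error comparable to the xerror column of printcp.

    library(rpart)

    data(car.test.frame)
    fit  <- rpart(Mileage ~ Weight, data = car.test.frame)

    xmat <- xpred.rpart(fit)                    # cross-validated predictions, one column per cp value
    xerr <- (xmat - car.test.frame$Mileage)^2   # squared cross-validated errors
    apply(xerr, 2, sum)                         # total CV error per cp value, cf. printcp(fit)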
2001 Aug 12
2
rpart 3.1.0 bug?
I just updated rpart to the latest version (3.1.0). There are a number of changes between this and previous versions, and some of the code I've been using with earlier versions (e.g. 3.0.2) no longer works. Here is a simple illustration of a problem I'm having with xpred.rpart: iris.test.rpart <- rpart(iris$Species ~ ., data = iris[, 1:4], parms = list(prior = c(0.5, 0.25, 0.25))) + ) >
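A hedged sketch of what the call appears to intend, with the response taken from the data argument rather than mixing iris$Species with a column subset, followed by the xpred.rpart call under discussion:

    library(rpart)

    iris.test.rpart <- rpart(Species ~ ., data = iris, method = "class",
                             parms = list(prior = c(0.5, 0.25, 0.25)))
    xmat <- xpred.rpart(iris.test.rpart, xval = 10)
    dim(xmat)   # one row per observation, one column per complexity value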
2009 May 26
0
cross-validation in rpart
Dear R users, I know cross-validation does not work in rpart with user-defined split functions. As Terry Therneau suggested, one can use the xpred.rpart function and then summarize the matrix of predicted values into a single "goodness" value. I need only a confirmation: setting, for example, xval=10, if I understood correctly, a single column of the matrix obtained by xpred.rpart gives
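A hedged sketch of the summarisation step being asked about, assuming a classification tree: with xval = 10, each column of the xpred.rpart matrix holds the 10-fold cross-validated predicted class (as a factor-level number) for one complexity value, so each column can be collapsed into one misclassification rate.

    library(rpart)

    fit  <- rpart(Species ~ ., data = iris, method = "class")
    xmat <- xpred.rpart(fit, xval = 10)    # rows = observations, columns = cp values

    ## one "goodness" value per column: the cross-validated misclassification rate
    apply(xmat, 2, function(pred) mean(pred != as.numeric(iris$Species)))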
2009 Jun 09
3
rpart - the xval argument in rpart.control and in xpred.rpart
Dear R users, I'm working with the rpart package and want to evaluate the performance of user-defined split functions. I have some trouble understanding the meaning of the xval argument in the two functions rpart.control and xpred.rpart. In the former it is defined as the number of cross-validations, while in the latter it is defined as the number of cross-validation groups. If I am
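A hedged sketch of the two usages being contrasted: in rpart.control, xval is simply the number of internal cross-validations, whereas xpred.rpart also accepts an explicit vector assigning each observation to a cross-validation group, which is handy when the same folds must be reused across fits.

    library(rpart)

    fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                 control = rpart.control(xval = 10))           # xval as a count

    groups <- sample(rep(1:10, length.out = nrow(kyphosis)))   # explicit fold assignment
    xmat   <- xpred.rpart(fit, xval = groups)                  # xval as cross-validation groups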
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members, when building a CART model (specifically a classification tree) using rpart, it is sometimes obvious that some variables (X's) are meaningful for predicting some of the outcome (y) variables, while other predictors are relevant only for other outcome variables. How can it be estimated which explanatory variable is "used" for which of
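A hedged sketch of two ways to answer this for a fitted rpart tree: list the variables that actually appear in splits, and (in recent rpart versions) inspect the variable.importance component, which also credits surrogate splits.

    library(rpart)

    fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

    setdiff(unique(as.character(fit$frame$var)), "<leaf>")   # predictors used in at least one split
    fit$variable.importance                                  # importance scores (includes surrogates)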
2007 Jan 29
3
comparing random forests and classification trees
Hi, I have done an analysis using 'rpart' to construct a classification tree. I want to retain the output in tree form so that it is easily interpretable. However, I also want to compare the 'accuracy' of the tree to a random forest, to estimate how much predictive ability is lost by using one simple tree. My understanding is that the error automatically displayed by the two
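A hedged sketch of the comparison described, with kyphosis as a stand-in data set: rpart reports a cross-validated relative error (xerror in printcp, multiplied by the root node error to get an absolute rate), while randomForest reports an out-of-bag error rate.

    library(rpart)
    library(randomForest)

    tree.fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
    printcp(tree.fit)   # root node error * xerror = cross-validated misclassification rate

    rf.fit <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis)
    rf.fit              # prints the OOB estimate of the error rate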
2005 Mar 18
1
How to show which variables are included in a plot of a classification tree
Dear all, for my research I am learning classification now. I was trying some examples of classification tree packages, such as tree and rpart. For instance, the Pima.te dataset has 8 variables (including the class variable, type): library(rpart) library(MASS) pima.rpart <- rpart(type ~ npreg+glu+bp+skin+bmi+ped+age, data=Pima.te, method='class') plot(pima.rpart, uniform=TRUE) text(pima.rpart)
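A hedged sketch extending the post's own example (Pima.te ships with the MASS package): text.rpart already prints the split variable at each internal node, use.n = TRUE adds the class counts, and the variables actually used can also be listed from the frame.

    library(rpart)
    library(MASS)

    pima.rpart <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                        data = Pima.te, method = "class")
    plot(pima.rpart, uniform = TRUE, margin = 0.1)
    text(pima.rpart, use.n = TRUE)                                 # split variables plus class counts

    setdiff(unique(as.character(pima.rpart$frame$var)), "<leaf>")  # variables used somewhere in the tree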
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi experts, I am new to R and am using a decision tree model to derive segmentation rules. A) Using behavioural data (attributes defining customer behaviour, e.g. balances, number of accounts, etc.): 1. Clustering: cluster the behavioural data into a suitable number of clusters. 2. Decision tree: use an rpart classification tree to generate segmentation rules, using the cluster number (cluster id) as the target
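A hedged sketch of that two-step workflow, with mtcars standing in for the behavioural data: cluster first, then fit a classification tree with the cluster id as the target so the splits read as segmentation rules.

    library(rpart)

    behav <- mtcars[, c("mpg", "hp", "wt", "qsec")]        # stand-in for behavioural attributes

    set.seed(42)
    km <- kmeans(scale(behav), centers = 3)                # step 1: clustering
    behav$segment <- factor(km$cluster)

    seg.tree <- rpart(segment ~ ., data = behav, method = "class",      # step 2: rules per segment
                      control = rpart.control(minsplit = 5))            # small data, relax minsplit
    print(seg.tree)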
2008 Jan 29
2
rpart error when constructing a classification tree
I am trying to make a decision tree using rpart. The function runs very quickly considering the size of the data (1742 rows, 163 columns). When I call the summary command I get this: > summary(bookings.cart) Call: rpart(formula = totalRev ~ ., data = bookings, method = "class") n=1741 (1 observation deleted due to missingness) CP nsplit rel error 1 0 0 1 Error in yval[, 1] :
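A hedged guess at the cause, sketched with airquality standing in for the poster's bookings data: totalRev looks like a continuous revenue figure, so method = "class" treats every distinct value as its own class; fitting a regression tree, or explicitly banding the response first, avoids the failure in summary().

    library(rpart)

    df <- data.frame(totalRev = airquality$Temp,
                     airquality[, c("Ozone", "Wind", "Month")])   # stand-in for bookings

    fit <- rpart(totalRev ~ ., data = df)          # regression tree (method "anova" is inferred)
    summary(fit)

    df$revBand <- cut(df$totalRev, breaks = quantile(df$totalRev), include.lowest = TRUE)
    fit2 <- rpart(revBand ~ . - totalRev, data = df, method = "class")   # classification on bands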
2007 Feb 26
2
survival analysis using rpart
Hello, I use rpart to predict survival time and have a problem interpreting the output of "estimated rate". Here is an example of what I do: > stagec <- > read.table("http://www.stanford.edu/class/stats202/DATA/stagec.data", > col.names=c("pgtime", "pgstat", "age","eet", "g2", "grade", "gleason", >
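The same stage C prostate cancer data ship with rpart as the stagec data set, so a hedged sketch of the fit in question: under rpart's exponential scaling for survival trees, the value printed at each node is the estimated event rate, scaled relative to the overall (root) rate.

    library(rpart)
    library(survival)

    fit <- rpart(Surv(pgtime, pgstat) ~ age + eet + g2 + grade + gleason + ploidy,
                 data = stagec, method = "exp")
    print(fit)     # each node shows n, deviance, and the estimated (relative) event rate
    plotcp(fit)    # cross-validated error across complexity values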
2006 Aug 24
0
Classification tree with a random variable
Hi, I am planning on using classification trees to build a predictive model for data which includes a random variable. I intend to use the R functions 'rpart' (and potentially also 'randomForest' and 'bagging'). I have a data set with 390 data points. The response variable is binary. There are a large number of variables (>20, both categorical and continuous). The
2006 Jul 18
1
Classification error rate increased by bagging - any ideas?
Hi, I'm analysing some anthropometric data on fifty-odd skull bases. We know the gender of each skull, and we are trying to develop a predictor to identify the sex of unknown skulls. Rpart with cross-validation produces two models - one of which predicts gender well for males and poorly for females, and the other does the opposite (females well, males poorly). In both cases the error
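A hedged sketch of the kind of side-by-side check implied here, with kyphosis standing in for the skull data: the cross-validated error of a single rpart tree against the out-of-bag error of bagged trees (ipred::bagging with coob = TRUE).

    library(rpart)
    library(ipred)

    tree.fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
    printcp(tree.fit)                                  # cross-validated error of the single tree

    set.seed(1)
    bag.fit <- bagging(Kyphosis ~ Age + Number + Start, data = kyphosis,
                       nbagg = 50, coob = TRUE)
    bag.fit                                            # prints the out-of-bag misclassification error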
2008 Jun 17
0
Rpart description of tree groups
I'm making a few functions to generate LaTeX files describing rpart objects that are then \input-ed into a larger document. So far, the functions I have generate paragraphs containing enumerations of the predictors in pruned trees and the number of groups formed. It's easy enough to recover these. For instance, R> print(tree) n= 878 node), split, n, loss, yval, (yprob) *
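A hedged sketch of recovering the pieces enumerated here from a pruned rpart object: the predictors that survive pruning come from the frame component, and path.rpart() returns the rule defining each terminal node (each group) in a form that is easy to turn into LaTeX.

    library(rpart)

    fit    <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
    pruned <- prune(fit, cp = 0.05)

    setdiff(unique(as.character(pruned$frame$var)), "<leaf>")          # predictors in the pruned tree

    leaves <- as.numeric(rownames(pruned$frame))[pruned$frame$var == "<leaf>"]
    path.rpart(pruned, nodes = leaves)                                 # split rules defining each group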
2003 Dec 19
1
Question re labels in rpart (continuation of a thread from a while back)
Hello again, I have modeled a tree using rpart, with the DV being a log transformation of the variable I am really interested in (I transformed the DV due to extreme skewness). By default, text.rpart labels the nodes with the value of yval, which in this case is not what I want; I'd like the labels to be on the original metric, but the label argument in text.rpart requires a "column name of
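One workaround often suggested for this, sketched here with log1p(Ozone) standing in for the poster's transformed DV: back-transform the yval column of the frame before calling text.rpart, so the printed node labels are on the original metric. Note that this labels each node with the back-transformed mean of the logs, not the mean of the untransformed values.

    library(rpart)

    fit <- rpart(log1p(Ozone) ~ Wind + Temp + Month, data = airquality, method = "anova")

    fit.lab <- fit
    fit.lab$frame$yval <- expm1(fit.lab$frame$yval)   # back-transform the node values used for labels

    plot(fit.lab, uniform = TRUE)
    text(fit.lab)                                     # labels now on the original Ozone scale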
2017 Jun 13
2
Classification and Regression Tree for Survival Analysis
I am trying to use CART in a survival analysis. I have three variables of interest (all three ordinal: x, y, and z, each with 5 categories) from which I want to form smaller groups (for example, the 1st category of X with the 2nd and 3rd categories of Y and the 2nd, 3rd, and 4th categories of Z, etc.) based on their, let's say, association with mortality. Now
2012 Sep 04
1
predict rpart newdata - introduce only the values of variables used in the tree
Dear community, I have a tree that at first included 23 variables. I then pruned this tree, and only 8 variables are involved. I'd like to predict while supplying in newdata only the values of these 8 variables. However, as the tree was built with all 23, it asks me for the other 15 values even though it doesn't need them. Is there a way to supply only these 8 values?
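A hedged sketch of the workaround usually given: predict.rpart rebuilds the model frame from the original formula, so newdata must contain every predictor column by name, but the unused ones can be NA placeholders while real values are supplied only for the variables the pruned tree splits on.

    library(rpart)

    fit    <- rpart(Mileage ~ ., data = car.test.frame)
    pruned <- prune(fit, cp = 0.1)

    used <- setdiff(unique(as.character(pruned$frame$var)), "<leaf>")   # variables the tree really needs
    used

    newdata <- car.test.frame[0, names(car.test.frame) != "Mileage"]    # all predictor columns, zero rows
    newdata[1, ]     <- NA                                              # placeholders for unused variables
    newdata[1, used] <- car.test.frame[1, used]                         # real values only where needed

    predict(pruned, newdata = newdata)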
2003 Jul 21
0
Changing the labels on a regression tree (repeat post - with added clarity)
Hello, I posted a very similar question last week, but the responses I received indicated that my post was unclear.... I have a regression tree created in rpart with tr.logypsx <- rpart(log(YPSX + 1) ~ AGE + drugfact + sexfact + as.numeric(OBSX) + WINDLE + EABUSED + PABAU + positive.par + control.par + lenient.par, xval = 10, method = 'anova', cp = 0.0001, data = duhray2) and then
2012 Dec 19
0
Fitting a predefined classification tree
Hi, I've searched R-help and haven't found an answer. I have a set of data from which I can create a classification tree using rpart. However, what I'd like to do is predefine the blank structure of the binary tree (i.e., which nodes to include) and then use a package like rpart to fit the optimal splitting criterion at each of the predefined nodes. Does such a package exist?
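I am not aware of a package that does exactly this, but a hedged sketch of one manual approach: fix the tree shape yourself and let rpart find only the single best split at each predefined node, by fitting depth-one stumps to the subset of observations that reach that node.

    library(rpart)

    stump <- function(d) rpart(Species ~ ., data = d, method = "class",
                               control = rpart.control(maxdepth = 1, cp = 0, minsplit = 2))

    root <- stump(iris)                              # optimal split at the predefined root node
    print(root)

    left  <- droplevels(iris[root$where == 2, ])     # observations sent to the left child (node 2)
    right <- droplevels(iris[root$where == 3, ])     # observations sent to the right child (node 3)
    print(stump(right))                              # optimal split at the next predefined node, etc.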
2009 Mar 11
2
Couple of Questions about Classification trees
So I have 2 sets of data - a training data set and a test data set. I've been doing the analysis on the training data set and then using predict and feeding the test data through that. There are 114 rows in the training data and 117 in the test data and 1024 columns in both. It's actually the same set of data split into two. The rows are made of 5 different numbers. They do represent
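A hedged sketch of the standard train/test loop described here, with iris standing in for the poster's 114-row training and 117-row test sets: fit on the training rows, predict classes for the test rows, and tabulate a confusion matrix.

    library(rpart)

    set.seed(1)
    idx   <- sample(nrow(iris), 75)
    train <- iris[idx, ]
    test  <- iris[-idx, ]

    fit  <- rpart(Species ~ ., data = train, method = "class")
    pred <- predict(fit, newdata = test, type = "class")

    table(observed = test$Species, predicted = pred)   # confusion matrix on the test data
    mean(pred != test$Species)                          # test-set misclassification rate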
2012 Mar 05
1
decision/classification trees with fewer than 20 objects
Hi! I'm trying to construct and plot a decision tree to classify a set of only 8 objects and tried to use the rpart and tree functions, but get an error message both times: rpart: "fit is not a tree, just a root"; tree: "cannot plot singlenode tree". I read in the post 'question regression trees' that rpart doesn't split a set of fewer than 20 objects... so I guess the same holds true for
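That guess is essentially right: rpart's defaults (minsplit = 20, cp = 0.01) refuse to split so few observations, so a root-only tree comes back and plot() then fails. A hedged sketch of relaxing the controls on a tiny toy set; whether a tree grown on 8 objects is trustworthy is a separate question.

    library(rpart)

    small <- iris[c(1, 2, 51, 52, 101, 102, 103, 104), ]    # an 8-row toy data set

    fit <- rpart(Species ~ ., data = small, method = "class",
                 control = rpart.control(minsplit = 2, minbucket = 1, cp = 0))
    plot(fit, uniform = TRUE)
    text(fit)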