Displaying 20 results from an estimated 10000 matches similar to: "Estimating error rate for a classification tree"
2008 Oct 01
0
xpred.rpart() in library(mvpart)
Hi! R-users.
http://finzi.psych.upenn.edu/R/library/mvpart/html/xpred.rpart.html
says:
data(car.test.frame)
fit <- rpart(Mileage ~ Weight, car.test.frame)
xmat <- xpred.rpart(fit)
xerr <- (xmat - car.test.frame$Mileage)^2
apply(xerr, 2, sum) # cross-validated error estimate
# approx same result as rel. error from printcp(fit)
apply(xerr, 2, sum)/var(car.test.frame$Mileage) # relative error
2001 Aug 12
2
rpart 3.1.0 bug?
I just updated rpart to the latest version (3.1.0). There are a number of
changes between this and previous versions, and some of the code I've been
using with earlier versions (e.g. 3.0.2) no longer works.
Here is a simple illustration of a problem I'm having with xpred.rpart.
iris.test.rpart <- rpart(iris$Species ~ ., data = iris[, 1:4],
                         parms = list(prior = c(0.5, 0.25, 0.25)))
2009 May 26
0
cross-validation in rpart
Dear R users,
I know cross-validation does not work in rpart with user defined split
functions. As Terry Therneau suggested, one can use the xpred.rpart function
and then summarize the matrix of the predicted values into a single
"goodness" value.
I need only a confirmation: setting, for example, xval=10, if I correctly
understood, a single column of the matrix obtained by xpred.rpart gives
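For a classification tree, one way to collapse that matrix into a single "goodness" value per complexity parameter is a cross-validated misclassification rate. A minimal sketch on the built-in kyphosis data (it assumes the matrix holds predicted class indices for method = "class"; check this against your own output, especially with user-defined split functions):

library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

set.seed(1)
xmat <- xpred.rpart(fit, xval = 10)   # one column of out-of-fold predictions per cp value

# cross-validated misclassification rate for each cp value
cv.err <- apply(xmat, 2, function(p) mean(p != as.numeric(kyphosis$Kyphosis)))
cv.err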
2009 Jun 09
3
rpart - the xval argument in rpart.control and in xpred.rpart
Dear R users,
I'm working with the rpart package and want to evaluate the performance of
user defined split functions.
I have some problems understanding the meaning of the xval argument in
the two functions rpart.control and xpred.rpart. In the former it is defined
as the number of cross-validations, while in the latter it is defined as the
number of cross-validation groups. If I am
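In both places the argument is, in effect, the number of folds the data are divided into; a short sketch contrasting the two calls (kyphosis is used purely for illustration):

library(rpart)

# xval in rpart.control: number of cross-validations run while the tree is grown;
# it feeds the xerror column of printcp(fit)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             control = rpart.control(xval = 10))
printcp(fit)

# xval in xpred.rpart: number of cross-validation groups used to produce
# out-of-fold predictions after the fact
xmat <- xpred.rpart(fit, xval = 10)
dim(xmat)   # one row per observation, one column per cp value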
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members,
When building a CART model (specifically a classification tree) using rpart,
it is sometimes obvious that there are variables (X's) that are meaningful
for predicting some of the outcome (y) variables, while other predictors
are relevant only for other outcome variables (y's).
How can it be estimated which explanatory variable is "used" for which of
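Since rpart fits one outcome at a time, a common starting point is to grow one tree per y variable and then look at which predictors each tree actually relies on, for example through the variable importance scores. A minimal sketch on built-in data (kyphosis stands in for one of the outcomes):

library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# importance of each predictor, summed over primary and surrogate splits
# (the variable.importance component is available in current versions of rpart)
fit$variable.importance

# summary(fit) also prints, node by node, which variable each split uses
summary(fit)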
2007 Jan 29
3
comparing random forests and classification trees
Hi,
I have done an analysis using 'rpart' to construct a classification tree. I
want to retain the output in tree form so that it is easily
interpretable. However, I want to compare the 'accuracy' of the tree
to a random forest, to estimate how much predictive ability is lost by using
one simple tree. My understanding is that the error automatically displayed
by the two
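One rough way to put the two on a comparable footing is to compare rpart's cross-validated error with the forest's out-of-bag error, both expressed as misclassification rates; the rescaling below assumes the usual convention that xerror is reported relative to the root-node error. A sketch on the kyphosis data (illustrative only, not the poster's dataset):

library(rpart)
library(randomForest)

fit.tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# rpart: xerror is relative to the root-node error, so rescale it
root.err <- mean(kyphosis$Kyphosis != "absent")            # baseline misclassification rate
tree.cv.err <- min(fit.tree$cptable[, "xerror"]) * root.err

fit.rf <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis)
rf.oob.err <- fit.rf$err.rate[fit.rf$ntree, "OOB"]         # out-of-bag error of the forest

c(tree = tree.cv.err, forest = rf.oob.err)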
2005 Mar 18
1
How to show which variables include in plot of classification tree
Dear all
For my research, I am learning classification now.
I have been trying some examples with classification tree packages, such as
tree and rpart. For instance, the Pima.te dataset has 8 variables (including
the class variable, type):
library(rpart)
library(MASS)   # Pima.te is in the MASS package, not datasets
pima.rpart <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                    data = Pima.te, method = 'class')
plot(pima.rpart, uniform=TRUE)
text(pima.rpart)
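To make the plot itself show which variables the fitted tree uses, the labels can be made fuller; a small sketch (the extra arguments are just illustrative choices):

# split variables on the branches, class counts at the nodes
plot(pima.rpart, uniform = TRUE, margin = 0.1)
text(pima.rpart, use.n = TRUE, all = TRUE, cex = 0.8)

# the printed tree also lists every split variable that was used
print(pima.rpart)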
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts,
I am new to R and am using a decision tree model for getting segmentation rules.
A) Using behavioural data (attributes defining customer behaviour, for example
balances, number of accounts, etc.):
1. Clustering: cluster the behavioural data into a suitable number of clusters.
2. Decision tree: use an rpart classification tree to generate segmentation
rules, with the cluster number (cluster id) as the target
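A minimal sketch of that two-step workflow; the data frame customers and its columns balance, n_accounts and tenure are made-up placeholders for the behavioural attributes:

library(rpart)

# step 1: cluster the behavioural attributes (4 clusters chosen arbitrarily here)
behav <- customers[, c("balance", "n_accounts", "tenure")]
set.seed(42)
km <- kmeans(scale(behav), centers = 4)

# step 2: grow a classification tree with the cluster id as the target,
# then read the segmentation rules off the printed tree
customers$cluster <- factor(km$cluster)
seg.tree <- rpart(cluster ~ balance + n_accounts + tenure,
                  data = customers, method = "class")
print(seg.tree)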
2008 Jan 29
2
rpart error when constructing a classification tree
I am trying to make a decision tree using rpart. The function runs very
quickly considering the size of the data (1742 rows, 163 columns). When I call
the summary command I get this:
> summary(bookings.cart)
Call:
rpart(formula = totalRev ~ ., data = bookings, method = "class")
n=1741 (1 observation deleted due to missingness)
   CP nsplit rel error
1   0      0         1
Error in yval[, 1] :
2007 Feb 26
2
survival analysis using rpart
Hello,
I use rpart to predict survival time and have a problem interpreting the
output of "estimated rate". Here is an example of what I do:
stagec <- read.table("http://www.stanford.edu/class/stats202/DATA/stagec.data",
                     col.names = c("pgtime", "pgstat", "age", "eet", "g2", "grade", "gleason",
2006 Aug 24
0
Classification tree with a random variable
Hi,
I am planning on using classification trees to build a predictive model for data which includes a random variable. I intend to use the R functions 'rpart' (and potentially also 'randomForest' and 'bagging').
I have a data set with 390 data points. The response variable is binary. There are a large number of variables (>20, both categorical and continuous). The
2006 Jul 18
1
Classification error rate increased by bagging - any ideas?
Hi,
I'm analysing some anthropometric data on fifty-odd skull bases. We know the
gender of each skull, and we are trying to develop a predictor to identify the
sex of unknown skulls.
Rpart with cross-validation produces two models - one of which predicts gender
well for males and poorly for females, and the other does the opposite (females
well, males poorly). In both cases the error
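For reference, a bagged version of an rpart classifier with an out-of-bag error estimate can be obtained from the ipred package; a sketch on built-in data, since the skull measurements themselves are not shown in the post:

library(rpart)
library(ipred)   # bagging() grows rpart trees underneath

set.seed(1)
bag <- bagging(Kyphosis ~ Age + Number + Start, data = kyphosis,
               nbagg = 50, coob = TRUE)
print(bag)   # includes the out-of-bag estimate of the misclassification error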
2008 Jun 17
0
Rpart description of tree groups
I'm making a few functions to generate LaTeX files describing
rpart objects that are then \input-ed into a larger document. So
far, the functions I have generate paragraphs containing
enumerations of the predictors in pruned trees and the number of
groups formed.
It's easy enough to recover these. For instance,
R> print(tree)
n= 878
node), split, n, loss, yval, (yprob)
      * denotes terminal node
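Both pieces of information can be pulled straight from the frame component of the rpart object instead of being parsed out of the printed text; a small sketch, keeping the object name tree from the post:

# predictors that survive in the (pruned) tree: split variables, minus the leaf marker
predictors.used <- setdiff(unique(as.character(tree$frame$var)), "<leaf>")

# number of groups formed = number of terminal nodes
n.groups <- sum(tree$frame$var == "<leaf>")

# a sentence ready to drop into the generated LaTeX file
sprintf("The pruned tree uses %s and forms %d groups.",
        paste(predictors.used, collapse = ", "), n.groups)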
2003 Dec 19
1
Question re labels in r-part (continuation of a thread from a while back)
Hello again
I have modeled a tree using rpart, with the DV being a log
transformation of the variable I am really interested in (I transformed
the DV due to extreme skewness). By default, text.rpart labels the
nodes with the value of yval, which in this case is not what I want; I'd
like the labels to be on the original metric, but label in text.rpart
requires a "column name of
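One workaround that is sometimes suggested is to back-transform the fitted node values in the frame of the object before labelling, since text.rpart reads its labels from there. A sketch under the assumption that the DV was modelled as log(y + 1); the data frame dat and its columns are made up:

library(rpart)

fit <- rpart(log(y + 1) ~ x1 + x2, data = dat, method = "anova")

# copy the object and put the node values back on the original metric
# (note: this back-transforms the mean of log(y + 1), not the mean of y itself)
fit.orig <- fit
fit.orig$frame$yval <- exp(fit$frame$yval) - 1

plot(fit.orig, uniform = TRUE)
text(fit.orig, use.n = TRUE)   # labels now show the back-transformed node values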
2017 Jun 13
2
Classification and Regression Tree for Survival Analysis
I am trying to use CART in a survival analysis. I have three variables of interest (all three ordinal - x, y and z, each with 5 categories) from which I want to make smaller groups (for example, the 1st category of X together with the 2nd and 3rd categories of Y, and the 2nd, 3rd and 4th categories of Z, etc.) based on their, let's say, association with mortality.
Now
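rpart can grow a survival tree directly when the response is a Surv object, in which case it uses its exponential (Poisson-based) splitting method; the terminal nodes then define the groups. A sketch with made-up names mydata, time, status, x, y and z:

library(rpart)
library(survival)

surv.tree <- rpart(Surv(time, status) ~ x + y + z, data = mydata)

print(surv.tree)           # each leaf is one group, with its estimated event rate
group <- surv.tree$where   # leaf membership for every observation
table(group)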
2012 Sep 04
1
predict rpart newdata - introduce only values variables used in the tree
Dear community,
I have a tree that at first included 23 variables. I have since pruned this
tree, and only 8 variables are involved.
I'd like to predict while supplying in newdata only the values of these 8
variables. However, as the tree was built with all 23, it asks me
for the other 15 values, even though it doesn't need them.
Is there a way to supply only these 8 values?
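A workaround that often comes up (not confirmed in this thread) is to keep all 23 columns in newdata but fill the 15 unused ones with NA: predict.rpart uses na.action = na.pass by default, and variables that never appear in a split are never consulted. A sketch with placeholder names pruned.fit, new8 (a data frame holding only the 8 used columns) and vars.all (the 23 original predictor names):

library(rpart)

newdata <- new8
newdata[setdiff(vars.all, names(new8))] <- NA   # dummy NA columns for the unused predictors

pred <- predict(pruned.fit, newdata = newdata)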
2003 Jul 21
0
Changing the labels on a regression tree (repeat post - with added clarity)
Hello
I posted a very similar question last week, but the responses I
received indicated that my post was unclear....
I have a regression tree created in rpart with
tr.logypsx <- rpart(log(YPSX + 1) ~ AGE + drugfact + sexfact + as.numeric(OBSX) +
                      WINDLE + EABUSED + PABAU + positive.par + control.par + lenient.par,
                    xval = 10, method = 'anova', cp = 0.0001, data = duhray2)
and then
2012 Dec 19
0
Fitting a predefined classification tree
Hi,
I've searched R-help and haven't found an answer. I have a set of data from which I can create a classification tree using rpart.
However, what I'd like to do is predefine the blank structure of the binary tree (i.e., which nodes to include) and then use a package like rpart to fit the optimal splitting criteria at each of the predefined nodes.
Does such a package exist?
2009 Mar 11
2
Couple of Questions about Classification trees
So I have 2 sets of data - a training data set and a test data set. I've been
doing the analysis on the training data set and then using predict to feed
the test data through the fitted model. There are 114 rows in the training data
and 117 in the test data, and 1024 columns in both. It's actually the same
set of data split into two. The rows are made up of 5 different numbers. They
do represent
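For a workflow like that, the test-set error rate is usually read off a confusion matrix of predicted versus true classes; a sketch with placeholder objects train, test and a class column y:

library(rpart)

fit <- rpart(y ~ ., data = train, method = "class")

pred <- predict(fit, newdata = test, type = "class")
tab <- table(predicted = pred, actual = test$y)   # confusion matrix
tab

1 - sum(diag(tab)) / sum(tab)   # test-set misclassification rate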
2012 Mar 05
1
decision/classification trees with fewer than 20 objects
Hi!
I'm trying to construct and plot a decision tree to classify a set of only 8 objects and tried to use the rpart and tree functions, but get an error message both times:
rpart: fit is not a tree, just a root
tree: cannot plot singlenode tree
I read in the post 'question regression trees' that rpart doesn't split a set of fewer than 20 objects...so I guess the same holds true for
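The 20-object limit is rpart's default minsplit, and it can be lowered through rpart.control so that even a very small data set gets split (whether a tree grown on 8 objects is statistically meaningful is another question). A sketch, with smalldata and its outcome cls standing in for the 8-object data set:

library(rpart)

fit <- rpart(cls ~ ., data = smalldata, method = "class",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = 0))

plot(fit, uniform = TRUE, margin = 0.1)
text(fit, use.n = TRUE)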