similar to: How to measure/rank “variable importance” when using rpart?

Displaying 20 results from an estimated 10000 matches similar to: "How to measure/rank “variable importance” when using rpart?"

2011 Jan 24
1
How to measure/rank “variable importance” when using rpart?
--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring the variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)?* ---end ---- Consider the following printout from rpart: summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,
2010 Mar 07
1
Is there an equivalence of lm's “anova” for an rpart object ?
Simple example: # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start, method="class", data=kyphosis) Now I would like to know how can I measure the "importance" of each of my three explanatory variables (Age, Number, Start) in the model? If this was a regression model, I could have looked at p values from the
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members, When building a CART model (specifically a classification tree) using rpart, it is sometimes obvious that there are variables (X's) that are meaningful for predicting some of the outcome (y) variables - while other predictors are relevant for other outcome variables (y's only). *How can it be estimated, which explanatory variable is "used" for which of
2011 Apr 08
4
Rpart decision tree
Dear useRs: I try to plot an rpart object but cannot get a nice tree structure plot. I am using plot.rpart and text.rpart (please see below) but the branches that connect the nodes overlap the text in the ellipses and rectangles. Is there a way to get a clean nice tree plot (as in the Rpart Mayo report)? I work under Windows and use R2.11.1 with rpart version 3.1-46. Thank you. Tudor ...
2011 Jan 26
2
Extracting the terms from an rpart object
Hello all, I wish to extract the terms from an rpart object. Specifically, I would like to be able to know what is the response variable (so I could do some manipulation on it). But in general, such a method for rpart will also need to handle a "." case (see fit2) Here are two simple examples: fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis) fit1$call fit2 <-
2012 Apr 12
2
enableJIT(2) causes major slow-up in rpart
Hello, While exploring the JIT capabilities offered by the {compiler} package, I found that using enableJIT(2) can *slow* the rpart function (from the {rpart} package) by a factor of about 10. Here is example code to run: library(rpart) require(compiler) enableJIT(0) # just making sure that JIT is off # We could also use enableJIT(1) and it would be fine fo
2011 Jun 13
1
In rpart, how is "improve" calculated? (in the "class" case)
Hi all, I apologize in advance if I am missing something very simple here, but since I failed to resolve this myself, I'm sending this question to the list. I would appreciate any help in understanding how the rpart function is (exactly) computing the "improve" (which is given in fit$split), and how it differs when using the split='information' vs split='gini'
2001 Jul 12
2
rpart puzzle
I've been using the package rpart with R 1.3.0 for Windows to produce simple classification trees for some measurement data from paleontological specimens. Both the rpart documentation and the output confirm that the program produces splits on continuous data that leave "holes" in the data. It is probably of little practical importance, but is there a reason why the binary
2008 Jan 29
2
rpart error when constructing a classification tree
I am trying to make a decision tree using rpart. The function runs very quickly considering the size of the data (1742, 163). When I call the summary command I get this: > summary(bookings.cart) Call: rpart(formula = totalRev ~ ., data = bookings, method = "class") n=1741 (1 observation deleted due to missingness) CP nsplit rel error 1 0 0 1 Error in yval[, 1] :
2010 Oct 12
6
Rpart query
Hi, Being a novice this is my first usage of R. I am trying to use rpart for building a decision tree in R. And I have the following dataframe:

Outlook   Temp  Humidity  Windy  Class
Sunny     75    70        Yes    Play
Sunny     80    90        Yes    Don't Play
Sunny     85    85        No     Don't Play
Sunny     72    95        No     Don't Play
Sunny     69    70        No     Play
Overcast  72    90        Yes    Play
Overcast  83    78        No     Play
Overcast  64    65        Yes    Play
Overcast  81    75
2011 Dec 02
1
CART with rpart
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111202/b4d64bba/attachment.pl>
2009 May 12
1
questions on rpart (tree changes when rearrange the order of covariates?!)
Greetings, I am using rpart for classification with "class" method. The test data is the Indian diabetes data from package mlbench. I fitted a classification tree firstly using the original data, and then exchanged the order of Body mass and Plasma glucose which are the strongest/important variables in the growing phase. The second tree is a little different from the first one. The
2011 Jul 29
1
help with predict.rpart
data=read.table("http://statcourse.com/research/boston.csv", , sep=",", header = TRUE) library(rpart) fit=rpart (MV~ CRIM+ZN+INDUS+CHAS+NOX+RM+AGE+DIS+RAD+TAX+ PT+B+LSTAT) predict(fit,data[4,]) plot only reveals part of the tree, in contrast to the results one obtains with CART or C5 -------- Original Message -------- Subject: Re: [R] help with rpart From: Sarah
2009 May 22
1
bug in rpart?
Greetings, I checked the Indian diabetes data again and got one tree for the data with reordered columns and another tree for the original data. I compared these two trees: the split points are exactly the same, but the fitted classes are not the same for some cases, and the misclassification errors are different too. I know how CART deals with ties --- even we are using the
2004 Jun 04
1
rpart
Hello everyone, I'm a newbie to R and to CART so I hope my questions don't seem too stupid. 1.) My first question concerns the rpart() method. Which method does rpart use in order to get the best split - entropy impurity, Bayes error (min. error) or Gini index? Is there a way to make it use the entropy impurity? The second and third question concern the output of the printcp() function.
2011 Jun 21
0
How does rpart computes "improve" for split="information"?? (which seems to be different then the "gini" case)
Hello dear R-help members, I would appreciate any help in understanding how the rpart function computes the "improve" (which is given in fit$split) when using the split='information' parameter. Thanks to Professor Atkinson's help, I was able to find how this is done in the case that split='gini'. By following the explanation here:
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts, I am new to R and am using a decision tree model to get segmentation rules. A) Using behavioural data (attributes defining customer behaviour, for example balances, number of accounts, etc.) 1. Clustering: Cluster behavioural data into a suitable number of clusters 2. Decision Tree: Using an rpart classification tree to generate rules for segmentation, using cluster number (cluster id) as target
2011 Jan 26
1
Inconsistencies in the rpart.object help file?
Hello all, I was going through the help for ?rpart.object and noticed some inconsistencies. Some might be a mistake in the help file and some might be my misunderstanding. The help in the section: value -> frame (first paragraph), states that: > yval, the fitted value of the response at each node, *and splits, a two > column matrix of left and right split labels for each node. *
2012 Mar 04
1
rpart package, text function, and round of class counts
I run the following code: library(rpart) data(kyphosis) fit <- rpart(Kyphosis ~ ., data=kyphosis) plot(fit) text(fit, use.n=TRUE) The text labels represent the count of each class at the leaf node. Unfortunately, the numbers are rounded and in scientific notation rather than the exact number of examples sorted by that node in each class. The plot is supposed to look like
2010 May 03
1
rpart, cross-validation errors question
I ran this code (several times) from the Quick-R web page ( http://www.statmethods.net/advstats/cart.html) but my cross-validation errors increase instead of decrease (same thing happens with an unrelated data set). Why does this happen? Am I doing something wrong? # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start,