thr3ads.net - similar to: "How to measure/rank “variable importance” when using rpart?"

Displaying 20 results from an estimated 10000 matches similar to: "How to measure/rank “variable importance” when using rpart?"

How to measure/rank “variable importance” when using rpart?

2011 Jan 24

--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring the variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)?* ---end ---- Consider the following printout from rpart: summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,

Is there an equivalence of lm's “anova” for an rpart object ?

2010 Mar 07

Simple example: # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start, method="class", data=kyphosis) Now I would like to know how can I measure the "importance" of each of my three explanatory variables (Age, Number, Start) in the model? If this was a regression model, I could have looked at p values from the

rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)

2010 Dec 14

Hi dear R-help members, When building a CART model (specifically a classification tree) using rpart, it is sometimes obvious that there are variables (X's) that are meaningful for predicting some of the outcome (y) variables, while other predictors are relevant for other outcome variables (y's only). *How can it be estimated, which explanatory variable is "used" for which of

Rpart decision tree

2011 Apr 08

Dear useRs: I am trying to plot an rpart object but cannot get a nice tree structure plot. I am using plot.rpart and text.rpart (please see below), but the branches that connect the nodes overlap the text in the ellipses and rectangles. Is there a way to get a clean, nice tree plot (as in the Rpart Mayo report)? I work under Windows and use R 2.11.1 with rpart version 3.1-46. Thank you. Tudor ...

Extracting the terms from an rpart object

2011 Jan 26

Hello all, I wish to extract the terms from an rpart object. Specifically, I would like to be able to know what is the response variable (so I could do some manipulation on it). But in general, such a method for rpart will also need to handle a "." case (see fit2) Here are two simple examples: fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis) fit1$call fit2 <-

enableJIT(2) causes major slow-up in rpart

2012 Apr 12

Hello, While exploring the JIT capabilities offered through the {compiler} package, I found that using enableJIT(2) can *slow* the rpart function (from the {rpart} package) by a factor of about 10. Here is an example code to run: library(rpart) require(compiler) enableJIT(0) # just making sure that JIT is off # We could also use enableJIT(1) and it would be fine fo

In rpart, how is "improve" calculated? (in the "class" case)

2011 Jun 13

Hi all, I apologize in advance if I am missing something very simple here, but since I failed to resolve this myself, I'm sending this question to the list. I would appreciate any help in understanding how the rpart function is (exactly) computing the "improve" (which is given in fit$split), and how it differs when using the split='information' vs split='gini'

rpart puzzle

2001 Jul 12

I've been using the package rpart with R 1.3.0 for Windows to produce simple classification trees for some measurement data from paleontological specimens. Both the rpart documentation and the output confirm that the program produces splits on continuous data that leave "holes" in the data. It is probably of little practical importance, but is there a reason why the binary

rpart error when constructing a classification tree

2008 Jan 29

I am trying to make a decision tree using rpart. The function runs very quickly considering the size of the data (1742, 163). When I call the summary command I get this:

> summary(bookings.cart)
Call:
rpart(formula = totalRev ~ ., data = bookings, method = "class")
  n=1741 (1 observation deleted due to missingness)

  CP nsplit rel error
1  0      0         1
Error in yval[, 1] :

Rpart query

2010 Oct 12

Hi, Being a novice, this is my first usage of R. I am trying to use rpart for building a decision tree in R. And I have the following dataframe:

Outlook   Temp  Humidity  Windy  Class
Sunny     75    70        Yes    Play
Sunny     80    90        Yes    Don't Play
Sunny     85    85        No     Don't Play
Sunny     72    95        No     Don't Play
Sunny     69    70        No     Play
Overcast  72    90        Yes    Play
Overcast  83    78        No     Play
Overcast  64    65        Yes    Play
Overcast  81    75

CART with rpart

2011 Dec 02

An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111202/b4d64bba/attachment.pl>

questions on rpart (tree changes when rearrange the order of covariates?!)

2009 May 12

Greetings, I am using rpart for classification with "class" method. The test data is the Indian diabetes data from package mlbench. I fitted a classification tree firstly using the original data, and then exchanged the order of Body mass and Plasma glucose which are the strongest/important variables in the growing phase. The second tree is a little different from the first one. The

help with predict.rpart

2011 Jul 29

data=read.table("http://statcourse.com/research/boston.csv", sep=",", header = TRUE) library(rpart) fit=rpart (MV~ CRIM+ZN+INDUS+CHAS+NOX+RM+AGE+DIS+RAD+TAX+ PT+B+LSTAT) predict(fit,data[4,]) plot only reveals part of the tree in contrast to the results one obtains with CART or C5 -------- Original Message -------- Subject: Re: [R] help with rpart From: Sarah

bug in rpart?

2009 May 22

Greetings, I checked the Indian diabetes data again and got one tree for the data with reordered columns and another tree for the original data. I compared these two trees: the split points are exactly the same, but the fitted classes are not the same for some cases. And the misclassification errors are different too. I know how CART deals with ties --- even we are using the

rpart

2004 Jun 04

Hello everyone, I'm a newbie to R and to CART so I hope my questions don't seem too stupid. 1.) My first question concerns the rpart() method. Which method does rpart use in order to get the best split - entropy impurity, Bayes error (min. error) or Gini index? Is there a way to make it use the entropy impurity? The second and third question concern the output of the printcp() function.

How does rpart computes "improve" for split="information"?? (which seems to be different then the "gini" case)

2011 Jun 21

Hello dear R-help members, I would appreciate any help in understanding how the rpart function computes the "improve" (which is given in fit$split) when using the split='information' parameter. Thanks to Professor Atkinson's help, I was able to find how this is done in the case that split='gini'. By following the explanation here:

Decision tree model using rpart ( classification

2011 Nov 04

Hi Experts, I am new to R and am using a decision tree model to get segmentation rules. A) Using behavioural data (attributes defining customer behaviour, for example balances, number of accounts, etc.) 1. Clustering: Cluster behavioural data to a suitable number of clusters 2. Decision Tree: Using rpart classification tree for generating rules for segmentation using cluster number (cluster id) as target

Inconsistencies in the rpart.object help file?

2011 Jan 26

Hello all, I was going through the help for ?rpart.object and noticed some inconsistencies. Some might be a mistake in the help file and some might be my misunderstanding. The help in the section: value -> frame (first paragraph), states that: > yval, the fitted value of the response at each node, *and splits, a two > column matrix of left and right split labels for each node. *

rpart package, text function, and round of class counts

2012 Mar 04

I run the following code: library(rpart) data(kyphosis) fit <- rpart(Kyphosis ~ ., data=kyphosis) plot(fit) text(fit, use.n=TRUE) The text labels represent the count of each class at the leaf node. Unfortunately, the numbers are rounded and in scientific notation rather than the exact number of examples sorted by that node in each class. The plot is supposed to look like

rpart, cross-validation errors question

2010 May 03

I ran this code (several times) from the Quick-R web page (http://www.statmethods.net/advstats/cart.html) but my cross-validation errors increase instead of decrease (same thing happens with an unrelated data set). Why does this happen? Am I doing something wrong? # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start,
