similar to: question on rpart

Displaying 20 results from an estimated 20000 matches similar to: "question on rpart"

2007 Dec 16
1
paste dependent variable in formula (rpart)?
Hello, I'm trying to replace different target variables in rpart with a function. The data.frame always has the target variable as its last column. With the attempt below, the target variable also ends up among the explanatory variables!? Does anybody have advice on how to avoid this? rp1 <- rpart(eval(parse(text=paste(names(train[length(train)])))) ~ . , data=train,cp=0.0001) regards & many thanks
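A minimal sketch of one way to build the formula without eval/parse, assuming (as in the post) that the data frame train carries the target in its last column; reformulate() keeps the target off the right-hand side:

library(rpart)
target <- names(train)[ncol(train)]                       # last column is the target
form   <- reformulate(setdiff(names(train), target), response = target)
rp1    <- rpart(form, data = train, cp = 0.0001)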
2012 May 15
1
caret: Error when using rpart and CV != LOOCV
Hi, I ran into the following problem when trying to build an rpart model using anything but LOOCV. Originally, I wanted to use k-fold partitioning, but every partitioning except LOOCV throws the following warning: ---- Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, : There were missing values in resampled performance measures. ----- Below are some
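The warning usually indicates that some resamples produced undefined performance statistics (for example, a fold on which the summary metric could not be computed). A hedged sketch of the intended k-fold setup, with the built-in iris data standing in for the poster's data:

library(caret)
library(rpart)
ctrl <- trainControl(method = "cv", number = 10)          # 10-fold CV instead of LOOCV
fit  <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)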
2009 Jun 09
3
rpart - the xval argument in rpart.control and in xpred.rpart
Dear R users, I'm working with the rpart package and want to evaluate the performance of user-defined split functions. I have some trouble understanding the meaning of the xval argument in the two functions rpart.control and xpred.rpart. In the former it is defined as the number of cross-validations, while in the latter it is defined as the number of cross-validation groups. If I am
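A small sketch of the two uses of xval, with the car.test.frame data that ships with rpart: 10 cross-validations while fitting, 10 cross-validation groups for explicit predictions.

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame,
             control = rpart.control(xval = 10))   # CV run during fitting, feeds xerror
xp  <- xpred.rpart(fit, xval = 10)                 # CV groups for explicit predictions
dim(xp)                                            # one column per cp value in fit$cptable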
2003 Apr 16
2
Jackknife and rpart
Hi, First, thanks to those who helped me see my gross misunderstanding of randomForest. I worked through a bagging tutorial and now understand the "many tree" approach. However, it is not what I want to do! My bagged errors are acceptable but I need to use the actual tree and need a single-tree application. I am using rpart for a classification tree but am interested in a more unbiased
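A hedged sketch of a leave-one-out (jackknife-style) error estimate built around a single rpart classification tree, using the kyphosis data as a stand-in since the poster's data is not shown:

library(rpart)
dat <- kyphosis
loo <- sapply(seq_len(nrow(dat)), function(i) {
  fit <- rpart(Kyphosis ~ Age + Number + Start, data = dat[-i, ], method = "class")
  as.character(predict(fit, dat[i, ], type = "class"))
})
mean(loo != dat$Kyphosis)    # leave-one-out misclassification estimate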
2009 Jul 26
3
Question about rpart decision trees (being used to predict customer churn)
Hi, I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating
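A sketch of supplying a loss matrix so that rpart weighs a missed churner more heavily; churn.df and its two-level factor churn are placeholders for the poster's data, and the 5:1 cost ratio is arbitrary:

library(rpart)
## loss[i, j] = cost of predicting class j when the truth is class i
loss <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE)    # missing a churner (row 2) costs 5
fit <- rpart(churn ~ ., data = churn.df, method = "class",
             parms = list(loss = loss))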
2004 Mar 19
2
How to collect trees grown by rpart
Jonathan, Try making a list instead of an array. See ?list. Also, did you look into random forests? I'm not sure what you want to do, but there might be methods there to do some of the work for you. Sean On 3/19/04 1:12 PM, "Jonathan Williams" <jonathan.williams at pharmacology.oxford.ac.uk> wrote: > I would like to collect the trees grown by rpart fits in an array,
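A minimal sketch of the list-based approach, with the kyphosis data standing in for the poster's data and bootstrap resamples standing in for however the trees are actually grown:

library(rpart)
trees <- vector("list", 25)
for (i in seq_along(trees)) {
  boot <- kyphosis[sample(nrow(kyphosis), replace = TRUE), ]
  trees[[i]] <- rpart(Kyphosis ~ Age + Number + Start, data = boot)
}
trees[[1]]    # individual fits come back out with double-bracket indexing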
2003 Feb 12
1
rpart v. lda classification.
I've been groping my way through a classification/discrimination problem, from a consulting client. There are 26 observations, with 4 possible categories and 24 (!!!) potential predictor variables. I tried using lda() on the first 7 predictor variables and got 24 of the 26 observations correctly classified. (Training and testing both on the complete data set --- just to get started.) I
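A hedged sketch of the kind of side-by-side comparison described, with the built-in iris data standing in for the 26-observation consulting data set (training and testing on the same data, as in the post):

library(MASS)     # lda()
library(rpart)
ldafit  <- lda(Species ~ ., data = iris)
cartfit <- rpart(Species ~ ., data = iris, method = "class")
table(predict(ldafit)$class, iris$Species)              # resubstitution confusion matrices
table(predict(cartfit, type = "class"), iris$Species)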
2010 Apr 30
1
how is xerror calculated in rpart?
Hi, I've searched online, in a few books, and in the archives, but haven't seen this. I believe that xerror is scaled to rel error on the first split. After fitting an rpart object, is it possible with a little math to determine the percentage of true classifications represented by an xerror value? -seth
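A sketch of the back-calculation, assuming a classification tree: xerror is scaled so that the root node has error 1, so multiplying by the root-node error rate gives an approximate cross-validated misclassification rate.

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
printcp(fit)                                                 # reports "Root node error"
root.err <- 1 - max(prop.table(table(kyphosis$Kyphosis)))    # same quantity by hand
abs.xerr <- fit$cptable[, "xerror"] * root.err               # approximate CV error rate
1 - abs.xerr                                                 # approximate proportion classified correctly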
2009 Jan 09
2
rpart with interval censored data crashes R
Hi Everyone, This example code results in R 'crashing'; that is, the R application closes with no warnings or error messages.
#-----------------------
myD <- read.table(stdin(), header=TRUE, nrows=20)
  Broth Salt   pH Temp    N  Y Growth
1   310  9.0 2.92   10 90.0 NA      0
2   615  6.0 7.82   30  1.0  2      1
3   217  2.0 7.34   10  7.0  8
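One plausible reconstruction of the call that triggers the crash (the full data and the exact formula are not shown, so both the response and the predictors below are guesses); rpart's survival splitting is documented for right-censored responses, which may be relevant for an interval-censored Surv() object:

library(rpart)
library(survival)
fit <- rpart(Surv(N, Y, type = "interval2") ~ Broth + Salt + pH + Temp,
             data = myD)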
2006 Sep 25
2
rpart
Dear r-help-list: If I use the rpart method like cfit<-rpart(y~.,data=data,...), what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree? If so, it's up to me to choose a subtree by using the printcp method. In the technical report from Atkinson and Therneau "An Introduction to recursive partitioning using the rpart
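A short sketch of the usual workflow: the stored tree is grown out to the limits set in rpart.control (not pruned back), and printcp()/prune() are then used to pick a subtree.

library(rpart)
cfit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
printcp(cfit)                                                   # cp table with cross-validated error
best <- cfit$cptable[which.min(cfit$cptable[, "xerror"]), "CP"]
pruned <- prune(cfit, cp = best)                                # subtree selected from the cp table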
2011 Jan 24
1
How to measure/rank ?variable importance when using rpart?
--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)?* ---end ---- Consider the following printout from rpart summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,
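In current versions of rpart the importance scores printed by summary() are also exposed directly on the fitted object; a short sketch with the same lung example (lung is in the survival package):

library(rpart)
library(survival)
fit <- rpart(time ~ age + ph.ecog + pat.karno, data = lung)
fit$variable.importance    # summed split improvements, including surrogate splits
summary(fit)               # same information in the printed summary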
2008 Jul 03
1
cross-validation in rpart
Hello list, I'm having a problem with custom functions in rpart, and before I tear my hair out trying to fix it, I want to make sure it's actually a problem. It seems that, when you write custom functions for rpart (init, split and eval), rpart no longer cross-validates the resulting tree to return errors. A simple test is to use the usersplits.R function to get a simple, custom
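One way to check is to compute cross-validated predictions by hand with xpred.rpart() and derive the errors yourself; the sketch below uses a built-in anova tree, and whether xpred.rpart() honours a user-written method would need to be verified separately:

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
xp  <- xpred.rpart(fit, xval = 10)                              # CV predictions, one column per cp
cv.rss <- apply(xp, 2, function(p) sum((car.test.frame$Mileage - p)^2))
cv.rss / sum((car.test.frame$Mileage - mean(car.test.frame$Mileage))^2)   # on the xerror scale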
2010 Dec 13
2
rpart.object help
Hi, Suppose I have generated an object using the following: fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis). When I print fit, I get the following:
n= 81
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 81 17 absent (0.7901235 0.2098765)
   2) Start>=8.5 62 6 absent (0.9032258 0.0967742)
     4) Start>=14.5 29 0 absent (1.0000000
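The printed node numbers follow the rule that the children of node k are 2k and 2k+1, and the per-node details live in the frame component; a small sketch:

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
row.names(fit$frame)                        # node numbers as printed
fit$frame[, c("var", "n", "dev", "yval")]   # split variable, size, loss, fitted value per node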
2009 Dec 15
1
user-written splits in rpart
Hi, I am trying to write my own split function for rpart. The aim is to do, instead of anova, a linear regression to determine the split (minimize some criterion like sum of rss left and right of the split). The regression (lm) should simply use the dependent and independent variables passed to rpart. I am aware of the example provided in the rpart source code, but stumbled on similar problems
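A heavily condensed sketch of the user-split scaffolding, adapted from the anova example in rpart's user-written-split vignette; the lm-based criterion the poster wants would go in the body of stemp in place of the mean-based goodness:

library(rpart)
itemp <- function(y, offset, parms, wt) {                  # init
  if (length(offset)) y <- y - offset
  list(y = y, parms = NULL, numresp = 1, numy = 1,
       summary = function(yval, dev, wt, ylevel, digits)
         paste("  mean =", format(signif(yval, digits))))
}
etemp <- function(y, wt, parms) {                          # eval: node label and deviance
  wmean <- sum(y * wt) / sum(wt)
  list(label = wmean, deviance = sum(wt * (y - wmean)^2))
}
stemp <- function(y, wt, x, parms, continuous) {           # split: goodness of each cutpoint
  n <- length(y)
  y <- y - sum(y * wt) / sum(wt)                           # center the response
  if (!continuous) stop("categorical predictors not handled in this sketch")
  temp     <- cumsum(y * wt)[-n]
  left.wt  <- cumsum(wt)[-n]
  right.wt <- sum(wt) - left.wt
  lmean    <- temp / left.wt
  rmean    <- -temp / right.wt
  goodness <- (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * y^2)
  list(goodness = goodness, direction = sign(lmean))
}
fit <- rpart(Mileage ~ Weight + HP, data = car.test.frame,
             method = list(eval = etemp, split = stemp, init = itemp))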
2007 Feb 15
2
Does rpart package have some requirements on the original data set?
Hi, I am currently studying decision trees using the rpart package in R. I artificially created a data set which includes the dependent variable (y) and a few independent variables (x1, x2...). The dependent variable y only comprises 0 and 1; 90% of y are 1 and 10% of y are 0. When I apply rpart to it, there is no splitting at all. I am wondering whether this is because of the
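A sketch of two levers that can make splits on a 90/10 outcome worthwhile: equal priors (or a loss matrix) and a lower cp; d and its binary factor y are placeholders for the artificial data described:

library(rpart)
fit <- rpart(y ~ ., data = d, method = "class",
             parms = list(prior = c(0.5, 0.5)),
             control = rpart.control(cp = 0.001, minsplit = 10))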
2010 May 26
1
how to Store loop output from a function
Hi, Dear R community, I am writing the following function to create one data set (tree.pred) and one vector (valid.out) from loops. Later, I want to use the data set from this loop to plot curves. I have tried return and list, but I cannot use the tree.pred data and valid.out vector. auc.tree <- function(msplit, mbucket) { tree.pred <- data.frame()
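A sketch of returning both objects in a named list and indexing the result afterwards; the loop body is omitted here just as it is in the post, and the argument values are illustrative:

auc.tree <- function(msplit, mbucket) {
  tree.pred <- data.frame()
  valid.out <- numeric(0)
  # ... loop that fills tree.pred and valid.out ...
  list(tree.pred = tree.pred, valid.out = valid.out)
}
res <- auc.tree(20, 7)
res$tree.pred
res$valid.out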
2001 Jul 12
2
rpart puzzle
I've been using the package rpart with R 1.3.0 for Windows to produce simple classification trees for some measurement data from paleontological specimens. Both the rpart documentation and the output confirm that the program produces splits on continuous data that leave "holes" in the data. It is probably of little practical importance, but is there a reason why the binary
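A small illustration of the behaviour described, using the car.test.frame data shipped with rpart; the chosen cutpoints can be read off the splits component and compared with the observed values:

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
fit$splits[, "index"]                  # cutpoints fall between observed Weight values
sort(unique(car.test.frame$Weight))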
2010 Sep 07
1
change the for loops with lapply
cv.fold <- function(i, size=3, rang=0.3){
  cat('Fold ', i, '\n')
  out.fold.c <- ((i-1)*c.each.part + 1):(i*c.each.part)
  out.fold.n <- ((i-1)*n.each.part + 1):(i*n.each.part)
  train.cv <- n.cc[-out.fold.c, c(2:2401, 2417)]
  train.nv <- n.nn[-out.fold.n, c(2:2401, 2417)]
  train.v <- rbind(train.cv, train.nv)  # training data for feature
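A sketch of the lapply equivalent, assuming cv.fold() is completed so that it returns the per-fold result (e.g. a list holding the fitted model and its validation output); the number of folds here is a placeholder:

k <- 10
folds <- lapply(seq_len(k), function(i) cv.fold(i, size = 3, rang = 0.3))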
2006 Dec 28
3
CV by rpart/mvpart
Dear R-list, I am using the rpart/mvpart-package for selecting a right-sized regression tree by 10-fold cross-validation. My question: Is there a possibility to find out for every observation in which of the ten folds it is lying? I want to use the same folds for validating another regression method (moving averages) in order to choose the better one. Thanks a lot, Pedro
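A hedged sketch of one way to reuse the same folds for both methods: draw the fold assignment yourself and pass it to xpred.rpart(), whose xval argument also accepts an explicit vector of group labels (car.test.frame stands in for the poster's data):

library(rpart)
set.seed(1)
grp <- sample(rep(1:10, length.out = nrow(car.test.frame)))   # fold id for every observation
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
xp  <- xpred.rpart(fit, xval = grp)                           # CV predictions with known folds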
2007 Sep 15
1
Class probabilities in rpart
Hi, the predict.rpart() function from the rpart library allows for calculating the class probabilities for a given test case instead of a discrete class label. How are these class probabilities derived? Is it simply the proportion of the majority class to all cases in a leaf node? Thanks in advance, Chris
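With default priors the predicted probabilities are simply the class proportions in the leaf the case falls into (one value per class, adjusted when priors or case weights are changed). A small sketch with the kyphosis data:

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
predict(fit, kyphosis[1:3, ], type = "prob")    # per-class probabilities from the leaf
predict(fit, kyphosis[1:3, ], type = "class")   # discrete class labels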