similar to: question on rpart

Displaying 20 results from an estimated 20000 matches similar to: "question on rpart"

2007 Dec 16
1
paste dependent variable in formula (rpart)?
Hello, I'm trying to replace different target variables in rpart with a function. The data.frame always has the target variable as its last column. With the attempt below, the target variable also ends up among the explanatory variables!? Does anybody have advice on how to avoid this? rp1 <- rpart(eval(parse(text=paste(names(train[length(train)])))) ~ . , data=train,cp=0.0001) regards & many thanks
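A minimal sketch of one way to build the formula without eval/parse, assuming (as in the post) that the data frame train carries the target in its last column; reformulate() keeps the target off the right-hand side:

library(rpart)
target <- names(train)[ncol(train)]                       # last column is the target
form   <- reformulate(setdiff(names(train), target), response = target)
rp1    <- rpart(form, data = train, cp = 0.0001)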
2012 May 15
1
caret: Error when using rpart and CV != LOOCV
Hi, I ran into the following problem when trying to build an rpart model using anything but LOOCV. Originally, I wanted to use k-fold partitioning, but every partitioning except LOOCV throws the following warning: ---- Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, : There were missing values in resampled performance measures. ----- Below are some
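The warning usually indicates that some resamples produced undefined performance statistics (for example, a fold on which the summary metric could not be computed). A hedged sketch of the intended k-fold setup, with the built-in iris data standing in for the poster's data:

library(caret)
library(rpart)
ctrl <- trainControl(method = "cv", number = 10)          # 10-fold CV instead of LOOCV
fit  <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)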
2009 Jun 09
3
rpart - the xval argument in rpart.control and in xpred.rpart
Dear R users, I'm working with the rpart package and want to evaluate the performance of user-defined split functions. I have some trouble understanding the meaning of the xval argument in the two functions rpart.control and xpred.rpart. In the former it is defined as the number of cross-validations, while in the latter it is defined as the number of cross-validation groups. If I am
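A small sketch of the two uses of xval, with the car.test.frame data that ships with rpart: 10 cross-validations while fitting, 10 cross-validation groups for explicit predictions.

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame,
             control = rpart.control(xval = 10))   # CV run during fitting, feeds xerror
xp  <- xpred.rpart(fit, xval = 10)                 # CV groups for explicit predictions
dim(xp)                                            # one column per cp value in fit$cptable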
2003 Apr 16
2
Jackknife and rpart
Hi, First, thanks to those who helped me see my gross misunderstanding of randomForest. I worked through a bagging tutorial and now understand the "many tree" approach. However, it is not what I want to do! My bagged errors are acceptable but I need to use the actual tree and need a single-tree application. I am using rpart for a classification tree but am interested in a more unbiased
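A hedged sketch of a leave-one-out (jackknife-style) error estimate built around a single rpart classification tree, using the kyphosis data as a stand-in since the poster's data is not shown:

library(rpart)
dat <- kyphosis
loo <- sapply(seq_len(nrow(dat)), function(i) {
  fit <- rpart(Kyphosis ~ Age + Number + Start, data = dat[-i, ], method = "class")
  as.character(predict(fit, dat[i, ], type = "class"))
})
mean(loo != dat$Kyphosis)    # leave-one-out misclassification estimate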
2009 Jul 26
3
Question about rpart decision trees (being used to predict customer churn)
Hi, I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating
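A sketch of supplying a loss matrix so that rpart weighs a missed churner more heavily; churn.df and its two-level factor churn are placeholders for the poster's data, and the 5:1 cost ratio is arbitrary:

library(rpart)
## loss[i, j] = cost of predicting class j when the truth is class i
loss <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE)    # missing a churner (row 2) costs 5
fit <- rpart(churn ~ ., data = churn.df, method = "class",
             parms = list(loss = loss))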
2004 Mar 19
2
How to collect trees grown by rpart
Jonathan, Try making a list instead of an array. See ?list. Also, did you look into random forests? I'm not sure what you want to do, but there might be methods there to do some of the work for you. Sean On 3/19/04 1:12 PM, "Jonathan Williams" <jonathan.williams at pharmacology.oxford.ac.uk> wrote: > I would like to collect the trees grown by rpart fits in an array,
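A minimal sketch of the list-based approach, with the kyphosis data standing in for the poster's data and bootstrap resamples standing in for however the trees are actually grown:

library(rpart)
trees <- vector("list", 25)
for (i in seq_along(trees)) {
  boot <- kyphosis[sample(nrow(kyphosis), replace = TRUE), ]
  trees[[i]] <- rpart(Kyphosis ~ Age + Number + Start, data = boot)
}
trees[[1]]    # individual fits come back out with double-bracket indexing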
2003 Feb 12
1
rpart v. lda classification.
I've been groping my way through a classification/discrimination problem, from a consulting client. There are 26 observations, with 4 possible categories and 24 (!!!) potential predictor variables. I tried using lda() on the first 7 predictor variables and got 24 of the 26 observations correctly classified. (Training and testing both on the complete data set --- just to get started.) I
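A hedged sketch of the kind of side-by-side comparison described, with the built-in iris data standing in for the 26-observation consulting data set (training and testing on the same data, as in the post):

library(MASS)     # lda()
library(rpart)
ldafit  <- lda(Species ~ ., data = iris)
cartfit <- rpart(Species ~ ., data = iris, method = "class")
table(predict(ldafit)$class, iris$Species)              # resubstitution confusion matrices
table(predict(cartfit, type = "class"), iris$Species)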
2010 Apr 30
1
how is xerror calculated in rpart?
Hi, I've searched online, in a few books, and in the archives, but haven't seen this. I believe that xerror is scaled to rel error on the first split. After fitting an rpart object, is it possible with a little math to determine the percentage of true classifications represented by an xerror value? -seth
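A sketch of the back-calculation, assuming a classification tree: xerror is scaled so that the root node has error 1, so multiplying by the root-node error rate gives an approximate cross-validated misclassification rate.

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
printcp(fit)                                                 # reports "Root node error"
root.err <- 1 - max(prop.table(table(kyphosis$Kyphosis)))    # same quantity by hand
abs.xerr <- fit$cptable[, "xerror"] * root.err               # approximate CV error rate
1 - abs.xerr                                                 # approximate proportion classified correctly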
2009 Jan 09
2
rpart with interval censored data crashes R
Hi Everyone, This example code results in R 'crashing'; that is, the R application closes with no warnings or error messages.
#-----------------------
myD <- read.table(stdin(), header=TRUE, nrows=20)
  Broth Salt   pH Temp    N  Y Growth
1   310  9.0 2.92   10 90.0 NA      0
2   615  6.0 7.82   30  1.0  2      1
3   217  2.0 7.34   10  7.0  8
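One plausible reconstruction of the call that triggers the crash (the full data and the exact formula are not shown, so both the response and the predictors below are guesses); rpart's survival splitting is documented for right-censored responses, which may be relevant for an interval-censored Surv() object:

library(rpart)
library(survival)
fit <- rpart(Surv(N, Y, type = "interval2") ~ Broth + Salt + pH + Temp,
             data = myD)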
2006 Sep 25
2
rpart
Dear r-help-list: If I use the rpart method like cfit<-rpart(y~.,data=data,...), what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree? If so, it's up to me to choose a subtree by using the printcp method. In the technical report from Atkinson and Therneau "An Introduction to recursive partitioning using the rpart
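A short sketch of the usual workflow: the stored tree is grown out to the limits set in rpart.control (not pruned back), and printcp()/prune() are then used to pick a subtree.

library(rpart)
cfit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
printcp(cfit)                                                   # cp table with cross-validated error
best <- cfit$cptable[which.min(cfit$cptable[, "xerror"]), "CP"]
pruned <- prune(cfit, cp = best)                                # subtree selected from the cp table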
2011 Jan 24
1
How to measure/rank ?variable importance when using rpart?
--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)?* ---end ---- Consider the following printout from rpart summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,
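In current versions of rpart the importance scores printed by summary() are also exposed directly on the fitted object; a short sketch with the same lung example (lung is in the survival package):

library(rpart)
library(survival)
fit <- rpart(time ~ age + ph.ecog + pat.karno, data = lung)
fit$variable.importance    # summed split improvements, including surrogate splits
summary(fit)               # same information in the printed summary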
2008 Jul 03
1
cross-validation in rpart
Hello list, I'm having a problem with custom functions in rpart, and before I tear my hair out trying to fix it, I want to make sure it's actually a problem. It seems that, when you write custom functions for rpart (init, split and eval), rpart no longer cross-validates the resulting tree to return errors. A simple test is to use the usersplits.R function to get a simple, custom
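One way to check is to compute cross-validated predictions by hand with xpred.rpart() and derive the errors yourself; the sketch below uses a built-in anova tree, and whether xpred.rpart() honours a user-written method would need to be verified separately:

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
xp  <- xpred.rpart(fit, xval = 10)                              # CV predictions, one column per cp
cv.rss <- apply(xp, 2, function(p) sum((car.test.frame$Mileage - p)^2))
cv.rss / sum((car.test.frame$Mileage - mean(car.test.frame$Mileage))^2)   # on the xerror scale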
2010 Dec 13
2
rpart.object help
Hi, Suppose I have generated an object using the following: fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis). When I print fit, I get the following:
n= 81
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 81 17 absent (0.7901235 0.2098765)
   2) Start>=8.5 62 6 absent (0.9032258 0.0967742)
     4) Start>=14.5 29 0 absent (1.0000000
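The printed node numbers follow the rule that the children of node k are 2k and 2k+1, and the per-node details live in the frame component; a small sketch:

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
row.names(fit$frame)                        # node numbers as printed
fit$frame[, c("var", "n", "dev", "yval")]   # split variable, size, loss, fitted value per node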
2009 Dec 15
1
user-written splits in rpart
Hi, I am trying to write my own split function for rpart. The aim is to do, instead of anova, a linear regression to determine the split (minimize some criterion like sum of rss left and right of the split). The regression (lm) should simply use the dependent and independent variables passed to rpart. I am aware of the example provided in the rpart source code, but stumbled on similar problems
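A heavily condensed sketch of the user-split scaffolding, adapted from the anova example in rpart's user-written-split vignette; the lm-based criterion the poster wants would go in the body of stemp in place of the mean-based goodness:

library(rpart)
itemp <- function(y, offset, parms, wt) {                  # init
  if (length(offset)) y <- y - offset
  list(y = y, parms = NULL, numresp = 1, numy = 1,
       summary = function(yval, dev, wt, ylevel, digits)
         paste("  mean =", format(signif(yval, digits))))
}
etemp <- function(y, wt, parms) {                          # eval: node label and deviance
  wmean <- sum(y * wt) / sum(wt)
  list(label = wmean, deviance = sum(wt * (y - wmean)^2))
}
stemp <- function(y, wt, x, parms, continuous) {           # split: goodness of each cutpoint
  n <- length(y)
  y <- y - sum(y * wt) / sum(wt)                           # center the response
  if (!continuous) stop("categorical predictors not handled in this sketch")
  temp     <- cumsum(y * wt)[-n]
  left.wt  <- cumsum(wt)[-n]
  right.wt <- sum(wt) - left.wt
  lmean    <- temp / left.wt
  rmean    <- -temp / right.wt
  goodness <- (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * y^2)
  list(goodness = goodness, direction = sign(lmean))
}
fit <- rpart(Mileage ~ Weight + HP, data = car.test.frame,
             method = list(eval = etemp, split = stemp, init = itemp))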
2007 Feb 15
2
Does rpart package have some requirements on the original data set?
Hi, I am currently studying decision trees using the rpart package in R. I artificially created a data set which includes the dependent variable (y) and a few independent variables (x1, x2...). The dependent variable y only comprises 0 and 1; 90% of y are 1 and 10% of y are 0. When I apply rpart to it, there is no splitting at all. I am wondering whether this is because of the
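A sketch of two levers that can make splits on a 90/10 outcome worthwhile: equal priors (or a loss matrix) and a lower cp; d and its binary factor y are placeholders for the artificial data described:

library(rpart)
fit <- rpart(y ~ ., data = d, method = "class",
             parms = list(prior = c(0.5, 0.5)),
             control = rpart.control(cp = 0.001, minsplit = 10))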
2010 May 26
1
how to Store loop output from a function
Hi, Dear R community, I am writing the following function to create one data set (tree.pred) and one vector (valid.out) from loops. Later, I want to use the data set from this loop to plot curves. I have tried return and list, but I cannot use the tree.pred data and valid.out vector. auc.tree <- function(msplit, mbucket) { tree.pred <- data.frame()
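A sketch of returning both objects in a named list and indexing the result afterwards; the loop body is omitted here just as it is in the post, and the argument values are illustrative:

auc.tree <- function(msplit, mbucket) {
  tree.pred <- data.frame()
  valid.out <- numeric(0)
  # ... loop that fills tree.pred and valid.out ...
  list(tree.pred = tree.pred, valid.out = valid.out)
}
res <- auc.tree(20, 7)
res$tree.pred
res$valid.out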
2001 Jul 12
2
rpart puzzle
I've been using the package rpart with R 1.3.0 for Windows to produce simple classification trees for some measurement data from paleontological specimens. Both the rpart documentation and the output confirm that the program produces splits on continuous data that leave "holes" in the data. It is probably of little practical importance, but is there a reason why the binary
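A small illustration of the behaviour described, using the car.test.frame data shipped with rpart; the chosen cutpoints can be read off the splits component and compared with the observed values:

library(rpart)
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
fit$splits[, "index"]                  # cutpoints fall between observed Weight values
sort(unique(car.test.frame$Weight))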
2010 Sep 07
1
change the for loops with lapply
cv.fold <- function(i, size=3, rang=0.3){
  cat('Fold ', i, '\n')
  out.fold.c <- ((i-1)*c.each.part + 1):(i*c.each.part)
  out.fold.n <- ((i-1)*n.each.part + 1):(i*n.each.part)
  train.cv <- n.cc[-out.fold.c, c(2:2401, 2417)]
  train.nv <- n.nn[-out.fold.n, c(2:2401, 2417)]
  train.v <- rbind(train.cv, train.nv)  # training data for feature
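A sketch of the lapply equivalent, assuming cv.fold() is completed so that it returns the per-fold result (e.g. a list holding the fitted model and its validation output); the number of folds here is a placeholder:

k <- 10
folds <- lapply(seq_len(k), function(i) cv.fold(i, size = 3, rang = 0.3))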
2006 Dec 28
3
CV by rpart/mvpart
Dear R-list, I am using the rpart/mvpart-package for selecting a right-sized regression tree by 10-fold cross-validation. My question: Is there a possibility to find out for every observation in which of the ten folds it is lying? I want to use the same folds for validating another regression method (moving averages) in order to choose the better one. Thanks a lot, Pedro
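A hedged sketch of one way to reuse the same folds for both methods: draw the fold assignment yourself and pass it to xpred.rpart(), whose xval argument also accepts an explicit vector of group labels (car.test.frame stands in for the poster's data):

library(rpart)
set.seed(1)
grp <- sample(rep(1:10, length.out = nrow(car.test.frame)))   # fold id for every observation
fit <- rpart(Mileage ~ Weight, data = car.test.frame)
xp  <- xpred.rpart(fit, xval = grp)                           # CV predictions with known folds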
2007 Sep 15
1
Class probabilities in rpart
Hi, the predict.rpart() function from the rpart library allows for calculating the class probabilities for a given test case instead of a discrete class label. How are these class probabilities derived? Is it simply the proportion of the majority class to all cases in a leaf node? Thanks in advance, Chris
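With default priors the predicted probabilities are simply the class proportions in the leaf the case falls into (one value per class, adjusted when priors or case weights are changed). A small sketch with the kyphosis data:

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
predict(fit, kyphosis[1:3, ], type = "prob")    # per-class probabilities from the leaf
predict(fit, kyphosis[1:3, ], type = "class")   # discrete class labels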