similar to: tree size in rpart()

2008 Jul 31
1
predict rpart: new data has new level
Hi. I use rpart to build a regression tree. Y is continuous. Now, I try to predict on a new set of data. In the new set of data, one of my x variables (called Incoterm, a factor) has a new level. I wonder why the error below appears, as the guide says "For factor predictors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and
2005 Dec 07
0
Are minbucket and minsplit rpart options working as expected?
Dear r-list: I am using rpart to build a tree on a dataset. First I obtain a perhaps too large tree: > arbol.bsvg.02 <- rpart(formula, data = bsvg, subset=grp.entr, control=rpart.control(cp=0.001)) > arbol.bsvg.02 n= 100000 node), split, n, loss, yval, (yprob) * denotes terminal node 1) root 100000 6657 0 (0.93343000 0.06657000) 2) meses_antiguedad_svg>=10.5 73899 3658
2002 Mar 29
1
memory error with rpart()
Dear all, I have a 100 iteration loop. Within each loop, there are some calls to rpart() like: ctl <- rpart.control(maxcompete=0, maxsurrogate=0, maxdepth=10) temp <- rpart(y~., x, w=wt, method="class", parms=list(split="gini"), control=ctl) res <- log(predict.rpart(temp, type="prob")) newres <- log(predict.rpart(temp, newdata=newx,
2008 Feb 26
1
predict.rpart question
Dear All, I have a question regarding predict.rpart. I use rpart to build classification and regression trees and I deal with data with a relatively large number of input variables (predictors). For example, I build an rpart model like this: rpartModel <- rpart(Y ~ X, method="class", minsplit =1, minbucket=nMinBucket,cp=nCp); and get the predictors used in building the model like
2007 Jan 03
1
User defined split function in Rpart
Dear all, I'm trying to work with the user-defined split function in rpart (file rpart\tests\usersplits.R in http://cran.r-project.org/src/contrib/rpart_3.1-34.tar.gz - see bottom of the email). Suppose I have the following data.frame (note that x's values are already sorted) > D y x 1 7 0.428 2 3 0.876 3 1 1.467 4 6 1.492 5 3 1.703 6 4 2.406 7 8 2.628 8 6 2.879 9 5 3.025 10 3 3.494
2001 Jul 02
1
text.rpart: Unwanted NA labels on terminal nodes (PR#1009)
Brian The following (which is new to rw1030) occurs with both Windows 98 & Windows ME. I have not tested behaviour under Unix or Linux, but I expect it is no different. text.rpart() prints unwanted NAs (presumably in the splitting criterion position) on terminal nodes. Criterion <- factor(paste("Leaf", 1:5)) Node <- factor(1:5)
2012 Apr 03
1
rpart error message
Hi R-helpers, I am using the rpart package for decision trees in R. We are invoking the R environment through JRI from our Java application. Hence, the result of an R command is returned in REXP and we use geterrMessage() to retrieve the error. When we execute the following command, cnr_model<-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal Length, method="class",
2007 Dec 10
1
Multiple Response CART Analysis
Dear R friends- I'm attempting to generate a regression tree with one gradient predictor and multiple responses, trying to test if change in size (turtle.data$Clength) acts as a single predictor of ten multiple diet taxa abundances (prey.data). Neither rpart nor mvpart seems to allow me to do multiple responses. (Or if they can, I'm not using the functions properly.) > library(rpart)
2010 May 26
1
how to Store loop output from a function
Hi, Dear R community, I am writing the following function to create one data set (*tree.pred*) and one vector (*valid.out*) from loops. Later, I want to use the data set from this loop to plot curves. I have tried return and list, but I cannot use the *tree.pred* data and *valid.out* vector. auc.tree<- function(msplit,mbucket) { * tree.pred<-data.frame()
2011 Dec 27
0
Using minsplit and unequal weights in rpart
Dear r-help mailing list, Is there a way to incorporate weights into the minsplit criteria in rpart, when the weights are uneven? I could not find a way for the minsplit threshold to take the weights into account, and when the weights are uneven it becomes an issue, as the following example shows. My current workaround is to expand the data into one in which each row is an observation, but that
2010 Feb 03
0
mboost: how to implement cost-sensitive boosting family
mboost contains a blackboost method to build tree-based boosting models. I tried to write my own "cost-sensitive" ada family. But obviously my understanding of how to implement the ngradient, loss, and offset functions is not right. I would greatly appreciate it if anyone could help me out, or show me how to write a cost-sensitive family, thanks! The following are some families I wrote ngradient <-
2009 Sep 08
0
Ada package question
Hi, I am using ada to predict a data set with 36 variables: ada(x~.,data=train,iter=Iter, control=rpart.control(maxdepth=4,cp=-1,minsplit=0,xval=0)) Can anyone tell me in layman's terms: maxdepth - how do you set this, and how do you change it to improve prediction success? cp - same question for this. minsplit - same question for this. How do I change all these parameters to my
2012 Dec 07
0
loop for calculating 1-se in rpart
Hi Listers I need to calculate and then plot a frequency histogram of the best tree calculated using the 1-se rule. I have included some code that has worked well for me in the past but it was only for selecting the minimum cross-validation error. I include the code for my model, some relevant output and the code for selecting and plotting the frequency histogram of minimum xerror. Here is the
2006 Apr 07
1
rpart.predict error--subscript out of bounds
Hi, I am using rpart to do leave-one-out cross validation, but ran into a problem. Data is a data frame: the first column is the subject id, the second column is the group id, and the remaining columns are numerical variables, > Data[1:5,1:10] sub.id group.id X3262.345 X3277.402 X3369.036 X3439.895 X3886.935 X3939.054 X3953.777 X3970.352 1 32613 HAM_TSP 417.7082 430.4895 619.4776 720.8246
2011 May 20
0
RPART error
Hi, I have been working on generating decision tree analyses on large numbers of simulation datasets using the RPART function. With some datasets, RPART is returning an error of "Error in yval[, 1] : incorrect number of dimensions". There seem to be certain types of splits that cause it to break and return this message. I am able to isolate the record at which this error message
2002 Aug 28
0
user defined function in rpart
Hi, I am trying to use the rpart library with my own set of functions on a survival object. I get an immediate segmentation fault when I try calling rpart with my list of functions. I get the same problem with the logrank example from Therneau's S-rpart library, though their anova example works. Should I report this as a bug, as even if my functions are structured improperly, that should lead to
2011 May 19
1
Specifying Splits When Using rpart
I am using the package rpart to explore various classification structures. The call looks like: seekhi1<-rpart(pvol~spec+a1+psize+eppres+numpt+icds+bivalcrt+stents+ppshare+ nhosp+nyrs,data=dat,method="class", control=rpart.control(minsplit=30,xval=10)) The output is 1) root 198 87 1 (0.5606061 0.4393939) 2) psize=1,2 122 43 1 (0.6475410 0.3524590)
2010 Aug 13
1
decision tree finetune
My decision tree grows with only one split, and based on what I see in E-Miner it should split on more variables. How can I adjust the splitting criteria in R? Also, is there a way to indicate that some variables are binary? For example, variable Info_G is binary, so in the results it would be nice to see "2) Info_G=0" instead of "2) Info_G<0.5". Thank you in advance! And thanks to Eric who
2005 Oct 18
1
Memory problems with large dataset in rpart
Dear helpers, I am a Dutch student from the Erasmus University. For my Bachelor thesis I have written a script in R using boosting by means of classification and regression trees. This script uses the predefined function rpart. My input file consists of about 4000 vectors, each having 2210 dimensions. In the third iteration R complains of a lack of memory, although in each iteration
2011 Jul 07
0
Can't reproduce ada example
Dear R Users, I'm having trouble reproducing the results in Section 5.1 of Culp, M., Johnson, K., Michailidis, G. (2006). ada: an R Package for Stochastic Boosting. Journal of Statistical Software, 16. They build and display a boosting model with the code: library("ada") n <- 12000 p <- 10 set.seed(100) x <- matrix(rnorm(n*p), ncol=p) y <-