search for: minsplit

Displaying 20 results from an estimated 27 matches for "minsplit".

2011 Dec 27
0
Using minsplit and unequal weights in rpart
Dear r-help mailing list, Is there a way to incorporate weights into the minsplit criterion in rpart when the weights are uneven? I could not find a way for the minsplit threshold to take the weights into account, and when the weights are uneven this becomes an issue, as the following example shows. My current workaround is to expand the data into one in which each row is an obser...
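A minimal sketch of the expansion workaround described in the post, assuming integer case weights in a column named w (the data and column names here are illustrative, not from the original post):

library(rpart)
## expand the data so each row represents one observation, replicating
## rows according to their integer weight; minsplit then effectively
## counts weighted observations instead of raw rows
d <- data.frame(y = c(0, 1, 1, 0), x = c(1, 2, 3, 4), w = c(1, 5, 2, 3))
d_expanded <- d[rep(seq_len(nrow(d)), times = d$w), c("y", "x")]
fit <- rpart(y ~ x, data = d_expanded,
             control = rpart.control(minsplit = 4))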
2005 Dec 07
0
Are minbucket and minsplit rpart options working as expected?
...r,
> arbol.bsvg.02
n= 100000
node), split, n, loss, yval, (yprob)
      * denotes terminal node
1) root 100000 6657 0 (0.9334300 0.0665700) *
And I get an "empty" tree. But there were branches in the original tree with more than 1000 observations. Something similar happens if I set minsplit (or both minbucket and minsplit) to a similar value: I end up with the same root, branch-less tree. Am I misreading something? Can anybody cast a light on the correct usage of minbucket (and/or minsplit) for me? Sincerely, Carlos J. Gil Bellosta http://www.datanalytics.com
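For reference, a split in rpart must satisfy minsplit, minbucket, and cp simultaneously, and per the rpart.control documentation, setting only one of minbucket/minsplit makes the other default to three times (or one third of) it, so large values easily leave only the root. A small sketch on the built-in iris data (not the poster's data) illustrating the effect:

library(rpart)
## specifying only minbucket makes minsplit default to 3 * minbucket,
## so here a split would need at least 3000 observations in the node
ctrl <- rpart.control(minbucket = 1000)
fit  <- rpart(Species ~ ., data = iris, control = ctrl)
print(fit)    # root-only tree: n = 150 < 3000, no split is even attempted
## relaxing both constraints (and cp) lets the tree grow again
ctrl2 <- rpart.control(minbucket = 10, cp = 0.001)
fit2  <- rpart(Species ~ ., data = iris, control = ctrl2)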
2010 Oct 12
2
repeating an analysis
...un it n=50 times and from each output pick the appropriate tree size and post it to a datafile where I can then look at the frequency distribution of tree sizes. Here is the code and output from a single run:
> fit1 <- rpart(CHAB~., data=chabun, method="anova", control=rpart.control(minsplit=10, cp=0.01, xval=10))
> printcp(fit1)
Regression tree:
rpart(formula = CHAB ~ ., data = chabun, method = "anova", control = rpart.control(minsplit = 10, cp = 0.01, xval = 10))
Variables actually used in tree construction:
[1] EXP LAT POC RUG
Root node error: 35904/33 = 1088
n= 33...
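A sketch of one way to do the repetition described above, assuming the chabun data frame from the post is loaded; picking the size at the minimum cross-validated error is an assumption about what "appropriate tree size" means here:

library(rpart)
n_runs <- 50
sizes  <- numeric(n_runs)
for (i in seq_len(n_runs)) {
  fit <- rpart(CHAB ~ ., data = chabun, method = "anova",
               control = rpart.control(minsplit = 10, cp = 0.01, xval = 10))
  cptab <- fit$cptable
  best  <- which.min(cptab[, "xerror"])   # row with smallest cross-validated error
  sizes[i] <- cptab[best, "nsplit"] + 1   # terminal nodes = nsplit + 1
}
table(sizes)                              # frequency distribution of tree sizes
# write.csv(data.frame(size = sizes), "tree_sizes.csv", row.names = FALSE)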
2010 May 26
1
how to Store loop output from a function
...um_genes","position", "acid_per", "base_per", "charge_per", "hydrophob_per", "polar_per", "out") train<-train[myvar] # update training set valid<-valid[myvar] control<-rpart.control(xval=10, cp=0.01, minsplit=5, minbucket=5) #control the size of the initial tree tree.fit <- rpart(out ~ ., method="class", data=train, control=control) # model fitting p.tree<- prune(tree.fit, cp=tree.fit$cptable[which.min(tree.fit$cptable[,"xerror"]),"CP"]) # prune the tree...
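On the question in the subject line, a common pattern is to pre-allocate a list and assign into it inside the loop; a sketch reusing the fit/prune step quoted above (the construction of train and valid for each iteration is elided, and the iteration count is illustrative):

results <- vector("list", 10)                  # pre-allocate storage outside the loop
for (i in 1:10) {
  ## ... build train and valid for this iteration as in the post ...
  control  <- rpart.control(xval = 10, cp = 0.01, minsplit = 5, minbucket = 5)
  tree.fit <- rpart(out ~ ., method = "class", data = train, control = control)
  p.tree   <- prune(tree.fit,
                    cp = tree.fit$cptable[which.min(tree.fit$cptable[, "xerror"]), "CP"])
  results[[i]] <- p.tree                       # store this iteration's pruned tree
}
length(results)                                # one fitted tree per iteration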
2007 Dec 10
1
Multiple Response CART Analysis
...ultiple diet taxa abundances (prey.data) Neither rpart nor mvpart seems to allow me to do multiple responses. (Or if they can, I'm not using the functions properly.) > library(rpart) > turtle.rtree<-rpart(prey.data~., data=turtle.data$Clength, method="anova", maxsurrogate=0, minsplit=8, minbucket=4, xval=10); plot(turtle.rtree); text(turtle.rtree) Error in terms.formula(formula, data = data) : '.' in formula and no 'data' argument When I switch response for predictor, it works. But this is the opposite of what I wanted to test and gives me splits at a...
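rpart itself fits a single response; the mvpart package (built on rpart, later archived from CRAN) was the usual route for multivariate regression trees, taking a numeric matrix on the left-hand side of the formula. A sketch under that assumption, reusing the post's object names; note also that data= must be a data frame, not a single column such as turtle.data$Clength:

library(mvpart)   # archived package; install from the CRAN archive if needed
## multivariate regression tree: matrix of prey abundances as the response,
## turtle measurements as predictors
turtle.mrt <- mvpart(data.matrix(prey.data) ~ ., data = turtle.data)
plot(turtle.mrt); text(turtle.mrt)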
2008 Feb 26
1
predict.rpart question
...ear All, I have a question regarding predict.rpart. I use rpart to build classification and regression trees and I deal with data with a relatively large number of input variables (predictors). For example, I build an rpart model like this rpartModel <- rpart(Y ~ X, method="class", minsplit =1, minbucket=nMinBucket,cp=nCp); and get predictors used in building the model like this colnamesUsed<-unique(rownames(rpartModel$splits)); When later I apply the rpart model to predict the new data I strip the unnecessary columns from the input data and only use X columns that exist in c...
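One caveat worth noting: rownames(rpartModel$splits) lists only variables that appear in (primary or surrogate) splits, while the fitted model's terms still reference every predictor from the formula, so predict() may complain if other formula variables are dropped from the new data. A sketch of pulling both sets of names (newData is a hypothetical new data frame; other names follow the post):

## variables actually used in splits, as in the post
usedInSplits   <- unique(rownames(rpartModel$splits))
## every predictor referenced by the fitted formula
allFormulaVars <- attr(rpartModel$terms, "term.labels")
## safer: keep all formula variables when preparing new data for predict()
newX <- newData[, allFormulaVars, drop = FALSE]
pred <- predict(rpartModel, newdata = newX, type = "class")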
2008 Feb 29
1
controlling for number of elements in each node of the tree in mvpart
Still about mvpart: is there any way I can control the number of elements in each node in the function mvpart? Specifically, how can I ask the partitioning to ignore nodes with fewer than 10 elements? Thanks! -Shu
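In rpart, which mvpart builds on, the relevant setting is minbucket, the minimum number of observations allowed in any terminal node. A sketch with hypothetical data d, assuming mvpart accepts the same control settings as rpart:

library(rpart)
## require at least 10 observations in every terminal node
fit <- rpart(y ~ ., data = d,
             control = rpart.control(minbucket = 10))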
2009 Sep 08
0
Ada package question
Hi, I am using ada to predict a data set with 36 variables: ada(x~., data=train, iter=Iter, control=rpart.control(maxdepth=4, cp=-1, minsplit=0, xval=0)). Can anyone tell me in layman's terms: maxdepth - how do you set this, and how do you change it to improve prediction success? cp - same question for this also. minsplit - same question for this also. How do I change all these parameters to my advantage? Greatly appreciate the help pc...
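For reference, the three rpart.control arguments mentioned have the following documented meanings in rpart; a sketch with commented values (the particular numbers are only the ones used in the post, not recommendations):

library(rpart)
ctrl <- rpart.control(
  maxdepth = 4,   # maximum depth of any node of the tree (root counts as depth 0);
                  # larger values allow deeper, more complex base trees
  cp       = -1,  # complexity parameter: a split must improve the overall fit by
                  # at least cp; cp <= 0 effectively disables this stopping rule
  minsplit = 0,   # minimum number of observations a node must contain
                  # before a split is even attempted
  xval     = 0    # number of cross-validations (0 = none, faster inside boosting)
)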
2010 Aug 13
1
decision tree finetune
...in advance! And thanks to Eric, who helped with my previous question about getting started with "rpart". Olga
> fit <- rpart(Retention ~ Info_G + AOPD + Mail + Xref_Umbr + Ins_Age + Discount + Xref_A + Con6 + Con5 + Con4 + Con3 + Con2 + Con1, data=Home, control=rpart.control(minsplit=5))
> fit
n= 48407
node), split, n, deviance, yval
      * denotes terminal node
1) root 48407 4730.642 0.8902225
  2) Info_G< 0.5 14280 1999.293 0.8316527 *
  3) Info_G>=0.5 34127 2661.865 0.9147303 *
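On a data set of this size, the knob that often matters more than minsplit is cp: by default (cp = 0.01) a split is kept only if it improves the fit by at least 1%, which is why the tree above stops after one split. A sketch of growing a larger tree and then pruning back, using the variable names from the post (the thresholds are illustrative):

fit <- rpart(Retention ~ Info_G + AOPD + Mail + Xref_Umbr + Ins_Age +
               Discount + Xref_A + Con6 + Con5 + Con4 + Con3 + Con2 + Con1,
             data = Home,
             control = rpart.control(minsplit = 5, cp = 0.001, xval = 10))
printcp(fit)                          # inspect cross-validated error by tree size
fit.pruned <- prune(fit, cp = 0.005)  # cut back to a more stable size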
2012 Dec 07
0
loop for calculating 1-se in rpart
...de for selecting and plotting the frequency histogram of minimum xerror. Here is the output that is being referenced in the code below:
Regression tree:
rpart(formula = chbiomsq ~ HC + BC + POC + RUG + Depth + Exp + DFP + FI + LAT, data = ch, method = "anova", control = rpart.control(minsplit = 10, cp = 0.01, xval = 10))
Variables actually used in tree construction:
[1] BC    Depth DFP   Exp
Root node error: 47456/99 = 479.35
n= 99
        CP nsplit rel error  xerror     xstd
1 0.344626      0   1.00000 1.02074 0.139585
2 0.179054      1   0.65537 0.76522 0.107470
3 0.072037...
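A sketch of the 1-SE selection step this thread is building toward, applied to a fitted rpart object called fit (the object name is an assumption). The rule: among all tree sizes in the cptable, pick the smallest tree whose xerror is within one xstd of the minimum xerror:

cptab   <- fit$cptable
best    <- which.min(cptab[, "xerror"])
thresh  <- cptab[best, "xerror"] + cptab[best, "xstd"]   # min xerror + 1 SE
## smallest tree whose cross-validated error is under the threshold
row.1se <- min(which(cptab[, "xerror"] <= thresh))
fit.1se <- prune(fit, cp = cptab[row.1se, "CP"])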
2018 Jan 01
1
Error in adabag
...or: Error in if (nrow(object$splits) > 0) { : argument is of length zero when I am running the following code:
train <- c(sample(1:27, 18), sample(28:54, 18), sample(55:81, 8))
a2011.adaboost <- boosting(median_kod ~ ., data = b[train, ], boos = TRUE, mfinal = 10, control = rpart.control(minsplit = 0))
Regards, Greg
2008 Jul 31
1
predict rpart: new data has new level
...tors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and frame$yval at the node is the prediction. " Many thanks. > mod <- rpart(y~., data=data.frame(y=y,x=x), method="anova", + cp=0.05, minsplit=100, minbucket=50, maxdepth=5) > predictLost <- predict(mod, newdata=data.frame(y=yLost, x=xLost), type="vector") Error in model.frame.default(Terms, newdata, na.action = act, xlev = attr(object, : factor 'x.Incoterm' has new level(s) MTD ---- Ch...
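A common workaround for that error is to recode the unseen levels in the new data before calling predict(), so every level is one the tree was grown with. A sketch under the assumption that the offending factor is a column named Incoterm in the x / xLost data frames (names inferred from the error message, not confirmed by the post):

## map levels unseen during training to NA (or to a catch-all training level)
trainLevels <- levels(x$Incoterm)
xLost$Incoterm[!(xLost$Incoterm %in% trainLevels)] <- NA
xLost$Incoterm <- factor(xLost$Incoterm, levels = trainLevels)
predictLost <- predict(mod, newdata = data.frame(y = yLost, x = xLost),
                       type = "vector")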
2012 Apr 03
1
rpart error message
...ult of the R command is returned in a REXP and we use geterrMessage() to retrieve the error. When we execute the following command, cnr_model<-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal Length, method="class", parms=list(split="gini",prior=c()), control=rpart.control(minsplit=2, na.action=na.pass,cp=0.001,usesurrogate=1,maxsurrogate=2,surrogatestyle=0,maxdepth=20,xval=10)) we get the error message "Error: unexpected symbol in "a<-cnr_model<-rpart(as.factor(Species)~Sepal Length"". The REXP returned is not null and the geterrMessage() call doe...
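That particular message is a parse error rather than an rpart error: "Sepal Length" contains a space, so the formula cannot be parsed at all. Quoting such names with backticks, or using syntactic names like the iris columns Sepal.Length, avoids it. A sketch using the built-in iris data rather than the poster's setup:

library(rpart)
## note: na.action is an argument to rpart() itself, not rpart.control(),
## and an empty prior=c() has been dropped from parms here
cnr_model <- rpart(as.factor(Species) ~ Sepal.Length + Sepal.Width + Petal.Length,
                   data = iris, method = "class",
                   parms = list(split = "gini"),
                   control = rpart.control(minsplit = 2, cp = 0.001,
                                           usesurrogate = 1, maxsurrogate = 2,
                                           surrogatestyle = 0, maxdepth = 20,
                                           xval = 10))
## if the columns really are named with spaces, backticks work:
## rpart(as.factor(Species) ~ `Sepal Length` + `Sepal Width`, data = d)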
2001 Jul 02
1
text.rpart: Unwanted NA labels on terminal nodes (PR#1009)
...Criterion <- factor(paste("Leaf", 1:5))
Node <- factor(1:5)
assign("tree.df", data.frame(Criterion = Criterion, Node = Node))
nobs <- dim(tree.df)[[1]]
u.tree <- rpart(Node ~ Criterion, data = tree.df, all = F, control = list(minsplit = 2, minbucket = 1, cp = 9.999999999999998e-008))
plot(u.tree, uniform=T)
text(u.tree)
--please do not edit the information below-- Version: platform = i386-pc-mingw32 arch = x86 os = Win32 system = x86, Win32 status = major = 1 minor = 3.0...
2007 Feb 15
2
Does rpart package have some requirements on the original data set?
Hi, I am currently studying decision trees using the rpart package in R. I artificially created a data set which includes the dependent variable (y) and a few independent variables (x1, x2...). The dependent variable y only comprises 0 and 1: 90% of y are 1 and 10% of y are 0. When I apply rpart to it, there is no splitting at all. I am wondering whether this is because of the
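With a 90/10 class split and the default settings this often happens: with the observed class frequencies as priors and cp = 0.01, no candidate split may reduce the misclassification risk enough to be kept, so the root simply predicts the majority class. Two usual remedies, sketched with hypothetical data d (none of these names are from the post), are lowering cp or re-weighting the rare class via priors or a loss matrix:

library(rpart)
## option 1: make splits cheaper to accept
fit1 <- rpart(y ~ ., data = d, method = "class",
              control = rpart.control(cp = 0.001, minsplit = 10))
## option 2: tell rpart to treat the classes as equally likely a priori
fit2 <- rpart(y ~ ., data = d, method = "class",
              parms = list(prior = c(0.5, 0.5)))
## option 3: loss matrix (rows = true class, cols = predicted, zero diagonal);
## here misclassifying a true first-class observation costs 9x as much
fit3 <- rpart(y ~ ., data = d, method = "class",
              parms = list(loss = matrix(c(0, 1, 9, 0), nrow = 2)))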
2010 Oct 12
6
Rpart query
Hi, Being a novice, this is my first use of R. I am trying to use rpart for building a decision tree in R, and I have the following dataframe:
Outlook  Temp Humidity Windy Class
Sunny     75    70     Yes   Play
Sunny     80    90     Yes   Don't Play
Sunny     85    85     No    Don't Play
Sunny     72    95     No    Don't Play
Sunny     69    70     No    Play
Overcast  72    90     Yes   Play
Overcast  83    78     No    Play
Overcast  64    65     Yes   Play
Overcast  81    75
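For a toy table this small (the classic play/don't-play data has only 14 rows in full), rpart's defaults of minsplit = 20 and cp = 0.01 will refuse to split at all, so they need to be relaxed. A sketch assuming the data frame is called weather and Class is a factor (both assumptions):

library(rpart)
fit <- rpart(Class ~ Outlook + Temp + Humidity + Windy,
             data = weather, method = "class",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = 0.001))
plot(fit); text(fit, use.n = TRUE)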
2002 Feb 13
0
tree size in rpart()
Dear all, I know in rpart(), one can control the tree size (i.e. number of terminal nodes) through rpart.control(), e.g. minsplit, minbucket, maxdepth etc. But is there any more direct way to specify the number of terminal nodes when rpart() does the recursive partitioning? Your help is highly appreciated! Regards, -Ji
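rpart has no direct "give me k leaves" argument; the usual idiom is to grow a large tree and then prune back to the cp value whose nsplit corresponds to the size wanted. A sketch with hypothetical data d and a target of 8 terminal nodes (both illustrative):

library(rpart)
fit    <- rpart(y ~ ., data = d, control = rpart.control(cp = 0, minsplit = 2))
cptab  <- fit$cptable
target <- 8                                   # desired number of terminal nodes
## terminal nodes = nsplit + 1; pick the largest tree not exceeding the target
row    <- max(which(cptab[, "nsplit"] + 1 <= target))
fit.k  <- prune(fit, cp = cptab[row, "CP"])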
2002 Aug 28
0
user defined function in rpart
...formatg(yval,digits),"\nn=", n,sep="")} else{paste(formatg(yval,digits))} }) }
tst.lst<-list(eval=tst.eval, split=tst.split, init=tst.init)
data(lung)
fit1 <- rpart(Surv(time, status) ~ age + ph.karno + meal.cal, data=lung, control=rpart.control(minsplit=30, xval=0, cp=.011), method=tst.lst)
2010 Feb 03
0
mboost: how to implement cost-sensitive boosting family
...t, loss = loss); model.blackboost <- blackboost(tr[,1:DIM], tr.y, family=CSAdaExp, weights=NULL, control=boost_control(mstop=MSTOP, nu=0.1,savedata=TRUE,save_ensembless=TRUE,trace=TRUE), tree_controls=ctree_control(teststat = "max",testtype = "Teststatistic",mincriterion = 0,minsplit = 2000, minbucket = 700,maxdepth = TREEDEPTH)); -------------------------------- regards, Yuchun Tang, Ph.D. Principal Engineer, Lead McAfee, Inc. 4501 North Point Parkway Suite 300 Alpharetta, GA 30022 Main: 770.776.2685 www.mcafee.com www.trustedsource.org www.linkedin.com/in/yuchuntang
2011 Jul 07
0
Can't reproduce ada example
....seed(100) x <- matrix(rnorm(n*p), ncol=p) y <- as.factor(c(-1,1)[as.numeric(apply(x^2, 1, sum) > 9.34) + 1]) indtrain <- sample(1:n, 2000, FALSE) train <- data.frame(y=y[indtrain], x[indtrain,]) test <- data.frame(y=y[-indtrain], x[-indtrain,]) control <- rpart.control(cp = -1,minsplit = 0,xval = 0,maxdepth = 1) gdis <- ada(y~., data = train, iter = 400, bag.frac = 1, nu = 1, control = control, test.x = test[,-1], test.y = test[,1]) gdis plot(gdis, TRUE, TRUE) summary(gdis, n.iter = 398) My problem is that my confusion matrix, testing results and diagnostic plots differ from...