Displaying 18 results from an estimated 18 matches for "minbucket".
2005 Dec 07
Are minbucket and minsplit rpart options working as expected?
...ffice>=0.5 148 35 0 (0.76351351 0.23648649) *
63) back_office< 0.5 250 109 1 (0.43600000 0.56400000) *
So I decide not to consider branches with fewer than 1000 observations, 1% of
the original number of observations. Therefore, according to the rpart.control
help pages, I set minbucket=1000. However,
> arbol.bsvg.02
n= 100000
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 100000 6657 0 (0.9334300 0.0665700) *
And I get an "empty" tree. But there were branches in the original tree with
more than 1000 observations. Something similar happ...
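A side note on how the two options named in this thread title interact: in rpart.control, when only minbucket is supplied, minsplit is set to three times its value. The minimal sketch below, assuming the rpart package is installed, prints the values actually in effect. It does not by itself explain the empty tree above, which may also depend on cp.

```r
library(rpart)

# Supplying only minbucket: minsplit is derived as minbucket * 3
ctrl <- rpart.control(minbucket = 1000)
ctrl$minbucket
ctrl$minsplit

# Supplying both keeps each value as given
ctrl2 <- rpart.control(minsplit = 2000, minbucket = 1000)
ctrl2$minsplit
```

Checking the control list this way before fitting makes it easier to see which size constraint is binding.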
2010 May 26
how to Store loop output from a function
"acid_per", "base_per", "charge_per", "hydrophob_per", "polar_per", "out")
train<-train[myvar] # update training set
control<-rpart.control(xval=10, cp=0.01, minsplit=5, minbucket=5) # control the size of the initial tree
tree.fit <- rpart(out ~ ., method="class", data=train,
control=control) # model fitting
p.tree<- prune(tree.fit,
cp=tree.fit$cptable[which.min(tree.fit$cptable[,"xerror"]),"CP"]) # prune the tree
#get the p...
2007 Jan 03
User defined split function in Rpart
...ame (note that x's values are already
> D
y x
1 7 0.428
2 3 0.876
3 1 1.467
4 6 1.492
5 3 1.703
6 4 2.406
7 8 2.628
8 6 2.879
9 5 3.025
10 3 3.494
11 2 3.496
12 6 4.623
13 4 4.824
14 6 4.847
15 2 6.234
16 7 7.041
17 2 8.600
18 4 9.225
19 5 9.381
20 8 9.986
Running rpart and setting minbucket=1 and maxdepth=1 we get the following
tree (which uses, by default, deviance):
> rpart(D$y~D$x,control=rpart.control(minbucket=1,maxdepth=1))
n= 20
node), split, n, deviance, yval
* denotes terminal node
1) root 20 84.80000 4.600000
2) D$x< 9.6835 19 72.63158 4.421053 *
3) D$x>=9....
2007 Dec 10
Multiple Response CART Analysis
...taxa abundances (prey.data). Neither rpart nor mvpart seems to allow me to do multiple responses. (Or if they can, I'm not using the functions properly.)
> library(rpart)
> turtle.rtree<-rpart(prey.data~., data=turtle.data$Clength, method="anova", maxsurrogate=0, minsplit=8, minbucket=4, xval=10); plot(turtle.rtree); text(turtle.rtree)
Error in terms.formula(formula, data = data) :
'.' in formula and no 'data' argument
When I switch response for predictor, it works. But this is the opposite of what I wanted to test and gives me splits at abundance valu...
2008 Feb 26
predict.rpart question
...have a question regarding predict.rpart. I use
rpart to build classification and regression trees and I deal with data with
relatively large number of input variables (predictors). For example, I build an
rpart model like this
rpartModel <- rpart(Y ~ X, method="class",
minsplit =1, minbucket=nMinBucket,cp=nCp);
and get predictors used in building the model like
When later I apply the rpart model to predict the new
data I strip the input data of unnecessary columns and only use X columns
that exist in colnamesUsed. U...
2012 Jan 19
ctree question
Hello. I have used the "party" package to generate a regression tree as
The above works fine. Orig data was my training data. I now have a test
data file (testdata), and would like to run the testdata through the above
tree to see predictions. I tried using the following function
2006 Apr 07
rpart.predict error--subscript out of bounds
...8 552.3632 719.9989 1306.6299 446.6184 1352.9955 867.4219
5 32629 HAM_TSP 898.8879 640.2680 342.5477 386.5816 811.6709 518.0244 715.9886 441.1622
For example, I use the first sample as the test set and the rest as the training set
> fit <- rpart(as.factor(Data[-1,2]) ~., Data[-1, -c(1:2 ) ], minbucket=2 )
> predict(fit, Data[1,],type='prob')
Error in predict.rpart(fit, Data[1, ]) : subscript out of bounds
but when I change the type parameter to 'class'
it works well
> predict(fit, Data[1,-c(1:2)],type='class')
[1] HTLV_Carrier
Levels: HAM_TSP HTLV_Carrier...
2008 Jul 31
predict rpart: new data has new level
...servation contains a level not used to grow the tree, it
is left at the deepest possible node and frame$yval at the node is the
prediction. "
Many thanks.
> mod <- rpart(y~., data=data.frame(y=y,x=x), method="anova",
+ cp=0.05, minsplit=100, minbucket=50, maxdepth=5)
> predictLost <- predict(mod, newdata=data.frame(y=yLost, x=xLost),
Error in model.frame.default(Terms, newdata, na.action = act, xlev =
attr(object, :
factor 'x.Incoterm' has new level(s) MTD
Chua Siang Li...
2001 Jul 02
text.rpart: Unwanted NA labels on terminal nodes (PR#1009)
...;- factor(paste("Leaf", 1:5))
Node <- factor(1:5)
assign("tree.df", data.frame(Criterion = Criterion, Node = Node))
nobs <- dim(tree.df)[[1]]
u.tree <- rpart(Node ~ Criterion, data = tree.df, all = F,
control = list(minsplit = 2, minbucket
= 1, cp = 9.999999999999998e-008))
plot(u.tree, uniform=T)
--please do not edit the information below--
platform = i386-pc-mingw32
arch = x86
os = Win32
system = x86, Win32
status =
major = 1
minor = 3.0
year = 2001
2011 Feb 10
R 2.12.1 Windows 32bit and 64bit - are numerical differences expected?
...it to a few simple lines of code to replicate the
differences (but had to stay with the weather dataset from rattle since
could not replicate on standard datasets yet).
model <- rpart(RainTomorrow ~ ., data=weather[-c(1, 2,
23)], control=rpart.control(minbucket=0))
Final row on 32bit: 9 0.01000000 23 0.1515152 1.1060606 0.1158273
Final row on 64bit: 9 0.01000000 23 0.1515152 1.0909091 0.1152273
Pretty minor, but different. I've not found any seed other than 41 (only
tried a few) that results in a difference.
2002 Feb 13
tree size in rpart()
Dear all,
I know in rpart(), one can control the tree size (i.e. number of
terminal nodes) through rpart.control(), e.g. minsplit, minbucket,
maxdepth etc. But is there any more direct way to specify the number of
terminal nodes when rpart() does the recursive partitioning? Your help
is highly appreciated!
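One common workaround for the question above, sketched here under stated assumptions rather than as a built-in rpart option: grow a deliberately large tree, then prune back using the CP value of the cptable row whose nsplit gives the desired size. The helper name prune_to_leaves is hypothetical, and the kyphosis data set shipped with rpart stands in for real data.

```r
library(rpart)

# Grow a deliberately large tree (cp = 0 disables the complexity stop)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, minsplit = 2,
                                     minbucket = 1, xval = 0))

# Hypothetical helper: prune to at most max_leaves terminal nodes
prune_to_leaves <- function(fit, max_leaves) {
  tab <- fit$cptable
  # A tree with k splits has k + 1 terminal nodes
  ok <- which(tab[, "nsplit"] <= max_leaves - 1)
  prune(fit, cp = tab[max(ok), "CP"])
}

small <- prune_to_leaves(fit, 4)
n_leaves <- sum(small$frame$var == "<leaf>")  # count terminal nodes
n_leaves
```

Because the cptable only lists the tree sizes on the cost-complexity sequence, the result may have fewer leaves than requested; an exact count is not always reachable this way.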
2010 Feb 03
mboost: how to implement cost-sensitive boosting family
...model.blackboost <- blackboost(tr[,1:DIM], tr.y, family=CSAdaExp,
weights=NULL, control=boost_control(mstop=MSTOP,
tree_controls=ctree_control(teststat = "max",testtype =
"Teststatistic",mincriterion = 0,minsplit = 2000, minbucket =
700,maxdepth = TREEDEPTH));
Yuchun Tang, Ph.D.
2011 May 20
RPART error
...ctly produce output. The second (problem.csv), with only one additional
record, will
return the error message and no output.
I am running R 2.13.0 on a Windows XP platform.
To reproduce the problem:
data <- read.csv("problem.csv", header=T)
x <- rpart(cad~v1+v2+v3+v4+v5+v6+v7+v8+v9+v10,data=data, method = "class",
control=rpart.control(minbucket=10))
Similar code run on "noproblem.csv" will not produce the error.
Any suggestions on how to proceed to debug this issue would be greatly
appreciated. I am a novice R...
2012 Apr 24
mvpart versus SPSS
...ild nodes. Now we would like
to proceed with fitting a multivariate tree. We only used pruning by the
way, no v-fold cross validation afterwards. Using the aforementioned
criteria in univariate analyses resulted in relatively large trees in SPSS,
but using mvpart with xv=1se, cp=0.000001,minsplit=5,minbucket=3 resulted in
a tree with only 1 or 2 splits. This makes us wonder what causes this
dramatic difference in the tree size produced by SPSS vs. mvpart. If I use
the "pick" option in mvpart I am able to "pick" the SPSS-tree, but the X-val
Relative Error is quite large. The plot loo...
2000 Mar 27
Behavior different inside function?
...ste(fn, sesnum, ".ps", sep="", collapse=NULL)
fm1 <- as.formula(fm)
ds <- read.table(file=dsn, header=T)
rownames(ds) <- ds$unit
nmavgres <- ds$mavgres * 1000
nravgres <- ds$ravgres * 1000
ds.mrpt <- rpart(formula=fm1, data=ds, control=rpart.control(minbucket=20))
plot(prune(ds.mrpt, cp=0.018))
text(prune(ds.mrpt, cp=0.018), digits=2)
post.rpart(prune(ds.mrpt, cp=0.018),
title=c(tit, paste("SES Quartile", sesnum, sep=" ",
collapse=NULL)), pretty=0,
filename=psn, hori...
2012 Jul 06
Plotting rpart trees with long list of class members
I have a class with 732 members, so using rpart.plot is giving me a tiny plot
in the middle of the window. Is there a good way to modify the plot, or
replace the long list with something like "group1"?
2005 May 25
Error with user defined split function in rpart (PR#7895)
...found by lining the groups up in this order
# and going from left to right, so that only m-1 splits need to
# be evaluated rather than 2^(m-1)
# goodness = m-1 values, as before.
# The reason for returning a vector of goodness is that the C routine
# enforces the "minbucket" constraint. It selects the best return value
# that is not too close to an edge.
temp2 <- function(y, wt, x, parms, continuous) {
print("***** START: TEMP2 *****");
n <- length(y)
# For binary y, get P(Y=0)/n and P(Y=1)/n at each split
temp <- cumsum(y...
2007 Feb 27
rpart minimum sample size
Is there an optimal / minimum sample size for attempting to construct a
classification tree using /rpart/?
I have 27 seagrass disturbance sites (boat groundings) that have been
monitored for a number of years. The monitoring protocol for each site
is identical. From the monitoring data, I am able to determine the
level of recovery that each site has experienced. Recovery is our