Displaying 20 results from an estimated 8000 matches similar to: "predict rpart newdata - introduce only values variables used in the tree"
2012 Feb 15
2
assign same legend colors than in the grouped data plot
Dear community,
I've plotted data, coloured by the factor variable var3.
In the legend, I'd like to assign the same colors as those used for the
factor (the factor has 5 levels).
I've been trying this but it doesn't work.
plot(var1, var2, xlab = "var1", ylab = "var2", col =var3 , bty='L')
legend(locator(1),c("level 1 var3",
2005 May 04
1
Difference between "tree" and "rpart"
In the help for rpart it says, "This differs from the tree function
mainly in its handling of surrogate variables." And it says that an
rpart object is a superset of a tree object. Both cite Breiman et al.
1984. Both call external code which looks like martian poetry to me.
I've seen posts in the archives where BDR, and other knowledgeable
folks, have said that rpart() is to be
2011 Nov 16
2
outlier identify in qqplot
Dear Community,
I want to identify outliers in my data. I don't know how to use the identify
command on the resulting plots.
I've gone through the help files and used the mahalanobis example for my purpose:
NormalMultivarianteComparefunc <- function(x) {
Sx <- cov(x)
D2 <- mahalanobis(x, colMeans(x), Sx)
plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x),
2013 Jan 27
2
rpart
Hi,
When I look at the summary of an rpart object run on my data, I get 7 nodes, but when I plot the rpart object, I get only 3 nodes. Shouldn't the number of nodes match between the two functions (summary and plot), or is it not always the same?
I look forward to your reply,
Carol
--------------------------------------------
> summary(rpart.res)
Call:
rpart(formula = mydata$class ~ ., data
2011 Jan 24
1
How to measure/rank variable importance when using rpart?
--- included message ----
Thus, my question is: *What common measures exist for ranking/measuring
variable importance of participating variables in a CART model? And how
can this be computed using R (for example, when using the rpart package)*
---end ----
Consider the following printout from rpart
summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung))
Node number 1: 228 observations,
2012 Aug 01
1
rpart package: why does predict.rpart require values for "unused" predictors?
After fitting and pruning an rpart model, it is often the case that one or
more of the original predictors is not used by any of the splits of the
final tree. It seems logical, therefore, that values for these "unused"
predictors would not be needed for prediction. But when predict() is called
on such models, all predictors seem to be required. Why is that, and can it
be easily
2008 May 12
3
help with rpart
Hi,
I am using rpart as part of my master's project. I am trying to print out
the resulting model using the plot() function along with the text() function.
I am having difficulties with labels being cut off. In the text() function, I am
using the use.n=T option to get the number of people in each node, but on
the lower and left parts of the plot the numbers get cut off. Thanks!
Linus
2008 Jul 22
2
rpart$where and predict.rpart
Hello there. I have fitted an rpart model.
> rpartModel <- rpart(y~., data=data.frame(y=y,x=x),method="class", ....)
and can use rpart$where to find out the terminal node to which each
observation belongs.
Now, I have a set of new data and used predict.rpart which seems to give
only the predicted value with no information similar to rpart$where.
May I know how
2008 Jul 31
1
predict rpart: new data has new level
Hi. I use rpart to build a regression tree. Y is continuous. Now, I am trying
to predict on a new set of data. In the new set of data, one of my x variables
(called Incoterm, a factor) has a new level.
I wonder why the error below appears as the guide says "For factor
predictors, if an observation contains a level not used to grow the tree, it
is left at the deepest possible node and
2011 Jul 29
1
help with predict.rpart
> data=read.table("http://statcourse.com/research/boston.csv", ,
sep=",", header = TRUE)
> library(rpart)
> fit=rpart (MV~ CRIM+ZN+INDUS+CHAS+NOX+RM+AGE+DIS+RAD+TAX+
PT+B+LSTAT)
predict(fit,data[4,])
plot only reveals part of the tree, in contrast to the results one obtains
with CART or C5
-------- Original Message --------
Subject: Re: [R] help with rpart
From: Sarah
2008 Feb 26
1
predict.rpart question
Dear All,
I have a question regarding predict.rpart. I use
rpart to build classification and regression trees, and I deal with data with a
relatively large number of input variables (predictors). For example, I build an
rpart model like this
rpartModel <- rpart(Y ~ X, method="class",
minsplit =1, minbucket=nMinBucket,cp=nCp);
and get predictors used in building the model like
2005 Oct 08
1
Rpart -- using predict() when missing data is present?
I am doing
> library(rpart)
> m <- rpart("y ~ x", D[insample,])
> D[outsample,]
y x
8 0.78391922 0.579025591
9 0.06629211 NA
10 NA 0.001593063
> p <- predict(m, newdata=D[9,])
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
invalid result from na.action
How do I persuade it to give me NA
2011 Mar 04
4
cv.lm syntax error
Dear all,
I've tried a multiple regression, and now I want to try a cross-validation.
I obtain this error (it must be something related to df) that I don't understand,
any help would be appreciated.
cv.lm(df= dat, lm2.52f, m=3)
Error in `[.data.frame`(df, , ynam) : undefined columns selected
lm2.52f is my lm object, dat is a dataframe where the variables involved in
.lm are
I tried CVlm
2012 Sep 21
1
prune in rpart: choose number terminal nodes
Dear community,
I've an rpart object, and I know the CP I want. I'd like to know if it's
possible also to fix the number of terminal nodes I want.
Thanks in advance,
2009 May 22
1
bug in rpart?
Greetings,
I checked the Indian diabetes data again and got one tree for the data with
reordered columns and another tree for the original data. I compared these
two trees: the split points are exactly the same, but the
fitted classes are not the same for some cases, and the misclassification
errors are different too. I know how CART deals with ties --- even if we are
using the
2012 Nov 01
0
oblique.tree : the predict function asserts the dependent variable to be included in "newdata"
Dear R community,
I have recently discovered the package oblique.tree and I must admit that
it was a nice surprise for me,
since I have actually made my own version of a kind of a classifier which
uses the idea of oblique splits (splits by means of hyperplanes).
So I am now interested in comparing these two classifiers.
But what I do not seem to understand is why the function
2010 Nov 18
1
predict() an rpart() model: how to ignore missing levels in a factor
I am using an algorithm to split my data set into two random sections
repeatedly, construct a model using rpart() on one, test on the other, and
average out the results.
One of my variables is a factor (crop) where each crop type has a code. Some
crop types occur infrequently or singly. When the data set is randomly
split, it may be that the first data set has a crop type which is not
present in
1999 Dec 23
1
rpart on Alpha under OSF
Running on an Alpha machine which reports (uname -a)
OSF1 bsdx01.bs.ehu.es V4.0 878 alpha
and using the binary distribution put together by Albrecht Gebhardt
(in http://cran.at.r-project.org/bin/osf/osf4.0/tar/alpha_ev5/) I
obtain core dumps whenever I try to use package rpart. I have R
REMOVE'd the rpart package, downloaded the source rpart_1.0-7.tar from
CRAN and
2012 May 03
1
NA's when subset in a dataframe
Dear community,
I'm having this silly problem.
I've a linear model. After fixing it, I wanted to know which data had
studentized residuals larger than 3, so I tried this:
d1 <- cooks.distance(lmmodel)
r <- sqrt(abs(rstandard(lmmodel)))
rstu <- abs(rstudent(lmmodel))
a <- cbind( mydata, d1, r,rstu)
alargerthan3 <- a[rstu >3, ]
And suddenly a[rstu >3, ] has
2001 Aug 02
1
Missing value in Rpart
Hi, all
Our understanding of how classification trees in Rpart treat missing values is
that if the variable is ordinal (continuous), Rpart, by default, imputes a
value for the missing one. How do we fit the classification tree and tell Rpart not
to impute? That is, what command is used to turn off the imputation?
Also, if we do get true missing values, how does classification tree analysis in
Rpart treat missing when