similar to: predict rpart newdata - introduce only values variables used in the tree

Displaying 20 results from an estimated 7000 matches similar to: "predict rpart newdata - introduce only values variables used in the tree"

2012 Feb 15
2
assign same legend colors than in the grouped data plot
Dear community, I've plotted data and coloured it by the factor variable v3. In the legend, I'd like to assign the same colors as those used for the factor (the factor has 5 levels). I've been trying this, but it doesn't work: plot(var1, var2, xlab = "var1", ylab = "var2", col = var3, bty = 'L') legend(locator(1),c("level 1 var3",
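A minimal sketch of one way to keep the legend colors in step with the plotted factor. The data frame `d` and its values are made up for illustration; only the names var1, var2 and var3 come from the post. The idea is to build one color per factor level and use the same vector for both the points and the legend.

```r
# Hypothetical data: var3 is a factor with 5 levels
set.seed(1)
lev <- paste("level", 1:5)
d <- data.frame(var1 = rnorm(50), var2 = rnorm(50),
                var3 = factor(sample(lev, 50, replace = TRUE), levels = lev))

# One color per level, in level order
cols <- seq_len(nlevels(d$var3))

# Points are colored by the integer code of their level
plot(d$var1, d$var2, xlab = "var1", ylab = "var2",
     col = cols[as.integer(d$var3)], bty = "L")

# The legend reuses the same levels and the same color vector
legend("topright", legend = levels(d$var3), col = cols, pch = 1)
```

Because `levels(d$var3)` and `cols` share the same order, the legend entries cannot drift out of step with the points.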
2005 May 04
1
Difference between "tree" and "rpart"
In the help for rpart it says, "This differs from the tree function mainly in its handling of surrogate variables." And it says that an rpart object is a superset of a tree object. Both cite Breiman et al. 1984. Both call external code which looks like Martian poetry to me. I've seen posts in the archives where BDR, and other knowledgeable folks, have said that rpart() is to be
2011 Nov 16
2
outlier identify in qqplot
Dear Community, I want to identify outliers in my data. I don't know how to use the identify command on the resulting plots. I've gone through the help files and adapted the mahalanobis example for my purpose: NormalMultivarianteComparefunc <- function(x) { Sx <- cov(x) D2 <- mahalanobis(x, colMeans(x), Sx) plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x),
2013 Jan 27
2
rpart
Hi, When I look at the summary of an rpart object run on my data, I get 7 nodes, but when I plot the rpart object, I get only 3 nodes. Should the number of nodes not match in the results of the 2 functions (summary and plot), or is it not always the same? Look forward to your reply, Carol -------------------------------------------- summary(rpart.res) Call: rpart(formula = mydata$class ~ ., data
2011 Jan 24
1
How to measure/rank variable importance when using rpart?
--- included message ---- Thus, my question is: *What common measures exist for ranking/measuring variable importance of participating variables in a CART model? And how can this be computed using R (for example, when using the rpart package)* ---end ---- Consider the following printout from rpart summary(rpart(time ~ age + ph.ecog + pat.karno, data=lung)) Node number 1: 228 observations,
2012 Aug 01
1
rpart package: why does predict.rpart require values for "unused" predictors?
After fitting and pruning an rpart model, it is often the case that one or more of the original predictors is not used by any of the splits of the final tree. It seems logical, therefore, that values for these "unused" predictors would not be needed for prediction. But when predict() is called on such models, all predictors seem to be required. Why is that, and can it be easily
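A hedged sketch of one workaround for this situation. It assumes that predict() rebuilds a model frame from the full formula, so every predictor name must be present in newdata even when no split uses it, and that missing values are passed through rather than dropped. The `kyphosis` data set shipped with rpart stands in for the poster's data, and all of its predictors are numeric, which is why `NA_real_` is used as the filler.

```r
library(rpart)

# Example fit; kyphosis ships with the rpart package
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Variables that appear in at least one split of the tree
used <- setdiff(unique(as.character(fit$frame$var)), "<leaf>")

# All predictors named in the formula, and those the tree never splits on
all_pred <- attr(fit$terms, "term.labels")
unused <- setdiff(all_pred, used)

# New data containing only the used variables
newdata <- kyphosis[1:5, used, drop = FALSE]

# Assumption: filling unused numeric predictors with NA satisfies predict()
for (v in unused) newdata[[v]] <- NA_real_

p_partial <- predict(fit, newdata, type = "class")
p_full    <- predict(fit, kyphosis[1:5, ], type = "class")
```

The filler columns must match the type used at fit time; a factor predictor would need an NA factor with the original levels. Surrogate splits that rely on an unused variable cannot help when a primary split variable is itself missing.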
2008 May 12
3
help with rpart
Hi, I am using rpart as part of my master's project. I am trying to print out the resulting model using the plot() function along with the text() function. I am having difficulties with labels being cut off. In the text() function, I am using the use.n=T option to get the number of people in each node, but on the lower and left parts of the plot, the numbers get cut off. Thanks! Linus [[alternative
2008 Jul 22
2
rpart$where and predict.rpart
Hello there. I have fitted an rpart model. > rpartModel <- rpart(y~., data=data.frame(y=y,x=x),method="class", ....) and can use rpart$where to find out the terminal node that each observation belongs to. Now, I have a set of new data and used predict.rpart, which seems to give only the predicted value with no information similar to rpart$where. May I know how
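A sketch of one commonly suggested trick for getting `where`-style information for new data. It relies on an assumption about how predict() behaves with type = "vector", namely that it returns the `yval` entry of the frame row each observation lands in; this is an implementation detail worth checking against your rpart version. The `kyphosis` data and the object names are illustrative only.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Copy the fit and overwrite the fitted values with the frame row index,
# so that predict() reports which row of fit$frame each case ends up in
fit_idx <- fit
fit_idx$frame$yval <- seq_len(nrow(fit_idx$frame))

# Here the training data doubles as "new" data so the result can be checked
node_row <- predict(fit_idx, newdata = kyphosis, type = "vector")

# Translate frame rows into node numbers, as printed by print(fit)
node_id <- as.integer(rownames(fit$frame))[node_row]
```

On the training data, `node_row` should agree with `fit$where`, which is a quick sanity check before applying the same call to genuinely new observations.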
2008 Jul 31
1
predict rpart: new data has new level
Hi. I use rpart to build a regression tree. Y is continuous. Now, I am trying to predict on a new set of data. In the new set of data, one of my x variables (called Incoterm, a factor) has a new level. I wonder why the error below appears, as the guide says "For factor predictors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and
2011 Jul 29
1
help with predict.rpart
data=read.table("http://statcourse.com/research/boston.csv", , sep=",", header = TRUE) library(rpart) fit=rpart (MV~ CRIM+ZN+INDUS+CHAS+NOX+RM+AGE+DIS+RAD+TAX+ PT+B+LSTAT) predict(fit,data[4,]) plot only reveals part of the tree, in contrast to the results one obtains with CART or C5 -------- Original Message -------- Subject: Re: [R] help with rpart From: Sarah
2008 Feb 26
1
predict.rpart question
Dear All, I have a question regarding predict.rpart. I use rpart to build classification and regression trees, and I deal with data with a relatively large number of input variables (predictors). For example, I build an rpart model like this rpartModel <- rpart(Y ~ X, method="class", minsplit =1, minbucket=nMinBucket,cp=nCp); and get predictors used in building the model like
2005 Oct 08
1
Rpart -- using predict() when missing data is present?
I am doing > library(rpart) > m <- rpart("y ~ x", D[insample,]) > D[outsample,] y x 8 0.78391922 0.579025591 9 0.06629211 NA 10 NA 0.001593063 > p <- predict(m, newdata=D[9,]) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid result from na.action How do I persuade him to give me NA
2011 Mar 04
4
cv.lm syntax error
Dear all, I've tried a multiple regression, and now I want to try a cross-validation. I obtain this error (it must be something related to df) that I don't understand; any help would be appreciated. cv.lm(df= dat, lm2.52f, m=3) Error in `[.data.frame`(df, , ynam) : undefined columns selected lm2.52f is my lm object, and dat is a dataframe containing the variables involved in the lm. I tried CVlm
2012 Sep 21
1
prune in rpart: choose number terminal nodes
Dear community, I have an rpart object, and I know the CP I want. I'd like to know if it's also possible to fix the number of terminal nodes I want. Thanks in advance,
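A minimal sketch of one way to approach this, assuming the tree is chosen from the nested sequence in the fitted object's `cptable`, where a subtree with `nsplit` splits has `nsplit + 1` terminal nodes. The `kyphosis` data, the low `cp` value and the name `target_leaves` are illustrative choices, not part of the original question.

```r
library(rpart)

# Grow a fairly large tree so the cp table has several candidate sizes
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, cp = 0.001)

target_leaves <- 3            # hypothetical desired number of terminal nodes
tab <- fit$cptable

# Rows whose subtree has at most the requested number of terminal nodes
ok <- which(tab[, "nsplit"] + 1 <= target_leaves)

# Take the largest such subtree and prune to its CP value
row <- max(ok)
pruned <- prune(fit, cp = tab[row, "CP"])

# Count terminal nodes in the pruned tree
n_leaves <- sum(pruned$frame$var == "<leaf>")
```

Not every tree size appears in the cost-complexity sequence, so the result can have fewer terminal nodes than requested; the sketch picks the closest size that does not exceed the target.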
2009 May 22
1
bug in rpart?
Greetings, I checked the Indian diabetes data again and got one tree for the data with reordered columns and another tree for the original data. I compared these two trees: the split points are exactly the same, but the fitted classes are not the same for some cases, and the misclassification errors are different too. I know how CART deals with ties --- even we are using the
2012 Nov 01
0
oblique.tree : the predict function asserts the dependent variable to be included in "newdata"
Dear R community, I have recently discovered the package oblique.tree and I must admit that it was a nice surprise for me, since I have actually made my own version of a kind of a classifier which uses the idea of oblique splits (splits by means of hyperplanes). So I am now interested in comparing these two classifiers. But what I do not seem to understand is why the function
2010 Nov 18
1
predict() an rpart() model: how to ignore missing levels in a factor
I am using an algorithm to split my data set into two random sections repeatedly, construct a model using rpart() on one, test on the other, and average out the results. One of my variables is a factor (crop) where each crop type has a code. Some crop types occur infrequently or singly. When the data set is randomly split, it may be that the first data set has a crop type which is not present in
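One possible workaround, sketched under assumptions: recode each factor in the test set against the levels stored with the fitted model, so that unseen levels become NA before predict() is called, and let rpart's missing-value handling deal with them. The simulated data and the names `train`, `new` and `crop` are hypothetical; `attr(fit, "xlevels")` is where rpart keeps the factor levels seen during fitting.

```r
library(rpart)

# Simulated training data with a factor predictor
set.seed(42)
train <- data.frame(crop = factor(sample(c("a", "b", "c"), 100, replace = TRUE)),
                    x = runif(100))
train$y <- ifelse(train$crop == "a", 1, 0) + train$x + rnorm(100, sd = 0.1)

fit <- rpart(y ~ crop + x, data = train)

# Test data containing a level ("z") that was never seen during fitting
new <- data.frame(crop = factor(c("a", "z")), x = c(0.2, 0.8))

# Recode factors against the training levels; unseen levels turn into NA
xlev <- attr(fit, "xlevels")
for (v in names(xlev)) {
  new[[v]] <- factor(as.character(new[[v]]), levels = xlev[[v]])
}

pred <- predict(fit, new)
```

Rows with a recoded NA are then routed by surrogate splits where available, or stop at the last node they can reach, so their predictions are coarser than those for rows with known levels.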
1999 Dec 23
1
rpart on Alpha under OSF
Running on an Alpha machine which reports (uname -a) OSF1 bsdx01.bs.ehu.es V4.0 878 alpha and using the binary distribution put together by Albrecht Gebhardt (in http://cran.at.r-project.org/bin/osf/osf4.0/tar/alpha_ev5/) I obtain core dumps whenever I try to use package rpart. I have R REMOVE'd the rpart package, downloaded the source rpart_1.0-7.tar from CRAN and
2012 May 03
1
NA's when subset in a dataframe
Dear community, I'm having this silly problem. I have a linear model. After fitting it, I wanted to know which data had studentized residuals larger than 3, so I tried this: d1 <- cooks.distance(lmmodel) r <- sqrt(abs(rstandard(lmmodel))) rstu <- abs(rstudent(lmmodel)) a <- cbind( mydata, d1, r,rstu) alargerthan3 <- a[rstu >3, ] And suddenly a[rstu >3, ] has
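A small illustration of one likely cause, offered as an assumption since the post is cut off: when the logical index contains NA, data-frame subsetting returns an all-NA row for each NA. Wrapping the condition in which() keeps only the rows where the comparison is TRUE. The data frame below is invented for the example.

```r
# Hypothetical data: one studentized residual is missing
a <- data.frame(id = 1:5, rstu = c(0.5, 3.2, NA, 4.1, 1.0))

# NA > 3 is NA, and an NA index produces a row filled with NA
with_na <- a[a$rstu > 3, ]

# which() returns only the positions where the condition is TRUE
clean <- a[which(a$rstu > 3), ]
```

An equivalent alternative is `subset(a, rstu > 3)`, which also drops rows where the condition evaluates to NA.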
2001 Aug 02
1
Missing value in Rpart
Hi all, Our understanding of how classification trees in Rpart treat missing values is that if the variable is ordinal (continuous), Rpart, by default, imputes a value for the missing one. How do we fit the classification tree and tell Rpart not to impute? That is, what command is used to turn off the imputation? Also, if we do get true missing values, how does classification tree analysis in Rpart treat missing values when