similar to: rpart tuning question

2010 Mar 06
3
r code to generate interaction columns
Hi, is there a way to take a dataset, extract its numeric columns, and create interaction columns from them automatically? For example, there are 5 columns of data: A, B, C, D, E. C, D, and E are numeric. Can someone provide code to automatically create more columns such as: 1) C*D, C*E, C*D*E, (C+E)/(D+.01) (to avoid divide by zero), (D+E)/(C+.01) (to avoid divide by zero), (C+D)/(E+.01) (to avoid
2008 Oct 19
3
pairs plots in R
Hi, is there a way to take a data frame with 100+ columns and a large data set and do efficient exploratory analysis in R with pairs? I find that using pairs on the whole matrix is slow and the resulting matrix is tiny. Also, the variable of interest for me is a binary variable, Y or N. Is there an efficient way to graphically view many variable relationships that does not look teeny? I could do
2008 Oct 02
2
aggregate empty row for pretty appearance also subtotal if possible
Hi, To pretty-print aggregates by various dimensions, I needed to add an empty row to the output of aggregate. For example: d<-(aggregate(data[,cbind("x")], by=list(data$group1,data$group2), sum)) Group.1 Group.2 x 1 A N 3 2 A Y 2 3 B N 420164905 Is there a way to add an empty
2008 Sep 29
1
persistence of model in R to a file
Hi, Is there a way to save R models (glm, lm, rpart, etc.) in a file that can be read in later? I noticed models take up space. Saving them off and removing them from memory seems like it would be useful. Also, why do the models keep a copy of all columns in the original data set, even those columns that are not in the model? E.g. if I build a model on columns A, B even though column C
2008 Oct 27
0
Displaying number of Y/N affected by tree in rule form RE: R question/request on rules from rpart
Hi Prof. Williams, thanks for your suggestion. The updated code is below. It turns out it was a matter of displaying the second column in yval to get the number of N and subtracting it from the n column in the frame to get the number of Y remaining in a binary example. Once this is added, the function returns the rules along with the Y and N counts affected by the resulting rule. I am ccing
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members, When building a CART model (specifically classification tree) using rpart, it is sometimes obvious that there are variables (X's) that are meaningful for predicting some of the outcome (y) variables - while other predictors are relevant for other outcome variables (y's only). *How can it be estimated, which explanatory variable is "used" for which of
2008 Dec 17
1
pruning trees using rpart
Hi, I am using the packages tree and rpart to build a classification tree to predict a 0/1 outcome. The package rpart has the advantage that the function plotcp gives a visual representation of the cross-validation results with a horizontal line indicating the 1 standard error rule, i.e. the recommendation to select the most parsimonious model (the smallest tree) whose error is not more than one
2006 Mar 07
3
how to use the rpart function?
Hi all, What parameter do I normally change in the rpart function? How do I set the "cp" option? Is there a way to read off error rate directly from the "rpart" function for training data; I imagine for testing data I have to apply a "predict", but for training data I guess the error count would be somewhere existing once the "rpart" function is
2001 Aug 02
1
Missing value in Rpart
Hi, all. Our understanding of how classification trees in Rpart treat missing values is that if the variable is ordinal (continuous), Rpart, by default, imputes a value for missing. How do we do the classification tree and tell Rpart not to impute? That is, what command is used to turn off the imputation? Also, if we do get true missing, how does classification tree analysis in Rpart treat missing when
2008 Dec 23
1
sorting regression coefficients by p-value
Hi, Is there a way to get/extract a matrix of regression variable names, coefficients, and p values (for lm and glm) which can be sorted by p value? Thanks, Dhruv
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts, I am new to R and am using a decision tree model to get segmentation rules. A) Using behavioural data (attributes defining customer behaviour, for example balances, number of accounts, etc.) 1. Clustering: Cluster behavioural data to a suitable number of clusters 2. Decision Tree: Using rpart classification tree for generating rules for segmentation using cluster number (cluster id) as target
2005 Oct 14
1
Predicting classification error from rpart
Hi, I think I'm missing something very obvious, but I am missing it, so I would be very grateful for help. I'm using rpart to analyse data on skull base morphology, essentially predicting sex from one or several skull base measurements. The sex of the people whose skulls are being studied is known, and lives as a factor (M,F) in the data. I want to get back predictions of gender, and
2007 Feb 27
3
rpart minimum sample size
Is there an optimal / minimum sample size for attempting to construct a classification tree using /rpart/? I have 27 seagrass disturbance sites (boat groundings) that have been monitored for a number of years. The monitoring protocol for each site is identical. From the monitoring data, I am able to determine the level of recovery that each site has experienced. Recovery is our
2011 Apr 08
4
Rpart decision tree
Dear useRs: I am trying to plot an rpart object but cannot get a nice tree structure plot. I am using plot.rpart and text.rpart (please see below) but the branches that connect the nodes overlap the text in the ellipses and rectangles. Is there a way to get a clean, nice tree plot (as in the Rpart Mayo report)? I work under Windows and use R 2.11.1 with rpart version 3.1-46. Thank you. Tudor ...
2004 Jun 11
1
Error when I try to build / plot a tree using rpart()
Hi, I am using the rpart package to build a classification tree. I did manage to build a tree with data on a previous project. However, when attempting to build a tree on a project I am working on, I seem to be getting the error shown below: > nhg3.rp <- rpart(profitresp ~., nhg3, method="class") > plot(nhg3.rp, branch=0.4, uniform=T); text(nhg3.rp, digits=3) Error in
2003 Feb 12
1
rpart v. lda classification.
I've been groping my way through a classification/discrimination problem, from a consulting client. There are 26 observations, with 4 possible categories and 24 (!!!) potential predictor variables. I tried using lda() on the first 7 predictor variables and got 24 of the 26 observations correctly classified. (Training and testing both on the complete data set --- just to get started.) I
2009 May 12
1
questions on rpart (tree changes when rearrange the order of covariates?!)
Greetings, I am using rpart for classification with "class" method. The test data is the Indian diabetes data from package mlbench. I fitted a classification tree firstly using the original data, and then exchanged the order of Body mass and Plasma glucose which are the strongest/important variables in the growing phase. The second tree is a little different from the first one. The
2009 Jul 26
3
Question about rpart decision trees (being used to predict customer churn)
Hi, I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating
2011 Dec 31
1
Cross-validation error with tune and with rpart
Hello list, I'm trying to generate classifiers for a certain task using several methods, one of them being decision trees. The doubts come when I want to estimate the cross-validation error of the generated tree: tree <- rpart(y~., data=data.frame(xsel, y), cp=0.00001) ptree <- prune(tree, cp=tree$cptable[which.min(tree$cptable[,"xerror"]),"CP"]) ptree$cptable
2003 Apr 10
1
Classification problem - rpart
I am performing a binary classification using a classification tree. Ironically, the data themselves are 2483 tree (real biological ones) locations as described by a suite of environmental variables (slope, soil moisture, radiation load, etc). I want to separate them from an equal number of random points. Doing EDA on the data shows that there is a substantial difference between the tree and random