thr3ads.net - similar to: "predict.rpart and large datasets"

Displaying 20 results from an estimated 10000 matches similar to: "predict.rpart and large datasets"

Memory problems with large dataset in rpart

2005 Oct 18

Dear helpers, I am a Dutch student from the Erasmus University. For my Bachelor thesis I have written a script in R using boosting by means of classification and regression trees. This script uses the predefined function rpart. My input file consists of about 4000 vectors each having 2210 dimensions. In the third iteration R complains of a lack of memory, although in each iteration

rpart$where and predict.rpart

2008 Jul 22

Hello there. I have fitted an rpart model. > rpartModel <- rpart(y~., data=data.frame(y=y,x=x),method="class", ....) and can use rpart$where to find out the terminal node that each observation belongs to. Now, I have a set of new data and used predict.rpart, which seems to give only the predicted value with no information similar to rpart$where. May I know how

help with predict.rpart

2011 Jul 29

data=read.table("http://statcourse.com/research/boston.csv", , sep=",", header = TRUE) library(rpart) fit=rpart (MV~ CRIM+ZN+INDUS+CHAS+NOX+RM+AGE+DIS+RAD+TAX+ PT+B+LSTAT) predict(fit,data[4,]) plot only reveals part of the tree in contrast to the results one obtains with CART or C5 -------- Original Message -------- Subject: Re: [R] help with rpart From: Sarah

predict.rpart question

2008 Feb 26

Dear All, I have a question regarding predict.rpart. I use rpart to build classification and regression trees, and I deal with data with a relatively large number of input variables (predictors). For example, I build an rpart model like this rpartModel <- rpart(Y ~ X, method="class", minsplit =1, minbucket=nMinBucket,cp=nCp); and get predictors used in building the model like

Question about rpart decision trees (being used to predict customer churn)

2009 Jul 26

Hi, I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating

Rpart -- using predict() when missing data is present?

2005 Oct 08

I am doing > library(rpart) > m <- rpart("y ~ x", D[insample,]) > D[outsample,] y x 8 0.78391922 0.579025591 9 0.06629211 NA 10 NA 0.001593063 > p <- predict(m, newdata=D[9,]) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid result from na.action How do I persuade him to give me NA

rpart.predict error--subscript out of bounds

2006 Apr 07

Hi, I am using rpart to do leave-one-out cross-validation, but ran into a problem. Data is a data frame: the first column is the subject id, the second column is the group id, and the remaining columns are numerical variables, > Data[1:5,1:10] sub.id group.id X3262.345 X3277.402 X3369.036 X3439.895 X3886.935 X3939.054 X3953.777 X3970.352 1 32613 HAM_TSP 417.7082 430.4895 619.4776 720.8246

Multiple return values / bug in rpart?

2013 Aug 12

In the recommended package rpart (version 4.1-1), the file rpartpl.R contains the following line: return(x = x[!erase], y = y[!erase]) AFAIK, returning multiple values like this is not valid R. Is that correct? I can't seem to make it work in my own code. It doesn't appear that rpartpl.R is used anywhere, so this may have never caused an issue. But it's tripping up my R compiler.
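
One possible rewrite of the quoted line, as a minimal sketch: base R's `return()` accepts a single value, so several results are usually bundled into one named list. The function name `keep_unerased` and the sample vectors are hypothetical, chosen only for illustration; this is not necessarily how the line was handled in rpart itself.

```r
# Hypothetical helper (not from rpart): bundle both vectors into one
# named list, which is then the function's single return value.
keep_unerased <- function(x, y, erase) {
  list(x = x[!erase], y = y[!erase])
}

# Illustrative input: drop the second element of each vector.
res <- keep_unerased(x = c(1, 2, 3),
                     y = c(10, 20, 30),
                     erase = c(FALSE, TRUE, FALSE))
res$x
res$y
```

Callers then read the components by name (`res$x`, `res$y`) instead of relying on multiple return values.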

rpart - predict terminal nodes for new observations

2012 May 15

Dear useRs: Is there a way I could predict the terminal node associated with a new data entry in an rpart environment? In the example below, if I had a new data entry with an AM of 5, I would like to link it to the terminal node 2. My searches led to http://tolstoy.newcastle.edu.au/R/e4/help/08/07/17702.html but I do not seem to be able to operationalize Professor Ripley's suggestions. Many

rpart package: why does predict.rpart require values for "unused" predictors?

2012 Aug 01

After fitting and pruning an rpart model, it is often the case that one or more of the original predictors is not used by any of the splits of the final tree. It seems logical, therefore, that values for these "unused" predictors would not be needed for prediction. But when predict() is called on such models, all predictors seem to be required. Why is that, and can it be easily

predict rpart: new data has new level

2008 Jul 31

Hi. I use rpart to build a regression tree. Y is continuous. Now, I try to predict on a new set of data. In the new set of data, one of my x variables (called Incoterm, a factor) has a new level. I wonder why the error below appears as the guide says "For factor predictors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and

predict.rpart help

2011 Mar 23

Hi Everyone, Is there a way to get predict.rpart() to return the nodes reached by the new examples in addition to the predicted probabilities it already returns? In other words, I would like to know the leaf node in the tree object that each new example data drops down to. Thanks in advance for your help. Osei
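
A commonly used workaround can be sketched as follows, under stated assumptions. For a regression tree, `predict()` with `type = "vector"` looks up `frame$yval` for the node each row lands in, so a copy of the fitted object whose `yval` column holds node numbers makes `predict()` return those numbers instead of fitted values. The built-in `cars` data and the names `fit`, `node_fit` and `new_obs` are illustrative choices, not taken from the post.

```r
library(rpart)

# Illustrative regression tree on a built-in dataset.
fit <- rpart(dist ~ speed, data = cars)

# Copy the model and overwrite the per-node predictions with the node
# numbers, which rpart stores as the row names of the frame.
node_fit <- fit
node_fit$frame$yval <- as.numeric(rownames(node_fit$frame))

# Hypothetical new observations; predict() now reports the node reached.
new_obs <- data.frame(speed = c(5, 12, 24))
new_nodes <- predict(node_fit, newdata = new_obs, type = "vector")

# Sanity check on the training data: fit$where indexes rows of the
# frame, so this should agree with what predict() reports.
train_nodes <- as.numeric(rownames(fit$frame))[fit$where]
train_nodes_pred <- predict(node_fit, newdata = cars, type = "vector")
```

The original `fit` is left untouched, so ordinary predictions remain available alongside the node numbers.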

Large file size while persisting rpart model to disk

2009 Feb 03

I am using rpart to build a model for later predictions. To save the prediction across restarts and share the data across nodes I have been using "save" to persist the result of rpart to a file and "load" it later. But the saved size was becoming unusually large (even with binary, compressed mode). The size was also proportional to the amount of data that was used to create the

predict() an rpart() model: how to ignore missing levels in a factor

2010 Nov 18

I am using an algorithm to split my data set into two random sections repeatedly and construct a model using rpart() on one, test on the other and average out the results. One of my variables is a factor (crop) where each crop type has a code. Some crop types occur infrequently or singly. When the data set is randomly split, it may be that the first data set has a crop type which is not present in
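
One way to handle levels that appear in only one half of a split, sketched under assumptions: re-code the factor in the test half using the training levels, so that unseen levels become `NA`, which `predict()` on an rpart model accepts by default. The data frames `train` and `test`, the crop names and the simulated yields are all hypothetical, and treating an unseen level as missing is a modelling choice rather than the only option.

```r
library(rpart)

set.seed(1)
# Hypothetical training half with three crop types.
train <- data.frame(
  crop  = factor(rep(c("wheat", "maize", "rice"), each = 20)),
  yield = c(rnorm(20, mean = 5), rnorm(20, mean = 8), rnorm(20, mean = 11))
)
# Hypothetical test half containing a crop ("barley") never seen in training.
test <- data.frame(crop = factor(c("wheat", "barley")))

fit <- rpart(yield ~ crop, data = train)

# Re-code with the training levels: values outside them become NA.
test$crop <- factor(as.character(test$crop), levels = levels(train$crop))

# The NA row still receives a prediction instead of raising an error.
pred <- predict(fit, newdata = test)
```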

Predicting classification error from rpart

2005 Oct 14

Hi, I think I'm missing something very obvious, but I am missing it, so I would be very grateful for help. I'm using rpart to analyse data on skull base morphology, essentially predicting sex from one or several skull base measurements. The sex of the people whose skulls are being studied is known, and lives as a factor (M,F) in the data. I want to get back predictions of gender, and

Puzzled at rpart prediction

2005 Aug 04

I'm in a situation where I say: > predict(m.rpart, newdata=D[N1+t,]) 0 1 173 0.8 0.2 which I interpret as meaning: an 80% chance of "0" and a 20% chance of "1". Okay. This is consistent with: > predict(m.rpart, newdata=D[N1+t,], type="class") [1] 0 Levels: 0 1 But I'm puzzled at the following. If I say: > predict(m.rpart,
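
The relationship between the two calls quoted above can be illustrated with a small sketch. It uses the `kyphosis` data shipped with rpart as a stand-in for the poster's `D`, and assumes the default setup with no loss matrix, in which the `type = "class"` label should be the column holding the largest value in the `type = "prob"` matrix. The object names are illustrative.

```r
library(rpart)

# Illustrative classification tree on data bundled with rpart.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class")

new_rows <- kyphosis[1:5, ]

# One column of class probabilities per level of the response.
prob <- predict(fit, newdata = new_rows, type = "prob")

# A factor of predicted labels for the same rows.
cls <- predict(fit, newdata = new_rows, type = "class")

# Label of the most probable class in each row, for comparison with cls.
top <- colnames(prob)[max.col(prob, ties.method = "first")]
```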

Why is rpart() so slow?

2004 Mar 19

I've had rpart running on a problem now for a couple of *days*, but I'd expect a decision tree builder to run in minutes if not seconds. Why is rpart slow? Is there anything I can do to make it quicker?

Xenial rpart package on CRAN built with wrong R version?

2018 Aug 14

Hello, I just upgraded my Ubuntu Xenial system to R 3.5.1 (from 3.4.?) by changing the sources.list entry and doing an "apt-get dist-upgrade". Everything works except loading the rpart package in R: > library(rpart) Error: package or namespace load failed for ‘rpart’: package ‘rpart’ was installed by an R version with different internals; it needs to be reinstalled for use with

question regarding to The tree Package for R

2002 Feb 21

I have a problem with running the tree package (Dec. 8, 2001) for R. The problem is that it will only give me 5 or 6 terminal nodes and then stops, while using Splus's tree on the same data with the same specification gives me hundreds of nodes. Here's a little more background info: R-1.4.1 Solaris 5.7 rpart (most recent version) tree (..) Splus 6.0 Solaris 5.7 tree

rpart - the xval argument in rpart.control and in xpred.rpart

2009 Jun 09

Dear R users, I'm working with the rpart package and want to evaluate the performance of user defined split functions. I have some problems in understanding the meaning of the xval argument in the two functions rpart.control and xpred.rpart. In the former it is defined as the number of cross-validations while in the latter it is defined as the number of cross-validation groups. If I am