Hi,

First, thanks to those who helped me see my gross misunderstanding of randomForest. I worked through a bagging tutorial and now understand the "many tree" approach. However, it is not what I want to do! My bagged errors are acceptable, but I need to use the actual tree, so I need a single-tree application.

I am using rpart for a classification tree but am interested in a more unbiased estimator of the error in my tree. I lack sufficient data to both train and test the tree, and I'm hoping to bootstrap, or rather jackknife, an error estimate.

I do not think the rpart object can be passed to the jackknife function in the bootstrap package, but can I do something as simple as:

for (i in 1:number of samples) {
    remove i from the data
    run the tree
    compare sample[i] to the tree using predict
    create an error matrix
}

This would give me a confusion matrix for data not included in the tree's construction.

Am I being obtuse again?

Thanks, CM
On Wed, 16 Apr 2003 10:28:08 -0700 chumpmonkey at hushmail.com wrote:

> I am using rpart for a classification tree but am interested in a more
> unbiased estimator of error in my tree. I lack sufficient data to train
> and test the tree and I'm hoping to bootstrap, or rather jackknife, an
> error estimate.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

You might look at the validate.tree function in the Design library (http://hesweb1.med.virginia.edu/biostat/s/Design.html), but better validated predictive accuracy would be obtained by approximating the predictions from the randomForest by a single (moderately large) tree. You can use rpart to develop such a tree, stopping when, for example, the R-square is 0.9 or 0.95.

---
Frank E Harrell Jr
Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem.
Dept. of Health Evaluation Sciences
U. Virginia School of Medicine
http://hesweb1.med.virginia.edu/biostat
That's essentially leave-one-out cross-validation. In addition to Frank's suggestion, you might want to check out the errorest() function in the ipred package. You can do k-fold CV or the .632+ bootstrap.

HTH,
Andy

> -----Original Message-----
> From: chumpmonkey at hushmail.com [mailto:chumpmonkey at hushmail.com]
> Sent: Wednesday, April 16, 2003 1:28 PM
> To: R-help at stat.math.ethz.ch
> Subject: [R] Jackknife and rpart
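
For reference, the loop sketched in the original post (the leave-one-out cross-validation that Andy describes) might look roughly like the following in R. This is only a minimal sketch under stated assumptions: the kyphosis data set that ships with rpart stands in for the poster's data, and the formula and the names pred, confusion and loo_error are illustrative, not taken from the thread.

```r
library(rpart)

# Assumption: kyphosis (bundled with rpart) stands in for the poster's data.
data(kyphosis)
n <- nrow(kyphosis)

# Held-out predictions, one per observation, with the same levels as the response.
pred <- factor(rep(NA_character_, n), levels = levels(kyphosis$Kyphosis))

for (i in seq_len(n)) {
  # Fit the tree on every row except row i.
  fit <- rpart(Kyphosis ~ Age + Number + Start,
               data = kyphosis[-i, ], method = "class")
  # Predict the class of the single left-out row.
  pred[i] <- as.character(
    predict(fit, newdata = kyphosis[i, , drop = FALSE], type = "class")
  )
}

# Confusion matrix built only from predictions on data the tree did not see.
confusion <- table(observed = kyphosis$Kyphosis, predicted = pred)
loo_error <- mean(pred != kyphosis$Kyphosis)

print(confusion)
print(loo_error)
```

Each pass refits the whole tree, so the resulting error describes the tree-building procedure rather than any one fitted tree; the tree actually used would still be the one fitted to all of the data.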