monkeychump@hushmail.com
2003-Jul-23 00:09 UTC
[R] Boosting, bagging and bumping. Questions about R tools and predictions.
I'm interested in better understanding how the various methods that combine many classification trees differ in improving classification rates. I'm also interested in finding out what I can do in R and which methods will allow prediction. Can anybody point me to a citation or discussion?

Specifically, I want to classify remotely sensed imagery where the user extracts training data with known class membership. That training data (usually spectral bands and categorical data, e.g., soil type) is classified (using rpart for instance) and then the resulting tree is applied to the entire image. This results in a classified image that can then be checked for accuracy. Classification trees are increasingly used by the remote sensing folks, but it seems like finding optimal trees is an active area of research in computational statistics.

I've seen great claims made by baggers and boosters (and just what is bumping?) of increasing classification accuracy, but aside from TreeNet by Salford Systems I'm not aware of tools that can grow forests of trees that can then be used to make predictions.

Can anybody help?
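The single-tree workflow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, and the names `image_px`, `band1`, `band2`, `soil`, `truth` and `train` are hypothetical stand-ins for real image pixels and user-selected training pixels. Only `rpart()` and its `predict()` method come from the rpart package.

```r
library(rpart)

set.seed(1)

# Hypothetical stand-in for an image: one row per pixel, two spectral
# bands plus a categorical layer (soil type). All values are simulated.
n_px <- 2000
image_px <- data.frame(
  band1 = runif(n_px),
  band2 = runif(n_px),
  soil  = factor(sample(c("clay", "loam", "sand"), n_px, replace = TRUE))
)

# Simulated "true" class, used only to label training pixels and to
# check accuracy afterwards (an assumption of this sketch).
truth <- factor(ifelse(image_px$band1 > 0.5, "forest", "water"))

# User-selected training pixels: a small labelled subset of the image.
train_idx <- sample(n_px, 200)
train <- image_px[train_idx, ]
train$class <- truth[train_idx]

# Grow one classification tree on the training pixels.
fit <- rpart(class ~ band1 + band2 + soil, data = train, method = "class")

# Apply the tree to every pixel to obtain the classified image.
classified <- predict(fit, newdata = image_px, type = "class")

# Check accuracy on the pixels that were not used for training.
accuracy <- mean(classified[-train_idx] == truth[-train_idx])
print(accuracy)
```

With real imagery, the pixel table would be built from the raster bands rather than simulated, and accuracy would be checked against independent reference data.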
Ko-Kang Kevin Wang
2003-Jul-23 00:22 UTC
[R] Boosting, bagging and bumping. Questions about R tools and predictions.
Hi,

If you want to learn the theory of boosting and bagging, and other classification techniques, then you will want to refer to Hastie, Tibshirani and Friedman's "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" (Springer). It is the best book I have seen in these areas.

To apply them (or some of the techniques) in R, the book you want to look at is Venables and Ripley's MASS4 (Modern Applied Statistics with S, 4th edition).

Cheers,
Kevin
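As a rough illustration of two of the ideas in the subject line (a sketch for this thread, not code from either book): bagging fits a tree to each bootstrap resample of the training data and combines the trees by majority vote, whereas bumping fits trees to bootstrap resamples in the same way but keeps only the single tree with the lowest error on the original training data. The example below uses rpart on the built-in iris data. The choice of iris, the value of `B` and all variable names are assumptions made for the example.

```r
library(rpart)

set.seed(42)
data(iris)

# Hold out a test set so the two approaches can be compared.
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

B <- 25  # number of bootstrap resamples (arbitrary choice)

# Fit one tree per bootstrap resample of the training data.
trees <- lapply(seq_len(B), function(b) {
  boot <- train[sample(nrow(train), replace = TRUE), ]
  rpart(Species ~ ., data = boot, method = "class")
})

# Bagging: every tree votes on each test case and the most frequent
# class wins (ties go to the first class in table order).
votes <- sapply(trees, function(tr) {
  as.character(predict(tr, newdata = test, type = "class"))
})
bagged_pred <- apply(votes, 1, function(v) names(which.max(table(v))))

# Bumping: keep the single tree with the lowest error on the original
# training data and predict with that tree alone.
train_err <- sapply(trees, function(tr) {
  mean(predict(tr, newdata = train, type = "class") != train$Species)
})
bumped_tree <- trees[[which.min(train_err)]]
bumped_pred <- as.character(predict(bumped_tree, newdata = test, type = "class"))

bagging_acc <- mean(bagged_pred == test$Species)
bumping_acc <- mean(bumped_pred == test$Species)
print(c(bagging = bagging_acc, bumping = bumping_acc))
```

The practical difference is that bumping returns one ordinary tree that can still be plotted and interpreted, while bagging returns a committee of trees that must all be kept for prediction.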
Mulholland, Tom
2003-Jul-23 02:07 UTC
[R] Boosting, bagging and bumping. Questions about R tools and predictions.
http://www.boosting.org/publications.html

I found some of the papers on this page useful in understanding the concepts you refer to. I will leave it to the better informed members of the group to talk about the packages that relate to this field.

Tom Mulholland
Gavin Simpson
2003-Jul-23 09:28 UTC
[R] Boosting, bagging and bumping. Questions about R tools and predictions.
Take a look at the randomForest package on CRAN:

  randomForest: Breiman's random forest for classification and regression

  Classification and regression based on a forest of trees using random inputs.

  Version: 3.9-6
  Depends: R (>= 1.7.0)
  Author: Fortran original by Leo Breiman and Adele Cutler, R port by Andy Liaw and Matthew Wiener.
  Maintainer: Andy Liaw <andy_liaw at merck.com>

which has a predict function.

HTH

Gav
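A minimal sketch of growing a forest and then predicting with it, assuming the randomForest package is installed from CRAN. The built-in iris data, the train/test split and the variable names are assumptions made for the example. The `ntree` and `newdata` arguments are those of current releases of the package and may not match version 3.9-6 exactly.

```r
library(randomForest)

set.seed(123)
data(iris)

# Split into rows used to grow the forest and rows to predict.
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Grow a forest of 500 classification trees on the training rows.
rf <- randomForest(Species ~ ., data = train, ntree = 500)

# predict() applies the whole forest to new data and returns the
# majority-vote class for each row.
pred <- predict(rf, newdata = test)

# Cross-tabulate predicted against observed classes.
print(table(predicted = pred, observed = test$Species))
rf_acc <- mean(pred == test$Species)
print(rf_acc)
```

For image classification, `test` would be replaced by a data frame with one row per pixel and the same predictor columns as the training data.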