Hello all, I am currently working with rpart to classify vegetation types by spectral characteristics, and I am getting poor classifications because some vegetation types have only 15 observations, while others have over 100. I have tried supplying prior weights to the dataset, but this does not improve the classification much. Could anyone offer some hints about how to improve a classification for a badly unbalanced dataset? Thank you, Helen Mills Poulos
Check this thread: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/40898.html
Dear Helen, You may want to have a look at http://www.togaware.com/datamining/survivor/Predicting_Fraud.html Greets, Diego Kuonen (Statoo Consulting, http://www.statoo.info)
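For anyone following the thread, here is a minimal sketch of what supplying priors to rpart can look like. It is not Helen's data or code: the data frame `dat`, the columns `band1` and `band2`, and the class names are made up for illustration, and equal priors are only one possible choice. Whether they help on real spectral data would need to be checked with cross-validation rather than the training-set tables shown here.

```r
library(rpart)

set.seed(1)

# Hypothetical stand-in for spectral data: two bands, three vegetation
# classes, one of them much rarer than the others (15 vs. 100+ rows).
n <- c(common_a = 120, common_b = 100, rare = 15)
veg <- factor(rep(names(n), times = n), levels = names(n))
centers <- c(common_a = 0, common_b = 2, rare = 1)
dat <- data.frame(
  veg   = veg,
  band1 = rnorm(sum(n), mean = centers[as.character(veg)], sd = 1),
  band2 = rnorm(sum(n), mean = centers[as.character(veg)], sd = 1)
)

# Default fit: priors are taken from the observed class frequencies,
# so the rare class carries little weight when splits are chosen.
fit_default <- rpart(veg ~ band1 + band2, data = dat, method = "class")

# Equal priors: every class is treated as equally likely a priori.
# The vector must follow the factor-level order and sum to 1.
k <- nlevels(dat$veg)
fit_equal <- rpart(veg ~ band1 + band2, data = dat, method = "class",
                   parms = list(prior = rep(1 / k, k)))

# Per-class confusion tables on the training data, for a rough comparison.
conf_default <- table(observed  = dat$veg,
                      predicted = predict(fit_default, type = "class"))
conf_equal   <- table(observed  = dat$veg,
                      predicted = predict(fit_equal, type = "class"))
print(conf_default)
print(conf_equal)
```

Comparing the row for the rare class in the two tables shows how the priors shift the errors between classes. The `parms` list also accepts a `loss` matrix, which is another way to make misclassifying a rare class more costly.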