Hi, I am trying to model credit risk data using decision trees. Since the number of defaulters is less compared to non-defaulters (defaulters around 10%), we have the class imbalance problem. Consequently, the confusion matrix shows that the number of misclassified non-defaulters is large. Classifying a defaulter as non-defaulter is more expensive. How does one include this information (penalty matrix) into rpart function? Thanks and regards, Dr S Muralidharan Chief Scientist, Tata Consultancy Services 17, Cathedral Road, Chennai - 600 086,Tamil Nadu India Ph:- 91 44 66164513 Buzz:- 444 4513 Mailto: muralidharan.somasundaram@tcs.com Website: http://www.tcs.com ____________________________________________ Experience certainty. IT Services Business Solutions Outsourcing ____________________________________________ =====-----=====-----====Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you [[alternative HTML version deleted]]