Hi,
I am trying to model credit risk data using decision trees. Because
defaulters are far fewer than non-defaulters (around 10% of cases), we
have a class imbalance problem. As a result, the confusion matrix shows
a large number of misclassified non-defaulters. Classifying a defaulter
as a non-defaulter is the more expensive error. How does one pass this
information (a penalty matrix) to the rpart function?
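
To make the question concrete, here is a minimal sketch of the kind of call I have in mind. It assumes the penalty matrix goes in through the loss component of rpart's parms argument for method = "class", with rows as the true class and columns as the predicted class, in factor level order. The data frame, the variable names (income, debt_ratio, default) and the 5:1 cost ratio are invented for illustration only and are not my real data.

```r
library(rpart)

# Hypothetical toy data, simulated only so the sketch runs end to end.
set.seed(1)
n <- 1000
credit <- data.frame(
  income     = rnorm(n, mean = 50, sd = 15),
  debt_ratio = runif(n)
)
p <- plogis(-3 + 3 * credit$debt_ratio - 0.01 * (credit$income - 50))
credit$default <- factor(ifelse(runif(n) < p, "yes", "no"),
                         levels = c("no", "yes"))

# Penalty (loss) matrix: rows = true class, columns = predicted class,
# both in the order of the factor levels ("no", "yes").
# The diagonal must be zero. The 5:1 ratio is an assumed cost, meaning
# that calling a true defaulter "no" costs five times as much as
# calling a true non-defaulter "yes".
loss <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE)

fit <- rpart(default ~ income + debt_ratio,
             data   = credit,
             method = "class",
             parms  = list(loss = loss))

# Confusion matrix on the training data.
pred <- predict(fit, credit, type = "class")
table(actual = credit$default, predicted = pred)
```

If this is roughly the right mechanism, I would expect raising the 5 to push the tree towards flagging more cases as defaulters, at the price of more false alarms among non-defaulters.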
Thanks and regards,
Dr S Muralidharan
Chief Scientist,
Tata Consultancy Services
17, Cathedral Road,
Chennai - 600 086,Tamil Nadu
India
Ph:- 91 44 66164513
Buzz:- 444 4513
Mailto: muralidharan.somasundaram@tcs.com
Website: http://www.tcs.com