I have a question about data mining. I have a dataset of 70 instances with 14 features, belonging to 4 classes. Because there are too few instances per class to obtain good accuracy with some classifiers (SVM, ANN, kNN), I need to oversample each class.

I have heard that there is a method to do this. It generates new instances as follows:

new_instance <- original_instance + U(epsilon)

where U(epsilon) is a uniform random number in the range [-epsilon, epsilon]. This number is added to each feature of an original instance to obtain a new instance, without modifying the original class label.

Has anybody used this method to oversample their data? Does anybody have more information about it?

Thanks in advance!
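The rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper name `jitter_oversample` and its arguments `n_copies` and `epsilon` are hypothetical, the toy data merely mimic the 70 x 14 shape mentioned in the post, and an independent uniform draw is assumed for every feature value (the post does not say whether one draw is shared across features).

```r
# Hypothetical helper: jitter-based oversampling as described in the post.
# Assumption: an independent U(-epsilon, epsilon) draw is added to each feature value.
jitter_oversample <- function(X, y, n_copies = 1, epsilon = 0.05) {
  X <- as.matrix(X)
  stopifnot(is.numeric(X), nrow(X) == length(y), n_copies >= 1, epsilon >= 0)
  n <- nrow(X)
  # Row indices of the originals, repeated n_copies times
  idx <- rep(seq_len(n), times = n_copies)
  # One uniform draw per cell of the replicated matrix
  noise <- matrix(runif(length(idx) * ncol(X), min = -epsilon, max = epsilon),
                  nrow = length(idx), ncol = ncol(X))
  X_new <- X[idx, , drop = FALSE] + noise
  # Originals first, then the jittered copies; class labels are copied unchanged
  list(X = rbind(X, X_new), y = y[c(seq_len(n), idx)])
}

set.seed(1)
X <- matrix(rnorm(70 * 14), nrow = 70, ncol = 14)  # toy stand-in for the real data
y <- rep(1:4, length.out = 70)                     # toy class labels
aug <- jitter_oversample(X, y, n_copies = 2, epsilon = 0.05)
dim(aug$X)    # 70 originals plus 2 * 70 jittered copies
table(aug$y)  # class proportions are preserved
```

Because `epsilon` is expressed in the units of the features, it probably makes sense to standardise the features first (for example with `scale()`), and to generate synthetic instances from the training data only, so that copies of a test instance do not leak into training.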
Prof. Dr. Matthias Kohl
2013-Mar-31 09:18 UTC
[R] Creating new instances from original ones
See the function SMOTE in the package DMwR.

hth
Matthias

On 31.03.2013 10:46, Nicolás Sánchez wrote:
> I have a question about data mining. [...]

--
Prof. Dr. Matthias Kohl
www.stamats.de
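For readers unfamiliar with SMOTE (Synthetic Minority Over-sampling Technique): instead of adding uniform noise, it creates a synthetic instance by interpolating between an instance and one of its k nearest neighbours of the same class. The base R sketch below illustrates only that core idea. It is not the DMwR implementation and does not reproduce that function's interface; the helper name `smote_sketch` and its arguments `n_new_per_class` and `k` are hypothetical, and the toy data are made up.

```r
# Hypothetical helper illustrating SMOTE-style interpolation (not DMwR's code).
smote_sketch <- function(X, y, n_new_per_class = 10, k = 5) {
  X <- as.matrix(X)
  stopifnot(is.numeric(X), nrow(X) == length(y))
  new_X <- list()
  src <- integer(0)  # index of one original row per synthetic row, used to copy labels
  for (cl in unique(y)) {
    rows <- which(y == cl)
    Xc <- X[rows, , drop = FALSE]
    n <- nrow(Xc)
    if (n < 2) next                    # a class needs at least one neighbour
    D <- as.matrix(dist(Xc))           # Euclidean distances within the class
    kk <- min(k, n - 1)
    synth <- matrix(NA_real_, nrow = n_new_per_class, ncol = ncol(X))
    for (i in seq_len(n_new_per_class)) {
      a <- sample.int(n, 1)            # random starting instance
      nn <- order(D[a, ])[2:(kk + 1)]  # its kk nearest neighbours (first entry is itself)
      b <- nn[sample.int(kk, 1)]       # pick one neighbour at random
      gap <- runif(1)                  # position along the segment from a to b
      synth[i, ] <- Xc[a, ] + gap * (Xc[b, ] - Xc[a, ])
    }
    new_X[[length(new_X) + 1]] <- synth
    src <- c(src, rep(rows[1], n_new_per_class))
  }
  list(X = rbind(X, do.call(rbind, new_X)), y = y[c(seq_along(y), src)])
}

set.seed(42)
X <- matrix(rnorm(70 * 14), nrow = 70, ncol = 14)  # toy stand-in for the real data
y <- rep(1:4, length.out = 70)                     # toy class labels
out <- smote_sketch(X, y, n_new_per_class = 10, k = 5)
table(out$y) - table(y)  # number of synthetic instances added per class
```

Since each synthetic instance lies on a line segment between two real instances of the same class, it stays inside the region that class already occupies, whereas uniform jitter can push points slightly outside it.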