Furkan Gürsoy
2014-Apr-17 06:23 UTC
[R] Finding best cut points to discretize continuous values for ID3 algorithm
Hi all,

I am a senior student in the MIS department at Bogazici University, Istanbul. My graduation project focuses on data mining using R.

I would like to write a function that takes a continuous variable and a class variable and finds the best (or near-best) cut points for discretizing the continuous values, so as to minimize the conditional entropy of the class variable given the resulting bins. For instance:

    a <- c(1, 2, 3, 4, 5, 6)
    b <- c("a", "a", "b", "b", "c", "c")

The function should tell me that the cut points should be around 2.5 and 4.5. These minimize the conditional entropy, which in this case would be zero, because each bin would contain a single class and I would have perfect information.

I have searched for a similar function but did not manage to find one. Does a function like this already exist? If not, would it be helpful if I created one?

Kind regards,
Furkan Gursoy
Senior Student at MIS
Bogazici University '14
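
To make the request concrete, here is a minimal sketch of the kind of function described above. It rests on several assumptions that are not part of the original question: the names `entropy`, `cond_entropy`, and `best_cuts` are hypothetical, the number of cut points is supplied by the caller, and the search is a plain brute-force pass over every combination of midpoints between adjacent distinct values. It is meant as an illustration, not as an existing package function.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                      # ignore empty factor levels
  -sum(p * log2(p))
}

# Conditional entropy H(class | bin) after cutting x at the given points.
cond_entropy <- function(x, cls, cuts) {
  bins  <- cut(x, breaks = c(-Inf, cuts, Inf))
  parts <- split(cls, bins, drop = TRUE)
  # Weight each bin's entropy by the share of observations it holds.
  sum(sapply(parts, function(p) length(p) / length(cls) * entropy(p)))
}

# Hypothetical helper: try every set of n_cuts candidate midpoints and
# keep the set with the lowest conditional entropy (assumed brute force).
best_cuts <- function(x, cls, n_cuts) {
  v    <- sort(unique(x))
  mids <- (head(v, -1) + tail(v, -1)) / 2   # candidate cut points
  stopifnot(n_cuts >= 1, n_cuts <= length(mids))
  cand   <- combn(length(mids), n_cuts, simplify = FALSE)
  scores <- sapply(cand, function(idx) cond_entropy(x, cls, mids[idx]))
  best   <- cand[[which.min(scores)]]
  list(cuts = mids[best], entropy = min(scores))
}

a <- c(1, 2, 3, 4, 5, 6)
b <- c("a", "a", "b", "b", "c", "c")
best_cuts(a, b, n_cuts = 2)   # cuts 2.5 and 4.5, conditional entropy 0
```

The exhaustive search grows combinatorially with the number of distinct values, so it is only practical for small inputs; a greedy approach that repeatedly picks the single best split would be one way to scale it, at the cost of possibly missing the global optimum.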