strinz at freenet.de
2007-Aug-01 08:52 UTC
[R] RWeka cross-validation and Weka_control Parametrization
Hello,

I have two questions concerning the RWeka package:

1.) First question:
How can one perform a cross-validation, say 10-fold, for a given data set and a given model?

2.) Second question:
What is the correct syntax for parametrizing, e.g., the kernel classifiers interface?

m1 <- SMO(Species ~ ., data = iris,
          control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel", G = 0.1))
m2 <- SMO(Species ~ ., data = iris,
          control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel", G = 1.0))

> m1
SMO
Kernel used:
  RBF kernel: K(x,y) = e^-(0.01* <x-y,x-y>^2)
## should be: RBF kernel: K(x,y) = e^-(0.1* <x-y,x-y>^2)

> m2
SMO
Kernel used:
  RBF kernel: K(x,y) = e^-(0.01* <x-y,x-y>^2)
## should be: RBF kernel: K(x,y) = e^-(1.0* <x-y,x-y>^2)

That is, with the above syntax the control argument ignores the parameter 'G' (gamma): both models report the default value 0.01. What is wrong with this syntax?

Many thanks,
Bjoern
Kurt Hornik
2007-Aug-14 08:54 UTC
[R] RWeka cross-validation and Weka_control Parametrization
On Wed, 01 Aug 2007 10:52:02 +0200, Bjoern wrote:

> Hello,
> I have two questions concerning the RWeka package:
> 1.) First question:
> How can one perform a cross validation, -say 10fold- for a given
> data set and given model ?
> 2.) Second question
> What is the correct syntax for the parametrization of e.g. Kernel
> classifiers interface
> m1 <- SMO(Species ~ ., data = iris, control =
> Weka_control(K="weka.classifiers.functions.supportVector.RBFKernel",G=0.1))
> m2 <- SMO(Species ~ ., data = iris, control =
> Weka_control(K="weka.classifiers.functions.supportVector.RBFKernel",G=1.0))
>
> > m1
> SMO
> Kernel used:
>   RBF kernel: K(x,y) = e^-(0.01* <x-y,x-y>^2)
> ## should be: RBF kernel: K(x,y) = e^-(0.1* <x-y,x-y>^2)
> etc.

The answer to question 2 is surprisingly simple, but nevertheless took me about half an hour to find:

m2 <- SMO(Species ~ ., data = iris,
          control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel -G 2"))

gives

R> m2
SMO
Kernel used:
  RBF kernel: K(x,y) = e^-(2.0* <x-y,x-y>^2)

[Using Weka_control(K = ..., G = ...) passes the G option to SMO but not to RBFKernel. The docs for SMO() say

  -K <classname and parameters>
      The Kernel to use.
      (default: weka.classifiers.functions.supportVector.PolyKernel)

and one needs to remember Weka's command line style interface to realize that this amounts to putting everything, the class name and its parameters, into a single string for the K option.]

This is of course not quite what R users would expect, and we'll try to improve the Weka control mechanism so that specifying (Weka class) options which require additional parameters becomes more convenient.

Best
-k
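
The string form described in the reply can be sketched as follows. This is only an illustration under stated assumptions: kernel_spec() is a hypothetical helper written for this example (it is not part of RWeka), and the RWeka calls are guarded so that they run only where RWeka and a working Java setup are installed. The last call also touches on question 1: as far as I know, RWeka's evaluate_Weka_classifier() accepts a numFolds argument for k-fold cross-validation, but check the package documentation for your version.

```r
# Hypothetical helper (NOT part of RWeka): builds a Weka command-line style
# option string such as "<class> -G 2" from a class name and named options.
kernel_spec <- function(class, ...) {
  opts <- list(...)
  if (length(opts) == 0L) return(class)
  flags <- paste0("-", names(opts), " ", vapply(opts, format, character(1)),
                  collapse = " ")
  paste(class, flags)
}

rbf <- "weka.classifiers.functions.supportVector.RBFKernel"
k <- kernel_spec(rbf, G = 2)  # class name and its -G option in one string

# Only runs when RWeka (and a working Java installation) is available.
if (requireNamespace("RWeka", quietly = TRUE)) {
  # The whole string goes into K, so G reaches RBFKernel rather than SMO.
  m <- RWeka::SMO(Species ~ ., data = iris,
                  control = RWeka::Weka_control(K = k))
  print(m)

  # Assumption: numFolds = 10 requests 10-fold cross-validation.
  print(RWeka::evaluate_Weka_classifier(m, numFolds = 10))
}
```

Building the string in one place keeps the kernel's own options visibly attached to the kernel class, which is the point of the reply: anything given as a separate Weka_control() argument is passed to SMO itself.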