Stephen:
Your calls to best.svm() do not tune anything unless you specify the
parameter ranges (see the examples on the help page). Your calls are
just using the defaults, which are very unlikely to yield models with
good performance.
[I think some day, I will have to remove the defaults in svm()...]
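For example, something along these lines (the ranges below are only
placeholders, not a recommendation; see ?tune.svm and ?best.tune for the
real examples):

  ## grid search over gamma and cost (10-fold cross-validation is the
  ## default sampling in tune())
  obj <- best.tune(svm, similarity ~ ., data = training,
                   kernel = "radial",
                   ranges = list(gamma = 2^(-6:2), cost = 2^(-2:6)))
  summary(obj)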
Another point: why aren't you using classification machines (which is
done automatically by providing a factor as the dependent variable)?
There is also classAgreement() in e1071, which you might want to look at.
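A minimal sketch of what I mean, assuming similarity is coded 0/1 in
your data frames:

  ## coercing the response to a factor makes svm() do C-classification
  ## instead of eps-regression
  training$similarity <- factor(training$similarity)
  testing$similarity  <- factor(testing$similarity)
  model <- svm(similarity ~ ., data = training)
  tab   <- table(testing$similarity, predict(model, testing))
  classAgreement(tab)   # returns, among other indices, the kappa coefficient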
Cheers,
David
Hi
I am doing this sort of thing:
POLY:
> obj = best.tune(svm, similarity ~ ., data = training, kernel = "polynomial")
> summary(obj)
Call:
best.tune(svm, similarity ~ ., data = training, kernel =
"polynomial")
Parameters:
SVM-Type: eps-regression
SVM-Kernel: polynomial
cost: 1
degree: 3
gamma: 0.04545455
coef.0: 0
epsilon: 0.1
Number of Support Vectors: 754
> svm.model <- svm(similarity ~ ., data = training, kernel = "polynomial",
    cost = 1, degree = 3, gamma = 0.04545455, coef0 = 0, epsilon = 0.1)
> pred = predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0
> table(testing$similarity, pred)
pred
0 1
0 30 8
1 70 63
LINEAR:
> obj = best.tune(svm, similarity ~ ., data = training, kernel = "linear")
> summary(obj)
Call:
best.tune(svm, similarity ~ ., data = training, kernel = "linear")
Parameters:
SVM-Type: eps-regression
SVM-Kernel: linear
cost: 1
gamma: 0.04545455
epsilon: 0.1
Number of Support Vectors: 697
> svm.model <- svm(similarity ~ ., data = training, kernel = "linear",
    cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred = predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0
> table(testing$similarity, pred)
pred
0 1
0 6 32
1 4 129
RADIAL:
> obj = best.tune(svm, similarity ~ ., data = training, kernel = "radial")
> summary(obj)
Call:
best.tune(svm, similarity ~ ., data = training, kernel = "linear")
Parameters:
SVM-Type: eps-regression
SVM-Kernel: linear
cost: 1
gamma: 0.04545455
epsilon: 0.1
Number of Support Vectors: 697
> svm.model <- svm(similarity ~ ., data = training, kernel = "radial",
    cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred = predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0
> table(testing$similarity, pred)
pred
0 1
0 27 11
1 64 69
SIGMOID:
> obj = best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid")
> summary(obj)
Call:
best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid")
Parameters:
SVM-Type: eps-regression
SVM-Kernel: sigmoid
cost: 1
gamma: 0.04545455
coef.0: 0
epsilon: 0.1
Number of Support Vectors: 986
> svm.model <- svm(similarity ~ ., data = training, kernel = "sigmoid",
    cost = 1, gamma = 0.04545455, coef0 = 0, epsilon = 0.1)
> pred = predict(svm.model, testing)
> pred[pred > .5] = 1
> pred[pred <= .5] = 0
> table(testing$similarity, pred)
pred
0 1
0 8 30
1 26 107
and then computing the kappa statistic from each confusion table to see
whether I am getting anything significant.
I get kappas of 15-17%, which I don't think is very good. I know kappa
is really meant for comparing the outputs of two taggers, but it seems a
reasonable way to check whether the results might just be due to chance.
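For reference, this is one way to get kappa from a table like the
polynomial one above (classAgreement() is in e1071); it gives roughly
0.17, in the range I am seeing:

  library(e1071)
  ## polynomial-kernel confusion table from above, rows = actual, cols = pred
  tab <- matrix(c(30, 70, 8, 63), nrow = 2,
                dimnames = list(actual = c(0, 1), pred = c(0, 1)))
  classAgreement(tab)$kappa   # about 0.17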
Two questions:
Any comments on kappa and what it might be telling me?
What can I do to tune my kernels further?
Stephen
--
Dr. David Meyer
Department of Information Systems
Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746
Tel: +43-1-313 36x4393
HP: http://wi.wu-wien.ac.at/~meyer/