Hi,
On Tue, Jan 5, 2010 at 7:01 PM, Amy Hessen <amy_4_5_84 at hotmail.com>
wrote:
> Hi,
>
> I understand from help pages that in order to use a data set with svm, I
> have to divide it into two files: one for the dataset without the class label
> and the other file contains the class label as the following code:-
This isn't exactly correct ... look at the examples in the ?svm
documentation a bit closer.
> library(e1071)
> x<- read.delim("mydataset_except-class-label.txt")
> y<- read.delim("mydataset_class-labell.txt")
> model <- svm(x, y, cross=5)
> summary(model)
>
> but I couldn't understand how I add 'formula' parameter to it? Does formula
> contain the class label too??
Using the first example in ?svm:
attach(iris)
model <- svm(Species ~ ., data = iris)
The first argument in the function call is the formula. The "Species"
column is the class label.
`iris` is a data.frame ... you can see that it has the label *in it*;
look at the output of "head(iris)".
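To connect this to your own data: a minimal sketch of the formula interface on a single data.frame that holds both the features and the label. The column name `label` and the idea of reading everything from one file are assumptions for illustration; `iris` is copied here only so the snippet runs as-is.

```r
library(e1071)

# Stand-in for a data set read from ONE file, e.g. something like
#   mydata <- read.delim("mydataset.txt")   # hypothetical file name
# iris is copied here so that the sketch is runnable.
mydata <- iris
names(mydata)[names(mydata) == "Species"] <- "label"  # "label" is an assumed column name

# For classification the label column should be a factor
mydata$label <- factor(mydata$label)

# Left of "~" is the class label; "." means "all other columns"
model <- svm(label ~ ., data = mydata, cross = 5)
summary(model)
```

So yes, the formula names the class label on its left-hand side, and the label column lives in the same data.frame as the features.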
> and what I have to do to use testing set when I don't use 'cross'
> parameter.
Just follow the example in ?svm some more, you'll see training a model
and then testing it on data. The example happens to be the same data
the model trained on. To use new data, you'll just need a data
matrix/data.frame with as many columns as your original data, and as
many rows as you have observations.
The first step separates the labels from the data (you can do the same
with your data -- you don't need separate files with and without the
labels -- just remove the label column from it in R):
attach(iris)
x <- subset(iris, select = -Species)
y <- Species
model <- svm(x, y)
# test with train data
pred <- predict(model, x)
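If you want to test on data the model has not seen, one way is to hold out part of the rows yourself. This is only a sketch: the 30% hold-out fraction and the seed are arbitrary choices of mine, not something from ?svm.

```r
library(e1071)

set.seed(1)  # arbitrary seed, only so the split is reproducible
n <- nrow(iris)
test_idx <- sample(n, size = round(0.3 * n))  # hold out 30% of the rows (arbitrary fraction)

train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Train on the training rows only
model <- svm(Species ~ ., data = train)

# predict() only needs the feature columns of the new data
pred <- predict(model, newdata = subset(test, select = -Species))

# Compare predictions against the held-out labels
table(predicted = pred, actual = test$Species)
mean(pred == test$Species)  # fraction of test rows classified correctly
```

The same pattern applies if your test set comes from a separate file: read it into a data.frame with the same feature columns and pass it as `newdata`.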
Hope that helps,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact