Hi useRs,

I am resending this request because my last post got no response. I am new to the list, so pardon me if I am violating protocol.

I am trying to use the "Kernlab" package for training and prediction with SVMs. I get the following error when I call the predict function:

    > predictSvm = predict(modelforSVM, testSeq);
    Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
      contrasts can be applied only to factors with 2 or more levels

The training file is a data frame with 501 columns: column 1 is "Class", which is "+" or "-", and columns V1 to V500 are each one of "A/C/G/T". There are 200 sequences for training (100 each of + and -). This is very similar to the "promotergene" data set included as an example with the package.

The model that I have generated is as follows:

    modelforSVM <- ksvm(Class ~ ., data = train500, kernel = "rbfdot",
                        kpar = "automatic", C = 60, cross = 3, prob.model = TRUE)

The testSeq is a vector of 500 characters cast as a data.frame. I later tried adding the Class column to the testSeq data frame as well, but got the same error.

I am using R version 2.9.0 on 32-bit Windows.

Any help that I can get is really appreciated.

Thanks,

Vishal
Hi,

Comments in line:

On Thu, Dec 24, 2009 at 11:42 PM, Vishal Thapar <vishalthapar at gmail.com> wrote:
> Hi useR's,
>
> I am resending this request since I got no response for my last post and I
> am new to the list so pardon me if I am violating the protocol.
>
> I am trying to use the "Kernlab" package for training and prediction using
> SVM's. I am getting the following error when I am trying to use the predict
> function:
>
>> predictSvm = predict(modelforSVM, testSeq);
> Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
>   contrasts can be applied only to factors with 2 or more levels

It's hard to say without a reproducible example, but it looks like the data you are sending into your predict function is different from what the SVM has seen in training.

What do these commands return over your data?

1. is(train500)
2. is(train500$Class)
3. is(train500[1,5])
4. is(testSeq)
5. is(testSeq[1,5])

> The training file is a data frame with 501 columns: Col 1 is "Class" which
> is "+" or "-" and Cols V1 to V500 are "A/C/G/T" . There are 200 seq's for
> training (100 + and - each). this is very similar to the "promotergene" data
> set included as example with the package.

How similar are we talking -- something is (obviously) off, because using the promotergene dataset is quite straightforward:

    library(kernlab)
    data(promotergene)
    tr <- promotergene[1:90,]     # training rows
    ts <- promotergene[91:106,]   # held-out test rows
    m <- ksvm(Class~., data=tr, kernel="rbfdot",
              kpar = "automatic", C = 60, cross = 3, prob.model = TRUE)
    p <- predict(m, ts)

> The model that I have generated is as follows:
>
> modelforSVM <- ksvm(Class ~ ., data = train500, kernel = "rbfdot", kpar =
> "automatic", C = 60, cross = 3, prob.model = TRUE)
>
> The testSeq is a vector of 500 characters casted as a data.frame.

What does that mean, exactly? How did you do that?
Can't you just start with all of your data in a data.frame and "cut out" the training and testing data.frames like I did above with the promotergene data (see the tr and ts vars)?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
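
A minimal sketch of the "cut out" idea in base R, using a made-up toy data frame (the names all_seqs, train and test are hypothetical, and the two sequence columns stand in for V1 to V500). It only illustrates that rows subset from one data.frame keep the full set of factor levels, so a single test row still has the same structure as the training rows:

```r
# Toy stand-in for the real data: a class label plus two nucleotide columns.
nt <- c("A", "C", "G", "T")
all_seqs <- data.frame(
  Class = factor(c("+", "-", "+", "-", "+", "-")),
  V1    = factor(c("A", "C", "G", "T", "A", "C"), levels = nt),
  V2    = factor(c("T", "G", "C", "A", "T", "G"), levels = nt)
)

# Cut the training and test sets out of the same data.frame.
train <- all_seqs[1:5, ]
test  <- all_seqs[6, , drop = FALSE]   # a single test sequence

# Row subsetting keeps the factor levels, even for a one-row data.frame.
same_levels <- vapply(
  names(train),
  function(nm) identical(levels(train[[nm]]), levels(test[[nm]])),
  logical(1)
)
print(same_levels)
str(test)
```

If the test sequences arrive separately, they can be appended to the same data.frame before splitting, so that every column is built as a factor only once.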
On Dec 24, 2009, at 11:42 PM, Vishal Thapar wrote:

> Hi useR's,
>
> I am resending this request since I got no response for my last post and I
> am new to the list so pardon me if I am violating the protocol.
>
> I am trying to use the "Kernlab" package for training and prediction using
> SVM's. I am getting the following error when I am trying to use the predict
> function:

I'm guessing that the package is really "kernlab".

>> predictSvm = predict(modelforSVM, testSeq);
> Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
>   contrasts can be applied only to factors with 2 or more levels

Sounds like R does not like the structure of your testSeq argument. Perhaps it was expecting a factor argument with levels that matched those used in the training set?

> The training file is a data frame with 501 columns: Col 1 is "Class" which
> is "+" or "-" and Cols V1 to V500 are "A/C/G/T" . There are 200 seq's for
> training (100 + and - each). this is very similar to the "promotergene" data
> set included as example with the package.
>
> The model that I have generated is as follows:
>
> modelforSVM <- ksvm(Class ~ ., data = train500, kernel = "rbfdot", kpar =
> "automatic", C = 60, cross = 3, prob.model = TRUE)
>
> The testSeq is a vector of 500 characters casted as a data.frame. I tried
> adding the Class column as well later to the testSeq data frame but got the
> same error.

Why not offer the results of dput() on that object? Or you could offer the output of str(testSeq), even if you aren't going to create a smaller test object that could be used for testing.

> I am using R with windows, 32 bit, version 2.9.0
>
> Any help that I can get is really appreciated.
>
> Thanks,
>
> Vishal

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
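
A hedged sketch of the factor-level guess above, in base R only. The toy objects (train, new_seq, test_bad, test_ok) are made up for illustration and are not the poster's data. It assumes the error comes from building a design matrix from a one-row data.frame whose columns each hold a single value: the same message can be reproduced with model.matrix() alone, and it goes away once each test column is a factor carrying the training levels.

```r
# Toy training data: class label plus three nucleotide columns (hypothetical).
nt <- c("A", "C", "G", "T")
train <- data.frame(
  Class = factor(rep(c("+", "-"), each = 2)),
  V1    = factor(c("A", "C", "G", "T"), levels = nt),
  V2    = factor(c("T", "T", "A", "C"), levels = nt),
  V3    = factor(c("G", "A", "C", "C"), levels = nt)
)

# A new sequence stored as a character vector, then cast to a one-row data.frame.
new_seq  <- c("A", "G", "T")
test_bad <- as.data.frame(t(new_seq), stringsAsFactors = FALSE)
names(test_bad) <- c("V1", "V2", "V3")

# Each column has only one distinct value, so contrasts cannot be set up.
err <- tryCatch(model.matrix(~ ., data = test_bad),
                error = function(e) conditionMessage(e))
print(err)

# Re-create each column as a factor with the levels seen in training.
test_ok <- test_bad
for (nm in names(test_ok)) {
  test_ok[[nm]] <- factor(test_ok[[nm]], levels = levels(train[[nm]]))
}
mm <- model.matrix(~ ., data = test_ok)
str(test_ok)
dim(mm)
```

Whether this is what happens inside predict() for the ksvm model in question is an assumption; the output of str(testSeq) would confirm or rule it out.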