thr3ads.net - R help - [R] help in SVM [Jun 2010]

Changbin Du

2010-Jun-24 17:22 UTC

[R] help in SVM

Hi, guys,

I used the following code to run an SVM and get predictions on a new data set, hh.

 dim(all_h)
[1] 2034   24
 dim(hh)    # it contains all the variables besides the variables in all_h data set.
[1] 640 415


require(e1071)

# find the best parameters
svm.tune <- tune(svm, as.factor(out) ~ ., data = all_h,
                 ranges = list(gamma = 2^(-5:5), cost = 2^(-5:5)))

bestg <- svm.tune$best.parameters[[1]]
bestc <- svm.tune$best.parameters[[2]]

# model fitting (svm() takes the classification type via `type`, not `method`)
svm.fit <- svm(as.factor(out) ~ ., data = all_h,
               type = "C-classification",
               kernel = "radial", probability = TRUE, cost = bestc, gamma = bestg,
               cross = 10)

# find the probability
svm.pred <- predict(svm.fit, hh, decision.values = TRUE, probability = TRUE)
Error in matrix(ret$dec, nrow = nrow(newdata), byrow = TRUE, dimnames = list(rowns,  :
  invalid 'ncol' value (too large or NA)

> head(all_h)
       DD HK HQ      IL      LP      NE      NP      TA      TP      WA WC
1 0.00543  0  0 0.00815 0.00272 0.00543 0.00000 0.00000 0.00000 0.00000  0
3 0.00000  0  0 0.00890 0.00890 0.00712 0.00534 0.00000 0.00890 0.00178  0
4 0.00448  0  0 0.00448 0.00299 0.00448 0.00149 0.00299 0.00000 0.00149  0
5 0.00312  0  0 0.00467 0.00467 0.00000 0.00156 0.00467 0.00312 0.00467  0
6 0.00587  0  0 0.02053 0.00587 0.00000 0.00293 0.00587 0.00293 0.00000  0
7 0.00000  0  0 0.02422 0.00346 0.00000 0.00346 0.00346 0.00000 0.00346  0
       WD      WG      WN      YW   acid_per   base_per  charge_per
1 0.00000 0.00000 0.00000 0.00000 0.14402174 0.12228261 0.019021739
3 0.00178 0.00178 0.00534 0.00178 0.12277580 0.09252669 0.016014235
4 0.00149 0.00448 0.00448 0.00000 0.16591928 0.11509716 0.022421525
5 0.00000 0.00156 0.00000 0.00156 0.13084112 0.10903427 0.009345794
6 0.00293 0.00000 0.00000 0.00000 0.07038123 0.08797654 0.002932551
7 0.00000 0.00346 0.00000 0.00346 0.05536332 0.08650519 0.010380623
  hydrophob_per polar_per num_cell num_genes position out
1     0.3804348 0.1929348        1         4        1   0
3     0.3540925 0.2508897        1         4        3   0
4     0.3393124 0.2032885        1         4        4   1
5     0.3753894 0.2305296        2         7        1   0
6     0.4868035 0.1964809        2         7        2   0
7     0.4878893 0.1522491        2         7        3   0
> quantile(hh$HK)
     0%     25%     50%     75%    100%
0.00000 0.00000 0.00000 0.00000 0.02703
> quantile(hh$HQ)
   0%   25%   50%   75%  100%
0.000 0.000 0.000 0.000 0.025
> quantile(hh$WC)
     0%     25%     50%     75%    100%
0.00000 0.00000 0.00000 0.00000 0.01266

Can someone give some suggestions?

Thanks!





-- 
Sincerely,
Changbin
--


Steve Lianoglou

2010-Jun-24 17:43 UTC

[R] help in SVM

Hi,

On Thu, Jun 24, 2010 at 1:22 PM, Changbin Du <changbind at gmail.com> wrote:
> HI, GUYS,
>
> I used the following codes to run SVM and get prediction on new data set hh.
>
>  dim(all_h)
> [1] 2034   24
>  dim(hh)    # it contains all the variables besides the variables in all_h data set.
> [1] 640 415

If I understand you correctly, this is wrong.

You are supposed to hold out *observations* (rows) when doing
training/testing, not variables/predictors/features (cols).
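
To illustrate what holding out rows looks like, here is a minimal base-R sketch. The `full` data frame, its values, and the 30% split are made up for illustration; they are not the poster's data.

```r
set.seed(1)  # make the random split reproducible

# Hypothetical stand-in for a full data set: two predictors plus the `out` response.
full <- data.frame(DD = runif(10), HK = runif(10), out = rep(c(0, 1), 5))

# Hold out 3 of the 10 rows (observations); every column is kept in both parts.
test_idx <- sample(nrow(full), size = 3)
train <- full[-test_idx, ]
test  <- full[test_idx, ]

# The two parts share the same columns; only the rows differ.
stopifnot(identical(names(train), names(test)))
```

The training and test parts then differ only in which observations they contain, which is the situation a fitted model expects at prediction time.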

Assuming that e1071::svm doesn't do anything fancy to match column
names between training and testing, then to put it simply: the number
of columns (features per observation) you use in training should be
the same as the number of columns in your test set.
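
One way to check this before calling predict() is to compare column names directly. The sketch below uses base R only; `train` and `newdata` are small made-up stand-ins for all_h and hh, and the column `XX` is hypothetical.

```r
# Hypothetical stand-ins for the training data (all_h) and the new data (hh).
train   <- data.frame(DD = c(0.1, 0.2), HK = c(0, 0), out = c(0, 1))
newdata <- data.frame(DD = c(0.3, 0.4), XX = c(1, 2))

# With a formula like `as.factor(out) ~ .`, the predictors are all columns except `out`.
predictors <- setdiff(names(train), "out")

# Training predictors that the new data does not provide.
missing_cols <- setdiff(predictors, names(newdata))
print(missing_cols)

# Only when nothing is missing, restrict the new data to the training predictors.
if (length(missing_cols) == 0) {
  newdata <- newdata[, predictors, drop = FALSE]
}
```

If `missing_cols` is not empty, the new data cannot supply every feature the model was trained on, and that is worth fixing before looking at anything else.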

-steve


-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
