similar to: Nominal variables in SVM?

Displaying 20 results from an estimated 30000 matches similar to: "Nominal variables in SVM?"

2011 Jan 07
2
Stepwise SVM Variable selection
I have a data set with about 30,000 training cases and 103 variables. I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. The accuracy isn't great. I used a grid search over the C and G parameters with an RBF kernel to find the best settings. I remember that for least squares, R has a nice stepwise function that will try combining subsets of variables to find
2009 Aug 30
1
SVM coefficients
Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the model and do see a "coefficients" item, but printing it returns a NULL result.
2009 Aug 04
1
Save model and predictions from svm
Hello, I'm using the e1071 package for training an SVM. It seems to be working well. This question has two parts: 1) Once I've trained an SVM model, I want to USE it within R at a later date to predict various new data. I see the write.svm command, but don't know how to LOAD the model back in so that I can use it tomorrow. How can I do this? 2) I would like to add the
2009 Aug 12
1
nominal to numeric function
Hi, I'm training an SVM (C-classification from the e1071 library). Some of the variables in my data set are nominal. Is there some easy/automatic way to convert them to numerical representations? Thanks, -N
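One common way to do this kind of conversion in base R is indicator (dummy) coding with `model.matrix()`. The sketch below is only an illustration under assumptions: the data frame `df` and its columns `color` and `size` are hypothetical toy data, not taken from the post.

```r
# Hypothetical toy data: 'color' is a nominal variable stored as a factor.
df <- data.frame(
  color = factor(c("red", "green", "blue", "green")),
  size  = c(1.0, 2.5, 3.0, 0.5)
)

# model.matrix() expands each factor into 0/1 indicator columns.
# "~ . - 1" drops the intercept, so the first factor keeps one column per level.
X <- model.matrix(~ . - 1, data = df)

print(colnames(X))
print(X)
```

Each row has exactly one indicator column set to 1 for `color`, and the numeric column `size` passes through unchanged.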
2010 Jun 17
3
Factoring a variable
Hi, I have a dataset where the results are coded ("yes", "no"). We want to do some machine learning with SVM to predict the "yes" outcome. My problem is that if I just use the as.factor function to convert, then it reverses the levels. ---------------------- x <- c("no", "no", "no", "yes", "yes", "no",
2009 Jul 18
1
svm works but tune.svm give error
Hello, I'm using the e1071 library for SVM functions. I can quickly train an SVM with: svm(formula = label ~ ., data = testdata) That works well. I want to tune the parameters, so I tried: tune.svm(label ~ ., data=testdata[1:2000, ], gamma=10^(-6:3), cost=10^(1:2)) THIS FAILS WITH AN ERROR: 'names' attribute [199] must be the same length as the vector [184] I don't
2009 Aug 19
1
Errors with RVM and LSSVM from kernlab library
Hello, In my ongoing quest to develop a "best" model, I'm testing various forms of SVM to see which is best for my application. I have been using the SVM from the e1071 library without problem for several weeks. Now, I'm interested in RVM and LSSVM to see if I get better performance. When running RVM or LSSVM on the exact same data as the SVM{e1071}, I get an error that I
2009 Sep 14
1
Strange question/result about SVM
Hello, I have a very unusual situation with an SVM and wanted to get the group's opinion. We developed an experiment where we train the SVM with one set of data (train data) and then test with a completely independent set of data (test data). The results were VERY good. I found an error in how we generate one of our training variables. We discovered that it was indirectly influenced
2009 Aug 05
1
binning results
Hello, I asked this as part of a previous message, but never really figured out a usable solution. So this is a second attempt. I have a process containing an SVM. The end result is the probability that the class is true. That result is added back to the original data. So I wind up with a data.frame that looks like this: label,v1,v2,v3,prob_true What I want to do is measure how accurate
2009 Sep 06
2
Regarding SVM using R
Hi Abbas, Before I try to give you answers, I just want to mention that you should send R-related requests to the R-help list, and not me personally because (i) there's a greater likelihood that it will get answered in a timely manner, and (ii) people who might have a similar problem down the road might benefit from any answer via searching the list archives ... anyway: On Sep 5, 2009, at
2009 Sep 07
2
Confused - better empirical results with error in data
Hi, I have a strange one for the group. We have a system that predicts probabilities using a fairly standard svm (e1071). We are looking at probabilities of a binary outcome. The input data is generated by a perl script that calculates a bunch of things, fetches data from a database, etc. We train the system on 30,000 examples and then test the system on an unseen set of 5,000 records.
2010 Apr 06
3
svm of e1071 package
Hello List, I am having great trouble using the svm function in the e1071 package. I have 4 GB of data that I want to use to train an svm. I am using Amazon cloud; my Amazon Machine Image (AMI) has 34.2 GB of memory. My R process was killed several times when I tried to use 4 GB of data for svm. Now I am using a subset of that data and it is only 1.4 GB. I remove all unnecessary objects before calling
2009 Aug 02
2
Strange column shifting with read.table
Hi, I am reading in a dataframe from a CSV file. It has 70 columns. I do not have any kind of unique "row id". rawdata <- read.table("r_work/train_data.csv", header=T, sep=",", na.strings=0) When training an svm, I keep getting an error. So, as an experiment, I wrote the data back out to a new file so that I could see what the svm function sees.
2009 Oct 23
1
Data format for KSVM
Hi, I have a process using svm from the e1071 library. It works. I want to try using the KSVM library instead. The same data used with e1071 gives me an error with KSVM. My data is a data.frame. Sample code: svm_formula <- formula(y ~ a + B + C) svm_model <- ksvm(formula, data=train_data, type="C-svc", kernel="rbfdot", C=1) I get the following error:
2009 Jul 27
1
Formula format?
Hi, Quick question. I'm working on training an SVM. I have a dataframe with about 50 columns. I want to train on 46 of them. Is there a way to say "All except columns 22, 23, 25 and 31"? It would be nice to not have to do +c1 +c2 +c3 +c4, etc. for all 46 columns. Thanks! -N
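Two base R idioms are commonly used for this kind of exclusion: subtracting terms from `.` in the formula, or dropping columns by position before modelling. The sketch below uses a hypothetical toy data frame (`df`, with columns `label`, `c1`, `c2`, `c3`) and only inspects the expanded formula; whether a particular modelling function honours `-` in a formula is an assumption that depends on it using R's standard formula machinery.

```r
# Hypothetical toy data frame standing in for the ~50-column one.
df <- data.frame(
  label = c(0, 1, 0, 1),
  c1    = 1:4,
  c2    = c(2, 1, 4, 3),
  c3    = 4:1
)

# Option 1: "." means "all other columns"; "- c2" removes one of them.
f <- label ~ . - c2
used <- attr(terms(f, data = df), "term.labels")
print(used)

# Option 2: drop columns by position before modelling (here column 3, i.e. c2),
# then use the plain "label ~ ." formula on the reduced data frame.
df_subset <- df[, -c(3)]
print(names(df_subset))
```

For the case in the post, Option 2 would take the form `df[, -c(22, 23, 25, 31)]`.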
2009 Aug 03
2
Scale set of 0 values returns NAN??
Hi, More questions in my ongoing quest to convert from RapidMiner to R. One thing has become VERY CLEAR: None of the issues I'm asking about here are addressed in RapidMiner. How it handles missing values, scaling, etc. is hidden within the "black box". Using R is forcing me to take a much deeper look at my data and how my experiments are constructed. (That's a very
2010 Jul 11
1
The formula interface of SVM
Hi, Could you please explain the line that I got from the documentation of R? Does it mean that there is a difference between using and not using the formula interface with SVM?: If the predictor variables include factors, the formula interface must be used to get a correct model matrix. Cheers, Amy
2012 Nov 29
7
Fast Normalize by Group
Hi, I have a very large data set (approx. 100,000 rows). The data comes from around 10,000 "groups" with about 10 entries per group. The values are in one column, the group ID is an integer in the second column. I want to normalize the values by group: for (g in unique(group)) { x[group==g] / sum(x[group==g]) } This works fine in a loop, but is slow. Is there a faster way to do
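A vectorized alternative to the loop, sketched here with base R's `ave()`, computes each group's sum once and aligns it back to the rows. The vectors `x` and `group` below are small hypothetical stand-ins for the two columns described in the post.

```r
# Hypothetical toy data: values and their integer group IDs.
x     <- c(1, 3, 2, 2, 5)
group <- c(1L, 1L, 2L, 2L, 3L)

# ave() returns, for every element, the result of FUN applied to its group,
# so dividing by it normalizes each value by its group total in one step.
normalized <- x / ave(x, group, FUN = sum)

print(normalized)
```

Within each group the normalized values sum to 1, which matches what the loop computes.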
2010 Jul 14
1
question about SVM in e1071
Hi, I have a question about the parameter C (cost) in the svm function in e1071. I thought a larger C is more prone to overfitting than a smaller C, and hence leads to more support vectors. However, using the Wisconsin breast cancer example on the link: http://planatscher.net/svmtut/svmtut.html I found that the largest cost has the fewest support vectors, which is contrary to what I think. Please see the scripts
2009 Aug 19
1
Performance measure for probabilistic predictions
Hello, I'm using an SVM for predicting a model, but I'm most interested in the probability output. This is easy enough to calculate. My challenge is how to measure the relative performance of the SVM for different settings/parameters/etc. An AUC curve comes to mind, but I'm NOT interested in predicting true vs false. I am interested in finding the most accurate probability