similar to: does svm have a CV to obtain the best "cost" parameter?

Displaying 20 results from an estimated 8000 matches similar to: "does svm have a CV to obtain the best "cost" parameter?"
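The title question has a direct answer in e1071 itself: tune() and tune.svm() run k-fold cross-validation over a parameter grid. A minimal sketch, assuming a data frame d with a factor response y:

library(e1071)
# 10-fold CV over a grid of cost/gamma values (d and d$y are placeholders)
obj <- tune(svm, y ~ ., data = d,
            ranges = list(cost = 2^(-2:6), gamma = 2^(-4:0)),
            tunecontrol = tune.control(sampling = "cross", cross = 10))
obj$best.parameters  # the cost/gamma pair with the lowest CV error
obj$best.model       # an svm refitted on all data with those parameters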

2006 Jan 31
2
SVM question
I'm running SVM from the e1071 package on data with ~150 columns (variables) and 50000 rows (it takes a bit of time), using the radial kernel with different gamma and cost values. I get very large models, with at least 30000 support vectors, and the predictions I get are not the best. What does this mean, and what could I do to improve my model? Jerzy Orlowski
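Since cross-validation on 50000 rows is slow, one common workaround (a sketch under that assumption, not from the thread itself) is to tune gamma and cost on a random subsample first; a radial-kernel model that keeps most of its training points as support vectors usually points at a badly scaled gamma or very noisy labels:

library(e1071)
# Hypothetical objects: dat (data frame), dat$y (factor response)
sub <- dat[sample(nrow(dat), 5000), ]  # tune on a subsample for speed
obj <- tune.svm(y ~ ., data = sub, gamma = 10^(-3:1), cost = 10^(-1:3))
obj$best.performance                   # CV error of the best pair
fit <- svm(y ~ ., data = dat,
           gamma = obj$best.parameters$gamma,
           cost  = obj$best.parameters$cost)
fit$tot.nSV                            # total number of support vectors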
2002 Apr 02
2
random forests for R
Hi all, There is now a package available on CRAN that provides an R interface to Leo Breiman's random forest classifier. Basically, random forest does the following: 1. Select ntree, the number of trees to grow, and mtry, a number no larger than the number of variables. 2. For i = 1 to ntree: 3. Draw a bootstrap sample from the data. Call those not in the bootstrap sample the
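The two tuning knobs named in step 1 map directly onto arguments of the randomForest() function; a minimal sketch on a built-in data set:

library(randomForest)
# ntree = number of trees to grow; mtry = variables tried at each split
fit <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)
print(fit)  # includes the out-of-bag (OOB) error estimate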
2006 Jan 04
2
Looking for packages to do Feature Selection and Classification
Hi All, Sorry if this is a repost (a quick browse didn't give me the answer). I wonder if there are packages that can do feature selection and classification at the same time. For instance, I am using SVM to classify my samples, but it is easy to overfit when using all of the features. Thus, it is necessary to select "good" features to build an optimal hyperplane (?).
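One way to combine the two steps (a sketch, not necessarily what the poster adopted): rank features with randomForest's permutation importance, then cross-validate an SVM on the top-ranked subset:

library(randomForest)
library(e1071)
# Hypothetical objects: X (numeric feature matrix), y (factor labels)
imp <- importance(randomForest(X, y, importance = TRUE))
top <- order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE)[1:20]
fit <- svm(X[, top], y, cross = 10)  # 10-fold CV on the selected features
fit$tot.accuracy                     # cross-validated accuracy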
2010 Apr 06
3
svm of e1071 package
Hello List, I am having great trouble using the svm function in the e1071 package. I have 4 GB of data that I want to use to train the svm. I am using the Amazon cloud; my Amazon Machine Image (AMI) has 34.2 GB of memory. My R process was killed several times when I tried to use the 4 GB of data for svm. Now I am using a subset of that data, and it is only 1.4 GB. I remove all unnecessary objects before calling
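A practical sketch for the memory wall (an assumption, not the thread's resolution): grow the training subsample stepwise and check whether cross-validated accuracy has already saturated before paying for the full 4 GB fit:

library(e1071)
# Hypothetical objects: dat (data frame), dat$y (factor response)
for (n in c(10000, 20000, 40000)) {
  sub <- dat[sample(nrow(dat), n), ]
  fit <- svm(y ~ ., data = sub, cross = 5)
  cat(n, "rows:", fit$tot.accuracy, "% CV accuracy\n")
}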
2010 Dec 03
3
book about "support vector machines"
Dear all, I am currently looking for a book about support vector machines for regression and classification, and am a bit lost since there are plenty of books dealing with this subject. I am not totally new to the field and would like to get more information on the subject for later use with the e1071 <http://cran.r-project.org/web/packages/e1071/index.html> package, for instance. Does
2005 Jan 14
2
probabilty calculation in SVM
Hi All, In the e1071 package, for SVM-based classification one can get a probability measure for each prediction. I would like to know what method is used for calculating this probability. Is it calculated using a logistic link function? Thanks for your help. Regards, Raj
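The poster's guess is essentially right: per the libsvm/e1071 documentation, a sigmoid (logistic) function is fitted by maximum likelihood to the SVM decision values (Platt scaling), with pairwise coupling combining the binary estimates in the multi-class case. Usage sketch:

library(e1071)
fit  <- svm(Species ~ ., data = iris, probability = TRUE)
pred <- predict(fit, iris, probability = TRUE)
head(attr(pred, "probabilities"))  # per-class probability estimates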
2012 Apr 03
1
e1071 tune.control() random parameter
I'm not sure what this parameter specifies: "random: if an integer value is specified, random parameter vectors are drawn from the parameter space." What are the parameter vectors, and what is the parameter space? What does "drawn" mean? Greetings, Jessi
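In other words: the parameter space is the grid spanned by the ranges argument, each parameter vector is one point in that grid (e.g. one cost/gamma pair), and random = k evaluates k randomly sampled points instead of the full grid. Sketch:

library(e1071)
# With random = 5, five cost/gamma pairs are sampled from the grid
obj <- tune(svm, Species ~ ., data = iris,
            ranges = list(cost = 2^(-2:8), gamma = 2^(-6:0)),
            tunecontrol = tune.control(random = 5))
obj$performances  # one row per sampled parameter vector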
2006 Mar 08
8
how to use the randomForest and rpart function?
Hi all, I am trying to play around with the randomForest function for classification. I know its performance is great. I am currently using the default options, but it has many options. How do I tweak them to make its performance even better? Which options matter most? Thanks a lot! M
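For the option that usually matters most, mtry, the package ships a search helper; a sketch on a built-in data set:

library(randomForest)
# Searches mtry by out-of-bag error, scaling by stepFactor until the
# relative improvement drops below `improve`; ntree can simply be large.
tuneRF(iris[, -5], iris$Species, ntreeTry = 500,
       stepFactor = 1.5, improve = 0.01)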
2010 Jul 14
1
question about SVM in e1071
Hi, I have a question about the parameter C (cost) in the svm function in e1071. I thought a larger C was more prone to overfitting than a smaller C, and hence would lead to more support vectors. However, using the Wisconsin breast cancer example at this link: http://planatscher.net/svmtut/svmtut.html I found that the largest cost has the fewest support vectors, which is contrary to what I expected. Please see the scripts
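The observed behaviour is in fact the expected one: a larger cost penalizes margin violations more heavily, so the margin narrows and fewer points end up on or inside it as support vectors. A large C can still overfit; it just does so with fewer support vectors. A quick generic check (not the linked tutorial's script):

library(e1071)
# Two-class subset of iris; count support vectors as cost grows
d <- droplevels(subset(iris, Species != "setosa"))
for (C in 10^(-1:3)) {
  fit <- svm(Species ~ ., data = d, cost = C)
  cat("cost =", C, " support vectors =", fit$tot.nSV, "\n")
}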
2003 Dec 04
2
RE: R performance questions
Hi-- While I agree that we cannot agree on the ideal algorithms, we should be taking practical steps to implement microarrays in the clinic. I think we can all agree that our algorithms have some degree of efficacy over and above conventional diagnostic techniques. If patients are dying from lack of diagnostic accuracy, I think we have to work hard to use this technology to help them, if we
2002 Jul 02
4
auto-loading package possible?
Dear R-help, Yes, I do know about the auto-loading feature. My question is more complicated than that: Suppose I loaded a package (e.g., e1071) and created an object of certain class (e.g., svm), for which there is a print method in the package to hide things that the user may not need to see (e.g., large vectors or matrices needed by methods such as predict). If the next time I started R, I
2006 Apr 20
1
Bootstrap error message: Error in statistic(data, origina l, ...) : unused argument(s) ( ...)
> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Michael > Sent: Thursday, April 20, 2006 3:50 AM > To: R-help at stat.math.ethz.ch > Subject: [R] Bootstrap error message: Error in > statistic(data, original, ...) : unused argument(s) ( ...) [Broadcast] > > > Dear colleagues, >
2002 Jun 20
16
problem with predict()
Hi, It is most probably just my R-ignorance, but I have the following problem using predict(). I train the model on 164 cases and then try to use it on a data set with 35 cases, but I get 164 predictions? The R code below illustrates in more detail what I am doing. Truly yours, R train = read.csv("train.csv", header = TRUE, row.names = "mol",
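The classic cause (a general sketch, since the quoted code is cut off): if predict() cannot find the model's variables in newdata, or newdata is not passed by that argument name, it silently falls back to the 164 fitted values. With a proper newdata of 35 rows, 35 predictions come back:

# File name "test.csv" and response column `y` are hypothetical here
train <- read.csv("train.csv", header = TRUE, row.names = "mol")
test  <- read.csv("test.csv",  header = TRUE, row.names = "mol")
fit   <- lm(y ~ ., data = train)
p     <- predict(fit, newdata = test)  # column names must match the formula
length(p)                              # 35: one prediction per test case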
2012 Feb 10
2
naiveBayes: slow predict, weird results
I did this: nb <- naiveBayes(users, platform) pl <- predict(nb, users) nrow(users) ==> 314781 ncol(users) ==> 109 1. naiveBayes() was quite fast (~20 seconds), while predict() was slow (tens of minutes). Why? 2. The predict results were completely off the mark (quite the opposite of the expected overfitting). Suffice it to show the tables: pl: android blackberry ipad
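One common explanation for both symptoms (an assumption, since the data aren't shown): naiveBayes() models numeric columns as Gaussians, so 0/1 indicator columns should be converted to factors, and its predict() method is slow because it loops over rows in R. Sketch:

library(e1071)
# Hypothetical objects: users (0/1 indicator columns), platform (factor)
users.f <- as.data.frame(lapply(users, factor))  # treat columns as categorical
nb <- naiveBayes(users.f, platform)
pl <- predict(nb, users.f)
table(pl, platform)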
2004 Oct 20
7
Q about strsplit and regexp
Dear R-help, This one is probably a piece of cake for regexp masters. I'd like to split a character vector (for simplicity, say of length one for now) that contains fields delimited by an arbitrary number of white spaces (e.g., " a b c "). How do I get the character vector that contains the fields? In the example I gave, I've tried: > strsplit(" a b c
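The standard answer: trim the ends, then split on runs of whitespace:

x <- "  a b   c  "
strsplit(trimws(x), "[[:space:]]+")[[1]]  # "a" "b" "c"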
2003 Feb 27
2
multidimensional function fitting
Take a look at package mgcv. Hope this helps. --Matt -----Original Message----- From: RenE J.V. Bertin [mailto:rjvbertin at despammed.com] Sent: Thursday, February 27, 2003 1:39 PM To: r-help at stat.math.ethz.ch Subject: [R] multidimensional function fitting Hello, I have been looking around for how to perform a multidimensional, arbitrary function fit (in any case non-linear; more below),
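A minimal sketch of the suggested mgcv route, with a hypothetical data frame d holding predictors x1, x2 and response y:

library(mgcv)
# s(x1, x2) fits a smooth thin-plate regression spline surface,
# i.e. an arbitrary-shape non-linear fit over the two predictors
fit <- gam(y ~ s(x1, x2), data = d)
summary(fit)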
2017 Jun 02
5
CV en R
Hello, I am building models and comparing which one is best. To do that, I use 10-fold CV. For example, I am comparing an svm and a randomForest on a data set, so I do: midataset<-import..... # datos is a data frame of 1500 rows and 15 variables for(i in 1:10){ numeros<-sample(1:1500,1500*0.7) train<-datos[numeros,] test<-datos[-numeros,] #modeloRF
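Note that the loop above draws ten independent 70/30 splits rather than 10-fold CV, in which every row is held out exactly once. A minimal sketch of true 10-fold CV on the same datos frame (the response column `clase` is hypothetical):

library(randomForest)
set.seed(1)
folds <- sample(rep(1:10, length.out = nrow(datos)))  # fold label per row
acc <- numeric(10)
for (i in 1:10) {
  train <- datos[folds != i, ]
  test  <- datos[folds == i, ]
  fit   <- randomForest(clase ~ ., data = train)
  acc[i] <- mean(predict(fit, test) == test$clase)
}
mean(acc)  # cross-validated accuracy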
2003 Aug 15
6
plot.lm mislabels points with na.exclude (PR#3750)
R 1.7.1 on Windows XP. The "normal Q-Q plot" produced by plot.lm() mislabels points when the model is fitted using na.action=na.exclude. Example: x <- 1:50 y <- x + rnorm(50) y[c(5,10,15)] <- NA # insert some NAs y[40] <- 50 # add an outlier plot(lm(y ~ x, na.action=na.omit)) # outlier correctly labeled in all four plots
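The excerpt cuts off before the failing case; reconstructed from the report's own description (an assumption about the original message), the contrasting call would be:

plot(lm(y ~ x, na.action = na.exclude))  # Q-Q plot labels the wrong points here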
2005 Jul 21
4
RandomForest question
Hello, I'm trying to find the optimal value of mtry (the number of variables tried at each split) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when
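e1071 also wraps this search in its cross-validation tuner; a sketch assuming a predictor frame X and a binary factor response y:

library(e1071)
library(randomForest)
# 10-fold CV over candidate mtry values (X and y are placeholders)
obj <- tune.randomForest(X, y, mtry = c(2, 4, 8, 16, 32),
                         tunecontrol = tune.control(cross = 10))
obj$best.parameters  # the mtry with the lowest CV error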