Displaying 20 results from an estimated 8000 matches similar to: "does svm have a CV to obtain the best "cost" parameter?"
2006 Jan 31
2
SVM question
I'm running SVM from the e1071 package on a data set with ~150 columns
(variables) and 50,000 rows (it takes a while), using the radial kernel with
different gamma and cost values.
I get very large models with at least
30,000 support vectors, and the predictions are not the best. What does this
mean, and what could I do to improve my model?
Jerzy Orlowski
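A very large number of support vectors often points to a gamma that is too
high (an overly wiggly boundary) or a poorly chosen cost. A minimal sketch of
a grid search with e1071's tune.svm; x (predictors) and y (class labels) are
hypothetical placeholders for the poster's data:
library(e1071)
# Grid search over gamma and cost with 10-fold cross-validation
tuned <- tune.svm(x, y, gamma = 10^(-4:0), cost = 10^(0:3),
                  tunecontrol = tune.control(cross = 10))
summary(tuned)         # CV error for every gamma/cost combination
tuned$best.parameters  # the pair with the lowest CV error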
2002 Apr 02
2
random forests for R
Hi all,
There is now a package available on CRAN that provides an R interface to Leo
Breiman's random forest classifier.
Basically, random forest does the following:
1. Select ntree, the number of trees to grow, and mtry, a number no larger
than number of variables.
2. For i = 1 to ntree:
3. Draw a bootstrap sample from the data. Call those not in the bootstrap
sample the "out-of-bag" data.
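A minimal sketch of growing a forest with these two parameters, using the
randomForest package (iris is just an illustrative data set):
library(randomForest)
set.seed(1)
# ntree: number of trees; mtry: variables sampled as candidates at each split
rf <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)
print(rf)  # includes the out-of-bag (OOB) error estimate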
2006 Jan 04
2
Looking for packages to do Feature Selection and Classification
Hi All,
Sorry if this is a repost (a quick browse didn't give me the answer).
I wonder if there are packages that can do feature selection and
classification at the same time. For instance, I am using SVM to classify my
samples, but it is easy to overfit when using all of the features. Thus, it
seems necessary to select "good" features to build an optimal hyperplane
(?).
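One simple recipe, sketched here on simulated data (not a specific package's
built-in wrapper): rank the features first, e.g. by random forest importance,
then fit the SVM on the top-ranked ones.
library(randomForest)
library(e1071)
set.seed(1)
x <- matrix(rnorm(100 * 200), 100, 200)    # 100 samples, 200 mostly noisy features
y <- factor(rbinom(100, 1, 0.5))
rf  <- randomForest(x, y, importance = TRUE)
imp <- importance(rf, type = 1)            # mean decrease in accuracy
top <- order(imp, decreasing = TRUE)[1:20] # keep the 20 strongest features
fit <- svm(x[, top], y, kernel = "radial")
Note that to get honest error estimates, the selection step should be
repeated inside each cross-validation fold, not done once on all the data.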
2010 Apr 06
3
svm of e1071 package
Hello List,
I am having great trouble using the svm function in the e1071 package. I have 4 GB of data that I want to use to train an SVM. I am using the Amazon cloud; my Amazon Machine Image (AMI) has 34.2 GB of memory. My R process was killed several times when I tried to use the 4 GB of data for svm. Now I am using a subset of that data, and it is only 1.4 GB. I remove all unnecessary objects before calling
2010 Dec 03
3
book about "support vector machines"
Dear all,
I am currently looking for a book about support vector machines for
regression and classification, and am a bit lost since there are plenty of
books dealing with this subject. I am not totally new to the field and
would like to get more information on that subject for later use with
the e1071 <http://cran.r-project.org/web/packages/e1071/index.html>
package for instance. Does
2005 Jan 14
2
probabilty calculation in SVM
Hi All,
In the e1071 package, for SVM-based classification, one can get a probability
measure for each prediction. I would like to know what method is used to
calculate this probability. Is it calculated using a logistic link function?
Thanks for your help.
Regards,
Raj
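If I remember the e1071 docs correctly, for two-class problems a sigmoid
(logistic) function is fitted to the decision values in the spirit of Platt
scaling; check ?svm to confirm. Requesting the probabilities looks like this
(iris as a stand-in data set):
library(e1071)
m <- svm(Species ~ ., data = iris, probability = TRUE)
p <- predict(m, iris, probability = TRUE)
head(attr(p, "probabilities"))  # per-class membership probabilities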
2012 Apr 03
1
e1071 tune.control() random parameter
I'm not sure what the parameter specifies:
random
if an integer value is specified, random parameter vectors are drawn from the parameter space.
What are the parameter vectors, what is the parameter space, and what does "drawn" mean?
greetings
Jessi
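My reading of the docs: the parameter space is the grid spanned by the ranges
you give to tune(), and each gamma/cost combination is one parameter vector.
With random = 5, five of those combinations are sampled at random instead of
evaluating the whole grid. A sketch:
library(e1071)
obj <- tune(svm, Species ~ ., data = iris,
            ranges = list(gamma = 2^(-4:2), cost = 2^(0:6)),
            tunecontrol = tune.control(random = 5))
summary(obj)  # shows only the randomly drawn parameter vectors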
2006 Mar 08
8
how to use the randomForest and rpart function?
Hi all,
I am trying to play around with the randomForest function for
classification. I know its performance is great.
I am currently using the default options, but there are many of them.
How do I tweak the options to make its performance even better?
Which options are most commonly used?
Thanks a lot!
M
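The options that usually matter most are mtry (variables tried at each
split), ntree, and nodesize. tuneRF searches over mtry using the OOB error; a
minimal sketch on iris:
library(randomForest)
set.seed(1)
# doubles/halves mtry by stepFactor, keeping a step only if it improves
# the OOB error by at least `improve`
tuneRF(iris[, -5], iris$Species, ntreeTry = 500, stepFactor = 2, improve = 0.01)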
2010 Jul 14
1
question about SVM in e1071
Hi,
I have a question about the parameter C (cost) in the svm function in e1071.
I thought a larger C is more prone to overfitting than a smaller C, and hence
leads to more support vectors. However, using the Wisconsin breast cancer
example at this link:
http://planatscher.net/svmtut/svmtut.html
I found that the largest cost has the fewest support vectors, which is
contrary to what I expected. Please see the scripts
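The observation is actually expected: a larger cost penalizes margin
violations more heavily, so the margin narrows, fewer points violate it, and
fewer become (bounded) support vectors. A sketch that counts them (iris as a
stand-in data set):
library(e1071)
for (C in 10^(0:3)) {
  m <- svm(Species ~ ., data = iris, kernel = "radial", cost = C)
  cat("cost =", C, "  support vectors =", m$tot.nSV, "\n")
}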
2003 Dec 04
2
RE: R performance questions
Hi--
While I agree that we cannot agree on the ideal algorithms, we should be
taking practical steps to implement microarrays in the clinic. I think
we can all agree that our algorithms have some degree of efficacy over
and above conventional diagnostic techniques. If patients are dying
from lack of diagnostic accuracy, I think we have to work hard to use
this technology to help them, if we
2002 Jul 02
4
auto-loading package possible?
Dear R-help,
Yes, I do know about the auto-loading feature. My question is more
complicated than that:
Suppose I loaded a package (e.g., e1071) and created an object of a certain
class (e.g., svm), for which there is a print method in the package that hides
things the user may not need to see (e.g., large vectors or matrices needed
by methods such as predict). If the next time I started R, I
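One workaround I know of is attaching the package at startup from your
~/.Rprofile (see ?Startup), so the methods are always found when a saved
workspace containing such objects is loaded. A minimal sketch:
.First <- function() {
  library(e1071)  # attach on startup so print/predict methods for svm are found
}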
2006 Apr 20
1
Bootstrap error message: Error in statistic(data, original, ...) : unused argument(s) (...)
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Michael
> Sent: Thursday, April 20, 2006 3:50 AM
> To: R-help at stat.math.ethz.ch
> Subject: [R] Bootstrap error message: Error in
> statistic(data, original, ...) : unused argument(s) ( ...) [Broadcast]
>
>
> Dear colleagues,
>
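This error usually means the statistic function does not accept the index
argument that boot() passes on each resample. A self-contained sketch of the
expected signature:
library(boot)
# boot() calls statistic(data, indices); the error appears when the
# statistic does not accept that second (index) argument
med <- function(data, indices) median(data[indices])
b <- boot(data = rnorm(100), statistic = med, R = 999)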
2002 Jun 20
16
problem with predict()
Hi,
It is most probably just my R-ignorance, but I have the following problem
with using predict(). I train the model using 164 cases and then I try to use
it on a data set with 35 cases, but I am getting 164 predictions?
The R code below illustrates in more detail what I am doing.
Truly yours,
R
train = read.csv("train.csv", header = TRUE, row.names = "mol",
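The usual cause: when newdata is missing, or its column names do not match
the variables in the model formula, predict() silently falls back to the
fitted training values. A self-contained sketch with lm as a hypothetical
stand-in for the poster's model:
set.seed(1)
train <- data.frame(x = rnorm(164))
train$y <- 2 * train$x + rnorm(164)
test <- data.frame(x = rnorm(35))       # same column name as in the formula
fit <- lm(y ~ x, data = train)
length(predict(fit))                    # 164: no newdata, training fit returned
length(predict(fit, newdata = test))    # 35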
2012 Feb 10
2
naiveBayes: slow predict, weird results
I did this:
nb <- naiveBayes(users, platform)
pl <- predict(nb,users)
nrow(users) ==> 314781
ncol(users) ==> 109
1. naiveBayes() was quite fast (~20 seconds), while predict() was slow
(tens of minutes). Why?
2. The predict results were completely off the mark (quite the opposite
of the expected overfitting). Suffice it to show the tables:
pl:
android blackberry ipad
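Two guesses: predict.naiveBayes works row by row, which is slow at ~315,000
observations, and numeric 0/1 columns are modeled as Gaussians; if they are
really indicators, converting them to factors may fix the odd results. A
sketch on simulated stand-in data (not the poster's actual objects):
library(e1071)
set.seed(1)
users <- matrix(rbinom(1000 * 5, 1, 0.3), 1000, 5)   # stand-in 0/1 matrix
platform <- factor(sample(c("android", "ipad"), 1000, TRUE))
users_f <- as.data.frame(lapply(as.data.frame(users), factor))  # indicators as factors
nb <- naiveBayes(users_f, platform)
pl <- predict(nb, users_f)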
2004 Oct 20
7
Q about strsplit and regexp
Dear R-help,
This one is probably a piece of cake for regexp masters. I'd like to split
a character vector (for simplicity, say of length one for now) that contains
fields delimited by an arbitrary number of white spaces (e.g., " a b
c "). How do I get a character vector containing the fields? In the
example I gave, I've tried:
> strsplit(" a b c
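One way: trim the ends first, then split on runs of whitespace, so the result
has no empty fields:
x <- "  a  b   c "
strsplit(trimws(x), "[[:space:]]+")[[1]]
# [1] "a" "b" "c"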
2003 Feb 27
2
multidimensional function fitting
Take a look at package mgcv. Hope this helps. --Matt
-----Original Message-----
From: RenE J.V. Bertin [mailto:rjvbertin at despammed.com]
Sent: Thursday, February 27, 2003 1:39 PM
To: r-help at stat.math.ethz.ch
Subject: [R] multidimensional function fitting
Hello,
I have been looking around for how to perform a multidimensional, arbitrary
function fit (in any case non-linear; more below),
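For reference, a minimal mgcv sketch on simulated data: gam() fits a smooth
thin-plate spline surface over two predictors.
library(mgcv)
set.seed(1)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$z <- sin(3 * d$x1) * cos(3 * d$x2) + rnorm(200, sd = 0.1)
fit <- gam(z ~ s(x1, x2), data = d)  # smooth surface in x1 and x2
summary(fit)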
2017 Jun 02
5
CV en R
Hi,
I am building models and comparing which one is best. For this, I use 10-fold CV.
For example, to compare an svm against a randomForest on a data set, I do:
midataset<-import.....
# datos is a data frame with 1500 rows and 15 variables
for(i in 1:10){
numeros<-sample(1:1500,1500*0.7)
train<-datos[numeros,]
test<-datos[-numeros,]
#modeloRF
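A side note: the loop above draws a fresh random 70/30 split each iteration
(Monte Carlo CV), not 10-fold CV. For true 10-fold CV, partition the rows
once so every observation lands in exactly one test fold; a sketch using the
poster's `datos`:
set.seed(1)
folds <- sample(rep(1:10, length.out = nrow(datos)))
for (i in 1:10) {
  train <- datos[folds != i, ]
  test  <- datos[folds == i, ]
  # fit the svm / randomForest on train, evaluate on test
}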
2003 Aug 15
6
plot.lm mislabels points with na.exclude (PR#3750)
R 1.7.1 on Windows XP
The "normal Q-Q plot" produced by plot.lm() mislabels points
when the model is fitted using na.action=na.exclude. Example:
x <- 1:50
y <- x + rnorm(50)
y[c(5,10,15)] <- NA # insert some NA's
y[40] <- 50 # add an outlier
plot(lm(y ~ x, na.action=na.omit)) # outlier correctly labeled in all
# four plots
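Presumably the truncated example continued with the na.exclude fit, where the
mislabeling shows up; a reconstruction of that line:
plot(lm(y ~ x, na.action = na.exclude))  # Q-Q plot now labels the wrong points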
2005 Jul 21
4
RandomForest question
Hello,
I'm trying to find the optimal value of the mtry parameter (the number of variables tried at each split) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases.
I've seen that although there are only 32 explanatory variables the best classification performance is reached when
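One direct way to check is to compare the out-of-bag error across all
candidate mtry values; x and y below are hypothetical placeholders for the
poster's predictors and binary response factor:
library(randomForest)
set.seed(1)
oob <- sapply(seq_len(ncol(x)), function(m) {
  rf <- randomForest(x, y, mtry = m, ntree = 500)
  tail(rf$err.rate[, "OOB"], 1)  # OOB error with all trees grown
})
which.min(oob)  # mtry value with the lowest OOB error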