Displaying 20 results from an estimated 10000 matches similar to: "Using sample to create Training and Test sets"
2011 Jan 24
Train error:: subscript out of bounds
I am trying to construct a svmpoly model using the "caret" package (please
see code below). Using the same data, without changing any setting, I am
just changing the seed value. Sometimes it constructs the model
successfully, and sometimes I get an "Error in indexes[[j]] : subscript out
of bounds".
For example, when I set the seed to 357, the following code produced results only for 8
2010 Nov 23
cross validation using e1071:SVM
Hi everyone
I am trying to do cross validation (10-fold CV) using the e1071:svm method. I
know that there is an option ("cross") for cross validation, but I still
wanted to write a function to generate cross-validation indices using the pls:
cvsegments method.
The code (at the end) is working fine, but sometimes caret:confusionMatrix
2009 Mar 27
ROCR package finding maximum accuracy and optimal cutoff point
If we use the ROCR package to find the accuracy of a classifier
pred <- prediction(svm.pred, testset[,2])
perf.acc <- performance(pred,"acc")
Do we find the maximum accuracy as follows (is there a simpler way?):
> max(perf.acc@x.values[[1]])
Then, to find the cutoff point that maximizes the accuracy, do we do the
following (is there a simpler way?):
> cutoff.list <-
2009 Dec 21
Help, suggest me some methods to identify training set and test set!!!
I want to split my whole dataset into a training set and a test set, build a model
on the training set, and validate the model using the test set. How can I split my
dataset reasonably? Please give me a hand; some R code would be
ideal.
I have also seen approaches such as using a SOM to project all the independent variables onto
2 dimensions and then choosing some points as the training set and the others as the test set, like
below. I
2007 Dec 14
train nnet
Hi R-helpers,
Can someone tell me how to train 'mynn' of this type?:
mynn <- nnet(y ~ x1 + ..+ x8, data = lgist, size = 2, rang = 0.1,
decay = 5e-4, maxit = 200)
I assume that this nn is untrained, and that to train it I have to split the
original data into train:test data sets and
do leave-one-out refitting to refine the weights (please correct me
if I am wrong).
I just don't know
2007 Feb 15
Does rpart package have some requirements on the original data set?
I am currently studying Decision Trees by using rpart package in R. I
artificially created a data set which includes the dependent variable
(y) and a few independent variables (x1, x2...). The dependent variable
y only comprises 0 and 1. 90% of y are 1 and 10% of y are 0. When I
apply rpart to it, there is no splitting at all.
I am wondering whether this is because of the
2012 Nov 29
Help with this error "kernlab class probability calculations failed; returning NAs"
I have never been able to get class probabilities to work and I am relatively new to using these tools, and I am looking for some insight as to what may be wrong.
I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which produces the same problem. I have read the caret vignettes as well as documentation for ?train. I appreciate any direction you can give. I
2012 Sep 26
Dear all, good day, I wanted to ask you a question:
I am working on my thesis on animal breeding, and my goal is to
evaluate the predictive ability of statistical models through validation.
But first I intend to split my database into 3 parts, and I would like
all the effects included in the study, and each of their levels,
to be as evenly
2012 Sep 27
Random Forest - Extract
I have two Random Forest (RF) related questions.
1. How do I view the classifications for the detail data of my training data (aka trainset) that I used to build the model? I know there is an object called predicted, which I believe is a vector. To view the detail for my testset I use the approach below, binding the columns together. I was trying to do something similar for my trainset but
2012 Nov 04
sample equal number of cases per class
Dear community
I have a dataframe and want to split it into a learn and a test partition.
However the learnset should be balanced, i.e. each class should have the
same number of cases. I tried and searched a lot, without success so far.
Maybe you can help?
Some example code
# generate example data
df <- data.frame(class = as.factor(sample(1:3, 20, replace = T)), var1 =
rnorm(20,3), var2 =
2009 May 24
accuracy of a neural net
Hi. I started with a file which was a sparse 982x923 matrix and where the
last column was a variable to be predicted. I did principal component
analysis on it and arrived at a new 982x923 matrix.
Then I ran the code below to get a neural network using nnet and then wanted
to get a confusion matrix or at least know how accurate the neural net was.
I used the first 22 principal components only for
2012 Nov 20
data after write() is off by 1 ?
I am new to R, so I am sure I am making a simple mistake. I am including complete information in hopes
someone can help me.
Basically my data in R looks good, I write it to a file, and every value is off by 1.
Here is my flow:
> str(prediction)
Factor w/ 10 levels "0","1","2","3",..: 3 1 10 10 4 8 1 4 1 4 ...
- attr(*, "names")= chr
2018 Feb 09
self-heal trouble after changing arbiter brick
Hi Karthik,
Thank you very much, you made me much more relaxed. Below is getfattr output for a file from all the bricks:
root@gv2 ~ # getfattr -d -e hex -m . /data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
2009 Aug 03
Truncating based on attribute range and serial no
Consider the following:
Age<-c(48, 57, 56, 76, 76, 66, 70, 14, 7, 3, 62, 62, 30, 10, 7, 53, 44,
29, 46, 47, 15, 13, 84, 77, 26)
SerialNo<-c(001147, 005979, 005979, 006128, 006128, 007004, 007004, 007004,
007004, 007004, 007438, 007438,009402,009402, 009402, 012693, 012693,
012693, 014063,014063, 014063, 014063, 014811, 014811,016570)
2011 Apr 29
Beginning with a Program of SVR
I'm starting research on Support Vector Regression. I want to obtain a
model to predict a property A from
a set of properties B, C, D, ... This problem is very common, for example in
QSAR models. I would like to know of
some examples and packages that could help me with this. I know about
caret and e1071, but I don't
know whether these packages can work with continuous variables.
2004 Dec 20
Sweave and LaTeX beamer class
Has anyone experienced problems between the LaTeX beamer class and
Sweave? The following code does not work properly:
Below is the
2009 Mar 11
prediction error for test set-cross validation
I have a database of 2211 rows with 31 entries each and I manually split my
data into 10 folds for cross validation. I built a logistic regression model
>model <- glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam +
Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc +
SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG +
MaMrp + MaSh +
2018 Feb 09
self-heal trouble after changing arbiter brick
Hi Karthik,
Thank you for your reply. The heal is still ongoing, as /var/log/glusterfs/glustershd.log keeps growing, and there are a lot of pending entries in the heal info.
The gluster version is 3.10.9 and 3.10.10 (the version update is in progress). It doesn't have info summary [yet?], and the heal info is way too long to attach here. (It takes more than 20 minutes just to collect
2011 Oct 02
difference between createPartition and createfold functions
I'm trying to separate my dataset into 4 parts with the 4th one as the
test dataset, and the other three to fit a model.
I've been searching for the difference between these 2 functions in
the caret package, but the most I can find is this:
A series of test/training partitions are created using
createDataPartition while createResample creates one or more bootstrap
2020 Oct 27
R for-loop to add layer to lattice plot
I am using e1071 to run a support vector machine. I would like to plot
the data with lattice and specifically show the hyperplanes created by
the system.
I can store the hyperplane as a contour in an object, and I can plot
one object at a time. Since there will be thousands of elements to
plot, I can't manually add them one by one to the plot, so I tried to
loop over them, but only the