Displaying 20 results from an estimated 10000 matches similar to: "Using sample to create Training and Test sets"
2011 Jan 24
5
Train error: subscript out of bounds
Hi,
I am trying to construct a svmpoly model using the "caret" package (please
see code below). Using the same data, without changing any setting, I am
just changing the seed value. Sometimes it constructs the model
successfully, and sometimes I get an "Error in indexes[[j]] : subscript out
of bounds".
For example, when I set the seed to 357, the following code produced results only for 8
2010 Nov 23
5
cross validation using e1071:SVM
Hi everyone
I am trying to do cross validation (10-fold CV) using the e1071::svm method. I
know that there is an option ("cross") for cross validation, but I still
wanted to write a function to generate cross-validation indices using the
pls::cvsegments method.
#####################################################################
The code (at the end) is working fine, but sometimes caret::confusionMatrix
2009 Mar 27
1
ROCR package finding maximum accuracy and optimal cutoff point
If we use the ROCR package to find the accuracy of a classifier
pred <- prediction(svm.pred, testset[,2])
perf.acc <- performance(pred,"acc")
Do we find the maximum accuracy as follows (is there a simpler way?):
> max(perf.acc@x.values[[1]])
Then, to find the cutoff point that maximizes the accuracy, do we do the
following (is there a simpler way?):
> cutoff.list <-
2009 Dec 21
5
Help, suggest me some methods to identify training set and test set!!!
I want to split my whole dataset into a training set and a test set, build a model
on the training set, and validate the model using the test set. How can I split my
dataset reasonably? Please give me a hand; some R code would be ideal.
I have also seen approaches such as using a SOM to project all the independent variables onto
2 dimensions and then picking some points as the training set and the rest as the test set, like
below. I
2007 Dec 14
2
train nnet
Hi R-helpers,
Can someone tell me how to train 'mynn' of this type?:
mynn <- nnet(y ~ x1 + ..+ x8, data = lgist, size = 2, rang = 0.1,
decay = 5e-4, maxit = 200)
I assume that this nn is untrained, and to train I have to split the
original data into train:test data set,
do leave-one-out refitting to refine the weights (please straighten
this up if I was wrong).
I just don't know
2007 Feb 15
2
Does rpart package have some requirements on the original data set?
Hi,
I am currently studying Decision Trees by using rpart package in R. I
artificially created a data set which includes the dependent variable
(y) and a few independent variables (x1, x2...). The dependent variable
y only comprises 0 and 1. 90% of y are 1 and 10% of y are 0. When I
apply rpart to it, there is no splitting at all.
I am wondering whether this is because of the
2012 Nov 29
1
Help with this error "kernlab class probability calculations failed; returning NAs"
I have never been able to get class probabilities to work and I am relatively new to using these tools, and I am looking for some insight as to what may be wrong.
I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which produces the same problem. I have read the caret vignettes as well as documentation for ?train. I appreciate any direction you can give. I
2012 Sep 26
3
QUESTION ABOUT DATA PARTITIONING FOR CROSS-VALIDATION
Dear all, good day. I would like to ask you a question:
I am working on my thesis on animal breeding, and my goal is to
evaluate the predictive ability of statistical models through
cross-validation.
But first, the intention is to split my database into 3 parts, and I would like
all the effects included in the study, and each of their levels,
to be as evenly
2012 Sep 27
1
Random Forest - Extract
Hello,
I have two Random Forest (RF) related questions.
1. How do I view the classifications for the detail data of my training data (aka trainset) that I used to build the model? I know there is an object called predicted, which I believe is a vector. To view the detail for my testset I use the approach below: bind the columns together. I was trying to do something similar for my trainset but
2012 Nov 04
1
sample equal number of cases per class
Dear community
I have a dataframe and want to split it into a learn and a test partition.
However the learnset should be balanced, i.e. each class should have the
same number of cases. I tried and searched a lot, without success so far.
Maybe you can help?
Some example code
# generate example data
df <- data.frame(class = as.factor(sample(1:3, 20, replace = T)), var1 =
rnorm(20,3), var2 =
2009 May 24
2
accuracy of a neural net
Hi. I started with a file which was a sparse 982x923 matrix and where the
last column was a variable to be predicted. I did principal component
analysis on it and arrived at a new 982x923 matrix.
Then I ran the code below to get a neural network using nnet and then wanted
to get a confusion matrix or at least know how accurate the neural net was.
I used the first 22 principal components only for
2012 Nov 20
3
data after write() is off by 1 ?
I am new to R, so I am sure I am making a simple mistake. I am including complete information in hopes
someone can help me.
Basically my data in R looks good, I write it to a file, and every value is off by 1.
Here is my flow:
> str(prediction)
Factor w/ 10 levels "0","1","2","3",..: 3 1 10 10 4 8 1 4 1 4 ...
- attr(*, "names")= chr
2018 Feb 09
1
self-heal trouble after changing arbiter brick
Hi Karthik,
Thank you very much, you made me much more relaxed. Below is getfattr output for a file from all the bricks:
root@gv2 ~ # getfattr -d -e hex -m . /data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
2009 Aug 03
2
Truncating based on attribute range and serial no
Consider the following:
Age<-c(48, 57, 56, 76, 76, 66, 70, 14, 7, 3, 62, 62, 30, 10, 7, 53, 44,
29, 46, 47, 15, 13, 84, 77, 26)
SerialNo<-c(001147, 005979, 005979, 006128, 006128, 007004, 007004, 007004,
007004, 007004, 007438, 007438,009402,009402, 009402, 012693, 012693,
012693, 014063,014063, 014063, 014063, 014811, 014811,016570)
TestSet<-cbind(Age,SerialNo)
2011 Apr 29
6
Beginning with a Program of SVR
Hi:
I'm starting research on Support Vector Regression. I want to obtain a
model to predict a property A from
a set of properties B, C, D, ... This problem is very common, for example in
QSAR models. I would like to know of
some examples and packages that could help me with this. I know about
caret and e1071, but I don't
know whether these packages can work with continuous variables.
2004 Dec 20
3
Sweave and LaTeX beamer class
Hi,
has anyone experienced problems between the LaTeX beamer class and
Sweave? The following code does not work properly:
#################################
\documentclass{beamer}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{ngerman}
\begin{document}
\frame{
\frametitle{test}
test
<<>>=
1+1
@
}
\end{document}
#################################
Below is the
2009 Mar 11
1
prediction error for test set-cross validation
Hi,
I have a database of 2211 rows with 31 entries each, and I manually split my
data into 10 folds for cross validation. I built a logistic regression model
as:
>model <- glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam +
Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc +
SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG +
MaMrp + MaSh +
2018 Feb 09
0
self-heal trouble after changing arbiter brick
Hi Karthik,
Thank you for your reply. The heal is still undergoing, as the /var/log/glusterfs/glustershd.log keeps growing, and there's a lot of pending entries in the heal info.
The gluster version is 3.10.9 and 3.10.10 (the version update in progress). It doesn't have info summary [yet?], and the heal info is way too long to attach here. (It takes more than 20 minutes just to collect
2011 Oct 02
1
difference between createPartition and createfold functions
Hello,
I'm trying to separate my dataset into 4 parts with the 4th one as the
test dataset, and the other three to fit a model.
I've been searching for the difference between these 2 functions in
Caret package, but the most I can get is this--
A series of test/training partitions are created using
createDataPartition while createResample creates one or more bootstrap
samples.
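
For the split described in this question (four parts, with the 4th held out as the test set), one possible approach is sketched below in base R. It is not taken from the thread and it does not use the caret functions mentioned above; it only relies on sample(). The data frame dat, its columns, and the names part, train_set and test_set are hypothetical, and the split is purely random (not stratified by class).

```r
# Hypothetical data: 20 rows, one numeric predictor and a two-level class.
set.seed(42)                      # fixed seed so the split is reproducible
n <- 20
dat <- data.frame(x = rnorm(n),
                  y = factor(rep(c("a", "b"), each = n / 2)))

# Give every row a part label 1..4 (equal counts), shuffled with sample().
part <- sample(rep(1:4, length.out = n))

test_set  <- dat[part == 4, ]     # the 4th part is held out for testing
train_set <- dat[part != 4, ]     # parts 1-3 are used to fit the model

nrow(train_set)
nrow(test_set)
```

Changing the seed changes which rows land in each part, but every row always falls into exactly one part, so the training and test sets never overlap.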
2020 Oct 27
3
R for-loop to add layer to lattice plot
Hello,
I am using e1071 to run a support vector machine. I would like to plot
the data with lattice and specifically show the hyperplanes created by
the system.
I can store the hyperplane as a contour in an object, and I can plot
one object at a time. Since there will be thousands of elements to
plot, I can't manually add them one by one to the plot, so I tried to
loop over them, but only the