similar to: Using sample to create Training and Test sets

2011 Jan 24
5
Train error: subscript out of bounds
Hi, I am trying to construct a svmpoly model using the "caret" package (please see code below). Using the same data and the same settings, I change only the seed value. Sometimes the model is constructed successfully, and sometimes I get an "Error in indexes[[j]] : subscript out of bounds". For example, when I set the seed to 357, the following code produced results only for 8
2010 Nov 23
5
cross validation using e1071:SVM
Hi everyone, I am trying to do cross-validation (10-fold CV) using the e1071:svm method. I know that there is an option ("cross") for cross-validation, but I still wanted to write a function to generate cross-validation indices using the pls: cvsegments method. ##################################################################### Code (at the end) is working fine, but sometimes caret:confusionMatrix
2009 Mar 27
1
ROCR package finding maximum accuracy and optimal cutoff point
If we use the ROCR package to find the accuracy of a classifier pred <- prediction(svm.pred, testset[,2]) perf.acc <- performance(pred,"acc") Do we find the maximum accuracy as follows (is there a simpler way?): > max(perf.acc@x.values[[1]]) Then to find the cutoff point that maximizes the accuracy, do we do the following (is there a simpler way): > cutoff.list <-
2009 Dec 21
5
Help,Suggest me some methods to identify training set and test set!!!
I want to split my whole dataset into a training set and a test set, build a model on the training set, and validate the model using the test set. How can I split my dataset reasonably? Please give me a hand; some R code would be ideal. I have seen some approaches, such as using SOM to project all the independent variables to 2 dimensions and selecting some points as the training set and the others as the test set, like below. I
2007 Dec 14
2
train nnet
Hi R-helpers, Can someone tell me how to train 'mynn' of this type?: mynn <- nnet(y ~ x1 + ..+ x8, data = lgist, size = 2, rang = 0.1, decay = 5e-4, maxit = 200) I assume that this nn is untrained, and that to train it I have to split the original data into train:test data sets and do leave-one-out refitting to refine the weights (please correct me if I am wrong). I just don't know
2007 Feb 15
2
Does rpart package have some requirements on the original data set?
Hi, I am currently studying decision trees using the rpart package in R. I artificially created a data set which includes the dependent variable (y) and a few independent variables (x1, x2...). The dependent variable y only takes the values 0 and 1. 90% of y are 1 and 10% of y are 0. When I apply rpart to it, there is no splitting at all. I am wondering whether this is because of the
2012 Nov 29
1
Help with this error "kernlab class probability calculations failed; returning NAs"
I have never been able to get class probabilities to work and I am relatively new to using these tools, and I am looking for some insight as to what may be wrong. I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which produces the same problem. I have read the caret vignettes as well as documentation for ?train. I appreciate any direction you can give. I
2012 Sep 26
3
QUESTION ABOUT DATA PARTITIONING FOR CROSS-VALIDATION
Dear all, good day. I would like to ask you a question: I am working on my thesis on animal breeding, and my goal is to evaluate the predictive ability of statistical models using cross-validation. First, though, I intend to split my database into 3 parts, and I would like all the effects included in the study, and each of their levels, to be as evenly
2012 Sep 27
1
Random Forest - Extract
Hello, I have two Random Forest (RF) related questions. 1. How do I view the classifications for the detail data of my training data (aka trainset) that I used to build the model? I know there is an object called predicted which I believe is a vector. To view the detail for my testset I use the below-bind the columns together. I was trying to do something similar for my trainset but
2012 Nov 04
1
sample equal number of cases per class
Dear community, I have a data frame and want to split it into a learn and a test partition. However, the learn set should be balanced, i.e. each class should have the same number of cases. I tried and searched a lot, without success so far. Maybe you can help? Some example code: # generate example data df <- data.frame(class = as.factor(sample(1:3, 20, replace = T)), var1 = rnorm(20,3), var2 =
2009 May 24
2
accuracy of a neural net
Hi. I started with a file which was a sparse 982x923 matrix and where the last column was a variable to be predicted. I did principal component analysis on it and arrived at a new 982x923 matrix. Then I ran the code below to get a neural network using nnet and then wanted to get a confusion matrix or at least know how accurate the neural net was. I used the first 22 principal components only for
2012 Nov 20
3
data after write() is off by 1 ?
I am new to R, so I am sure I am making a simple mistake. I am including complete information in hopes someone can help me. Basically my data in R looks good, I write it to a file, and every value is off by 1. Here is my flow: > str(prediction) Factor w/ 10 levels "0","1","2","3",..: 3 1 10 10 4 8 1 4 1 4 ... - attr(*, "names")= chr
2018 Feb 09
1
self-heal trouble after changing arbiter brick
Hi Karthik, Thank you very much, you made me much more relaxed. Below is getfattr output for a file from all the bricks: root@gv2 ~ # getfattr -d -e hex -m . /data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack getfattr: Removing leading '/' from absolute path names # file: data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
2009 Aug 03
2
Truncating based on attribute range and serial no
Consider the following: Age<-c(48, 57, 56, 76, 76, 66, 70, 14, 7, 3, 62, 62, 30, 10, 7, 53, 44, 29, 46, 47, 15, 13, 84, 77, 26) SerialNo<-c(001147, 005979, 005979, 006128, 006128, 007004, 007004, 007004, 007004, 007004, 007438, 007438,009402,009402, 009402, 012693, 012693, 012693, 014063,014063, 014063, 014063, 014811, 014811,016570) TestSet<-cbind(Age,SerialNo)
2011 Apr 29
6
Beginning with a Program of SVR
Hi: I'm starting a research project on Support Vector Regression. I want to obtain a model to predict a property A from a set of properties B, C, D, ... This problem is very common, for example in QSAR models. I would like to know of some examples and packages that could help me with this. I know about caret and e1071, but I don't know whether these packages can work with continuous variables.
2004 Dec 20
3
Sweave and LaTeX beamer class
Hi, has anyone experienced problems between the LaTeX beamer class and Sweave? The following code does not work properly: ################################# \documentclass{beamer} \usepackage[latin1]{inputenc} \usepackage[T1]{fontenc} \usepackage{ngerman} \begin{document} \frame{ \frametitle{test} test <<>>= 1+1 @ } \end{document} ################################# Below is the
2009 Mar 11
1
prediction error for test set-cross validation
Hi, I have a database of 2211 rows with 31 entries each and I manually split my data into 10 folds for cross validation. I build logistic regression model as: >model <- glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam + Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc + SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG + MaMrp + MaSh +
2018 Feb 09
0
self-heal trouble after changing arbiter brick
Hi Karthik, Thank you for your reply. The heal is still ongoing, as the /var/log/glusterfs/glustershd.log keeps growing, and there are a lot of pending entries in the heal info. The gluster version is 3.10.9 and 3.10.10 (the version update is in progress). It doesn't have info summary [yet?], and the heal info is way too long to attach here. (It takes more than 20 minutes just to collect
2011 Oct 02
1
difference between createPartition and createfold functions
Hello, I'm trying to separate my dataset into 4 parts with the 4th one as the test dataset, and the other three to fit a model. I've been searching for the difference between these 2 functions in Caret package, but the most I can get is this-- A series of test/training partitions are created using createDataPartition while createResample creates one or more bootstrap samples.
2020 Oct 27
3
R for-loop to add layer to lattice plot
Hello, I am using e1071 to run a support vector machine. I would like to plot the data with lattice and specifically show the hyperplanes created by the system. I can store the hyperplane as a contour in an object, and I can plot one object at a time. Since there will be thousands of elements to plot, I can't manually add them one by one to the plot, so I tried to loop over them, but only the