thr3ads.net - similar to: "Test set and Train set in Caret package train function"

Displaying 20 results from an estimated 800 matches similar to: "Test set and Train set in Caret package train function"

ROC curve for each fold in one plot

2017 Oct 16

Hi all, I have tried a 5 fold cross validation using caret package with random forest method on iris dataset as example. Then I need ROC curve for each fold: > set.seed(1) > train_control <- trainControl(method="cv", number=5,savePredictions = TRUE,classProbs = TRUE) > output <- train(Species~., data=iris, trControl=train_control, method="rf") >
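
One possible way to draw a ROC curve per fold from the held-out predictions that savePredictions keeps is sketched below. It rests on several assumptions not in the post: the pROC package is used for the curves, the curves are one-vs-rest for a single class (iris has three classes, and a ROC curve is binary), and the class "versicolor" is an arbitrary choice.

```r
library(caret)
library(pROC)  # assumption: pROC is not mentioned in the post

set.seed(1)
train_control <- trainControl(method = "cv", number = 5,
                              savePredictions = TRUE, classProbs = TRUE)
output <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# output$pred holds held-out predictions for every candidate mtry;
# keep only the rows that belong to the selected tuning value.
preds <- merge(output$pred, output$bestTune)

# One data frame of held-out predictions per fold.
by_fold <- split(preds, preds$Resample)

# One-vs-rest ROC for a single (arbitrarily chosen) class.
target <- "versicolor"
rocs <- lapply(by_fold, function(d) {
  roc(response = as.integer(d$obs == target),
      predictor = d[[target]],
      levels = c(0, 1), direction = "<")
})

# Overlay all folds in one plot.
plot(rocs[[1]], col = 1)
for (i in seq_along(rocs)[-1]) plot(rocs[[i]], add = TRUE, col = i)
legend("bottomright", legend = names(rocs), col = seq_along(rocs), lty = 1)
```

Repeating the last part for each class would give a full one-vs-rest picture.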

Saving model and other objects from caret

2013 Feb 07

Say I train a model in caret, e.g.: RFmodel <- train(X,Y,method='rf',trControl=myCtrl,tuneLength=1) How can I save this to disk and load it later in R? How about an object of the class "resamples"? resamps <- resamples( list( RF = RFmodel, SVM = SVMmodel, KNN = KNNmodel, NN = NNmodel )) Thanks,
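
A minimal sketch of one way to do this with base R's saveRDS() and readRDS(), which serialize a single R object to a file. The two small models (knn and rpart on iris) and the temporary file paths are stand-ins chosen for illustration, not the models from the post.

```r
library(caret)

myCtrl <- trainControl(method = "cv", number = 5)

# Stand-in models; resetting the seed before each call is meant to give
# both models the same cross-validation folds.
set.seed(1)
KNNmodel <- train(Species ~ ., data = iris, method = "knn", trControl = myCtrl)
set.seed(1)
CARTmodel <- train(Species ~ ., data = iris, method = "rpart", trControl = myCtrl)

resamps <- resamples(list(KNN = KNNmodel, CART = CARTmodel))

# Hypothetical file locations; any writable path works.
model_file <- tempfile(fileext = ".rds")
resamps_file <- tempfile(fileext = ".rds")

# Write each object to disk ...
saveRDS(KNNmodel, model_file)
saveRDS(resamps, resamps_file)

# ... and read it back later, possibly in a new R session.
KNNmodel2 <- readRDS(model_file)
resamps2 <- readRDS(resamps_file)

# The reloaded model should predict exactly like the original.
identical(predict(KNNmodel, iris), predict(KNNmodel2, iris))
```

The package that fitted the model still has to be installed in the session that reloads it.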

Training with very few positives

2013 Feb 10

I have a binary classification problem where the fraction of positives is very low, e.g. 20 positives in 10,000 examples (0.2%). What is an appropriate cross validation scheme for training a classifier with very few positives? I currently have the following setup: ======================================== library(caret) tmp <- createDataPartition(Y, p = 9/10, times = 3, list = TRUE)

caret train and trainControl

2012 Nov 23

I am used to packages like e1071 where you have a tune step and then pass your tunings to train. It seems with caret, tuning and training are both handled by train. I am using train and trainControl to find my hyper parameters like so: MyTrainControl=trainControl( method = "cv", number=5, returnResamp = "all", classProbs = TRUE ) rbfSVM <- train(label~., data =

CARET. Relationship between data splitting trainControl

2013 Feb 19

I have carefully read the CARET documentation at: http://caret.r-forge.r-project.org/training.html, the vignettes, and everything is quite clear (the examples on the website help a lot!), but I am still a bit confused about the relationship between two arguments to trainControl, "method" and "index", and the interplay between trainControl and the data splitting functions in caret

FW: Sourcing my file does not print command outputs

2013 Feb 07

Forgot to send to R-help From: Nordlund, Dan (DSHS/RDA) Sent: Thursday, February 07, 2013 2:09 PM To: 'James Jong' Subject: RE: [R] Sourcing my file does not print command outputs James, Your code seems to have ‘…’ sitting on a line all by itself (maybe should be at the end of the preceding comment? Anyway, when I eliminated that problem and sourced the script using the following call

CARET and NNET fail to train a model when the input is high dimensional

2013 Mar 06

The following code fails to train a nnet model in a random dataset using caret: nR <- 700 nCol <- 2000 myCtrl <- trainControl(method="cv", number=3, preProcOptions=NULL, classProbs = TRUE, summaryFunction = twoClassSummary) trX <- data.frame(replicate(nR, rnorm(nCol))) trY <- runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4] trY <-

How can you find the optimal number of values to randomly sample to optimize random forest classification without trial and error?

2017 Dec 02

I have data set up like the following: control1 <- sample(1:75, 3947398, replace=TRUE) control2 <- sample(1:75, 28793, replace=TRUE) control3 <- sample(1:100, 392733, replace=TRUE) control4 <- sample(1:75, 858383, replace=TRUE) patient1 <- sample(1:100, 28048, replace=TRUE) patient2 <- sample(1:50, 80400, replace=TRUE) patient3 <- sample(1:100, 48239, replace=TRUE) control

caret() train based on cross validation - split dataset to keep sites together?

2012 May 30

Hello all, I have searched and have not yet identified a solution so now I am sending this message. In short, I need to split my data into training, validation, and testing subsets that keep all observations from the same sites together, preferably as part of a cross validation procedure. Now for the longer version. And I must confess that although my R skills are improving, they are not so
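
A rough sketch of one way to keep sites together during cross validation: assign whole sites to folds by hand and pass the resulting training indices to trainControl through its index argument. The data frame, the column names (site, x, y), the number of folds, and the linear model are all made up for illustration.

```r
library(caret)

set.seed(1)
# Hypothetical data: 6 sites with 5 observations each.
dat <- data.frame(site = rep(paste0("site", 1:6), each = 5),
                  x = rnorm(30),
                  y = rnorm(30))

k <- 3  # number of folds, an arbitrary choice

# Assign each site (not each row) to one fold.
sites <- unique(as.character(dat$site))
fold_of_site <- sample(rep(seq_len(k), length.out = length(sites)))
names(fold_of_site) <- sites

# Every row inherits the fold of its site.
fold_of_row <- fold_of_site[as.character(dat$site)]

# trainControl's index argument expects the TRAINING rows of each resample.
train_idx <- lapply(seq_len(k), function(f) which(fold_of_row != f))
names(train_idx) <- paste0("Fold", seq_len(k))

ctrl <- trainControl(method = "cv", index = train_idx)
fit <- train(y ~ x, data = dat, method = "lm", trControl = ctrl)
fit$resample
```

Recent caret releases also provide groupKFold(), which builds a similar list of training indices from a grouping vector.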

Caret: Use timingSamps leads to error

2012 Jul 12

I want to use the caret package and found out about the timingSamps option to obtain the time which is needed to predict results. But, as soon as I set a value for this option, the whole model generation fails. Check this example: ------------------------- library(caret) tc=trainControl(method='LGOCV', timingSamps=10) tcWithout=trainControl(method='LGOCV')

caret package: arguments passed to the classification or regression routine

2008 Sep 18

Hi, I am having problems passing arguments to method="gbm" using the train() function. I would like to train gbm using the laplace distribution or the quantile distribution. here is the code I used and the error: gbm.test <- train(x.enet, y.matrix[,7], method="gbm", distribution=list(name="quantile",alpha=0.5), verbose=FALSE,

Trying to extract probabilities in CARET (caret) package with a glmStepAIC model

2011 Aug 28

Dear developers, I have just started working with caret and all the nice features it offers. But I just encountered a problem: I am working with a dataset that includes 4 predictor variables in Descr and a two-category outcome in Categ (codified as a factor). Everything was working fine; I got the results, confusion matrix etc. BUT for obtaining the AUC and predicted probabilities I had to add

caret: Error when using rpart and CV != LOOCV

2012 May 15

Hi, I got the following problem when trying to build an rpart model and using everything but LOOCV. Originally, I wanted to use k-fold partitioning, but every partitioning except LOOCV throws the following warning: ---- Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, : There were missing values in resampled performance measures. ----- Below are some

Inconsistent results between caret+kernlab versions

2013 Nov 15

I'm using caret to assess classifier performance (and it's great!). However, I've found that my results differ between R2.* and R3.* - reported accuracies are reduced dramatically. I suspect that a code change to kernlab ksvm may be responsible (see version 5.16-24 here: http://cran.r-project.org/web/packages/caret/news.html). I get very different results between caret_5.15-61 +

Caret package and lasso

2010 Apr 06

Dear all, I have used the following code but every time I encounter a problem of not having coefficients for all the variables in the predictor set. # code rm(list=ls()) library(caret) # generating response and design matrix X<-matrix(rnorm(50*100),nrow=50) y<-rnorm(50*1) # Applying caret package con<-trainControl(method="cv",number=10) data<-NULL data<- train(X,y,

R help-classification accuracy of DFA and RF using caret

2013 Nov 06

Hi, I am a graduate student applying published R scripts to compare the classification accuracy of 2 predictive models, one built using discriminant function analysis and one using random forests (webpage link for these scripts is provided below). The purpose of these models is to predict the biotic integrity of streams. Specifically, I am trying to compare the classification accuracy (i.e.,

glmnet in caret package

2010 Jan 25

Dear all, I want to train my model with LASSO using caret package (glmnet). So, in glmnet, there are two parameters, alpha and lambda. How can I fix my alpha=1 to get a lasso model? con<-trainControl(method="cv",number=10) model <- train(X, y, "glmnet", metric="RMSE",tuneLength = 10, trControl = con) Thanks Alex Roy
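
One way to hold alpha at 1 is to replace tuneLength with an explicit tuneGrid that has a single alpha value and a sequence of lambda values, as in the sketch below. The simulated X and y, the column names, and the lambda range are assumptions for illustration. The grid column names alpha and lambda are the ones current caret releases use, and older releases may expect dot-prefixed names instead.

```r
library(caret)

set.seed(1)
# Simulated stand-in data (not the poster's data).
X <- matrix(rnorm(50 * 10), nrow = 50)
colnames(X) <- paste0("x", 1:10)
y <- rnorm(50)

con <- trainControl(method = "cv", number = 10)

# alpha is held at 1 (lasso penalty); only lambda is tuned.
# The lambda range here is an arbitrary illustrative choice.
lassoGrid <- expand.grid(alpha = 1,
                         lambda = 10^seq(-3, 0, length.out = 10))

model <- train(X, y, method = "glmnet", metric = "RMSE",
               tuneGrid = lassoGrid, trControl = con)

model$bestTune  # selected lambda, with alpha equal to 1
```

Because the response here is pure noise, caret may warn about missing values in some resampled performance measures; RMSE is still computed for every lambda.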

resampling syntax for caret package

2012 Apr 06

Max and List, Could you advise me if I am using the proper caret syntax to carry out leave-one-out cross validation. In the example below, I use example data from the rda package. I use caret to tune over a grid and select an optimal value. I think I am then using the optimal selection for prediction. So there are two rounds of resampling with the first one taken care of by caret's train

[LLVMdev] clang: Manual unfolding doesn't match automatic unfolding

2011 Aug 02

Here's the code and compilation steps: #include <stdint.h> typedef unsigned int uint128_t __attribute__((mode(TI))); typedef struct{ uint64_t l[5]; } s; void f(s * restrict r, const s * restrict x, const s * restrict y) { uint128_t t[5] = {0, 0, 0, 0, 0}; #define BODY(i,j) { int i_ = i < j ? i : j; int j_ = i < j ? j : i; uint128_t m = (uint128_t) x->l[i_] *

[caret package] [trainControl] supplying predefined partitions to train with cross validation

2011 May 05

Hi all, I run R 2.11.1 under ubuntu 10.10 and caret version 2.88. I use the caret package to compare different models on a dataset. In order to compare their different performances I would like to use the same data partitions for every model. I understand that using a LGOCV or a boot type re-sampling method along with the "index" argument of the trainControl function, one is able to
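
A small sketch of sharing one set of partitions across models through the index argument, written against a recent caret release rather than the version 2.88 named in the post. The dataset (iris), the two methods (knn and rpart), and the use of k-fold partitions from createFolds are illustrative assumptions.

```r
library(caret)

set.seed(1)
# returnTrain = TRUE makes createFolds return the TRAINING rows of each
# fold, which is what trainControl's index argument expects.
folds <- createFolds(iris$Species, k = 5, returnTrain = TRUE)

# One control object, reused for every model.
ctrl <- trainControl(method = "cv", index = folds)

fit_knn <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)
fit_rpart <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)

# Both models were resampled on exactly the same partitions.
identical(fit_knn$control$index, fit_rpart$control$index)
```

With identical partitions, the per-fold results of the two models can be compared directly, for example through resamples().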
