similar to: Custom caret metric based on prob-predictions/rankings

Displaying 20 results from an estimated 1000 matches similar to: "Custom caret metric based on prob-predictions/rankings"

2011 May 12
2
Can ROC be used as a metric for optimal model selection for randomForest?
Dear all, I am using the "caret" package for predictor selection with a randomForest model. The following is the train function: rfFit <- train(x=trainRatios, y=trainClass, method="rf", importance = TRUE, do.trace = 100, keep.inbag = TRUE, tuneGrid = grid, trControl=bootControl, scale = TRUE, metric = "ROC") I wanted to use ROC as the metric for variable
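For context, a minimal sketch (on the mlbench Sonar data rather than the poster's trainRatios/trainClass, which are not shown) of the pieces caret generally needs before "ROC" is accepted as a selection metric: class probabilities plus twoClassSummary in trainControl.

library(caret)
library(mlbench)   # only for the example Sonar data
data(Sonar)

## ROC as a metric needs class probabilities and a two-class summary function
bootControl <- trainControl(method = "boot", number = 25,
                            classProbs = TRUE,
                            summaryFunction = twoClassSummary)
grid <- expand.grid(mtry = c(2, 8, 16))

rfFit <- train(Class ~ ., data = Sonar, method = "rf",
               metric = "ROC",          # pick mtry by resampled AUC
               tuneGrid = grid, trControl = bootControl)
rfFit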
2013 Mar 06
1
CARET and NNET fail to train a model when the input is high dimensional
The following code fails to train a nnet model on a random dataset using caret: nR <- 700 nCol <- 2000 myCtrl <- trainControl(method="cv", number=3, preProcOptions=NULL, classProbs = TRUE, summaryFunction = twoClassSummary) trX <- data.frame(replicate(nR, rnorm(nCol))) trY <- runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4] trY <-
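A runnable sketch of the same setup at a smaller, made-up scale (the dimensions and the trX/trY construction here are illustrative, not the poster's): nnet's weight count grows with size * (inputs + 1), so wide inputs usually need a larger MaxNWts or a dimension reduction such as preProcess = "pca".

library(caret)
library(nnet)

set.seed(1)
n <- 200; p <- 10                      # modest dimensions so the sketch runs quickly
trX <- data.frame(matrix(rnorm(n * p), ncol = p))
trY <- factor(ifelse(trX[, 1] * trX[, 2] + rnorm(n) > 0, "a", "b"))

myCtrl <- trainControl(method = "cv", number = 3,
                       classProbs = TRUE, summaryFunction = twoClassSummary)

nnetFit <- train(trX, trY, method = "nnet",
                 tuneGrid = expand.grid(size = c(1, 3, 5), decay = c(0, 0.1)),
                 metric = "ROC", MaxNWts = 2000, trace = FALSE,
                 trControl = myCtrl)
nnetFit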
2013 Nov 15
1
Inconsistent results between caret+kernlab versions
I'm using caret to assess classifier performance (and it's great!). However, I've found that my results differ between R2.* and R3.* - reported accuracies are reduced dramatically. I suspect that a code change to kernlab ksvm may be responsible (see version 5.16-24 here: http://cran.r-project.org/web/packages/caret/news.html). I get very different results between caret_5.15-61 +
2011 Aug 28
1
Trying to extract probabilities in CARET (caret) package with a glmStepAIC model
Dear developers, I have just started working with caret and all the nice features it offers. But I just encountered a problem: I am working with a dataset that includes 4 predictor variables in Descr and a two-category outcome in Categ (coded as a factor). Everything was working fine: I got the results, confusion matrix, etc. BUT for obtaining the AUC and predicted probabilities I had to add
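A hedged sketch of the usual recipe for predicted probabilities from a caret model (simulated stand-ins for Descr and Categ, since the real data are not shown): classProbs = TRUE in trainControl, then predict(..., type = "prob").

library(caret)
library(MASS)   # glmStepAIC lives in MASS

set.seed(1)
Descr <- data.frame(matrix(rnorm(200 * 4), ncol = 4))   # stand-in predictors
Categ <- factor(ifelse(Descr[, 1] + rnorm(200) > 0, "yes", "no"))

ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,                  # required for probabilities / AUC
                     summaryFunction = twoClassSummary)

fit <- train(Descr, Categ, method = "glmStepAIC",
             metric = "ROC", trace = FALSE, trControl = ctrl)

probs <- predict(fit, newdata = Descr, type = "prob")    # one column per class
head(probs)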
2012 Nov 23
1
caret train and trainControl
I am used to packages like e1071 where you have a tune step and then pass your tunings to train. It seems that with caret, tuning and training are both handled by train. I am using train and trainControl to find my hyperparameters like so: MyTrainControl=trainControl( method = "cv", number=5, returnResamp = "all", classProbs = TRUE ) rbfSVM <- train(label~., data =
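A sketch of that single-step workflow, using svmRadial on the mlbench Sonar data as a stand-in (the poster's label/data are not shown): the candidate hyperparameters go in tuneGrid and train() does both the tuning and the final fit.

library(caret)
library(mlbench)
data(Sonar)

MyTrainControl <- trainControl(method = "cv", number = 5,
                               returnResamp = "all", classProbs = TRUE,
                               summaryFunction = twoClassSummary)

svmGrid <- expand.grid(sigma = c(0.01, 0.05), C = c(0.5, 1, 2))   # candidates

rbfSVM <- train(Class ~ ., data = Sonar, method = "svmRadial",
                metric = "ROC", tuneGrid = svmGrid,
                trControl = MyTrainControl)
rbfSVM$bestTune   # the winning (sigma, C) pair; the final model is refit with it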
2012 Feb 10
1
Choosing glmnet lambda values via caret
Usually when using raw glmnet I let the implementation choose the lambdas. However, when training via caret::train the lambda values are predetermined. Is there any way to have caret::train defer the lambda choices to glmnet and thus choose the optimal lambda dynamically? -- Yang Zhang http://yz.mit.edu/
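One commonly suggested workaround, sketched here on simulated data: let a preliminary glmnet() call propose its own lambda sequence and hand that sequence to caret as the tuning grid, so caret still picks the best value by resampling.

library(caret)
library(glmnet)

set.seed(1)
x <- matrix(rnorm(200 * 20), ncol = 20)
y <- x[, 1] - 2 * x[, 2] + rnorm(200)

lambdaSeq <- glmnet(x, y, alpha = 1)$lambda          # glmnet's own lambda path
enetGrid  <- expand.grid(alpha = 1, lambda = lambdaSeq)

glmnetFit <- train(x, y, method = "glmnet",
                   tuneGrid = enetGrid,
                   trControl = trainControl(method = "cv", number = 5))
glmnetFit$bestTune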
2012 Apr 13
1
caret package: custom summary function in trainControl doesn't work with oob?
Hi all, I've been using a custom summary function to optimise regression model methods using the caret package. This has worked smoothly. I've been using the default bootstrapping resampling method. For bagging models (specifically randomForest in this case) caret can, in theory, use the out-of-bag (oob) error estimate from the model instead of resampling, which (in theory) is largely
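For reference, a sketch of the part the poster says already works: a custom regression summary function used with bootstrap resampling (simulated data; whether the same function is honoured under method = "oob" is exactly the open question here).

library(caret)

## custom summary: median absolute error alongside RMSE
medAbsSummary <- function(data, lev = NULL, model = NULL) {
  c(MAE_median = median(abs(data$obs - data$pred)),
    RMSE       = sqrt(mean((data$obs - data$pred)^2)))
}

set.seed(1)
dat   <- data.frame(matrix(rnorm(300 * 5), ncol = 5))
dat$y <- dat$X1 + 0.5 * dat$X2 + rnorm(300)

ctrl <- trainControl(method = "boot", number = 25, summaryFunction = medAbsSummary)
rfFit <- train(y ~ ., data = dat, method = "rf",
               metric = "MAE_median", maximize = FALSE,   # smaller is better
               trControl = ctrl)
rfFit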
2011 Dec 22
0
randomforest and AUC using 10 fold CV - Plotting results
Here is a snippet to show what I'm trying to do. library(randomForest) library(ROCR) library(caret) data(iris) iris <- iris[(iris$Species != "setosa"),] fit <- randomForest(factor(Species) ~ ., data=iris, ntree=50) train.predict <- predict(fit,iris,type="prob")[,2]
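A hedged sketch of where that snippet is usually headed: calling predict() without newdata gives out-of-bag probabilities, which avoid the overly optimistic resubstitution estimate, and ROCR turns them into an ROC curve and AUC.

library(randomForest)
library(ROCR)

data(iris)
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)        # drop the unused "setosa" level

set.seed(1)
fit <- randomForest(Species ~ ., data = iris2, ntree = 50)

oob.prob <- predict(fit, type = "prob")[, 2]  # no newdata -> out-of-bag votes

pred <- prediction(oob.prob, iris2$Species)
plot(performance(pred, "tpr", "fpr"))
performance(pred, "auc")@y.values[[1]]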
2013 Feb 10
1
Training with very few positives
I have a binary classification problem where the fraction of positives is very low, e.g. 20 positives in 10,000 examples (0.2%). What is an appropriate cross-validation scheme for training a classifier with very few positives? I currently have the following setup: ======================================== library(caret) tmp <- createDataPartition(Y, p = 9/10, times = 3, list = TRUE)
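A self-contained illustration (simulated Y with the same 0.2% positive rate) of why createDataPartition and createFolds are a reasonable starting point here: both sample within each class, so every split keeps roughly the same positive fraction, and the fold list can be passed straight to trainControl's index argument.

library(caret)

set.seed(1)
Y <- factor(c(rep("pos", 20), rep("neg", 9980)), levels = c("neg", "pos"))

tmp <- createDataPartition(Y, p = 9/10, times = 3, list = TRUE)
sapply(tmp, function(i) table(Y[i]))     # ~18 positives kept in every split

## stratified folds for the resampling loop, reused via 'index'
folds <- createFolds(Y, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds,
                      classProbs = TRUE, summaryFunction = twoClassSummary)
## ctrl would then be passed to train(..., trControl = ctrl)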
2023 May 09
1
RandomForest tuning the parameters
Hi Sacha, On second thought, perhaps this is more the direction that you want ... X2 = cbind(X_train,y_train) colnames(X2)[3] = "y" regr2<-randomForest(y~x1+x2, data=X2,maxnodes=10, ntree=10) regr regr2 #Make prediction predictions= predict(regr, X_test) predictions2= predict(regr2, X_test) HTH, Eric On Tue, May 9, 2023 at 6:40 AM Eric Berger <ericjberger at gmail.com>
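Since the thread's X_train/X_test are not shown, here is a self-contained sketch of the same comparison on simulated data: a forest with default settings versus the small constrained one, judged by test-set RMSE.

library(randomForest)

set.seed(1)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
X2 <- data.frame(x1 = x1, x2 = x2, y = x1 + x2^2 + rnorm(n))
train_idx <- 1:400

regr  <- randomForest(y ~ x1 + x2, data = X2[train_idx, ])                      # defaults
regr2 <- randomForest(y ~ x1 + x2, data = X2[train_idx, ], maxnodes = 10, ntree = 10)

test <- X2[-train_idx, ]
sqrt(mean((predict(regr,  test) - test$y)^2))   # test RMSE, default forest
sqrt(mean((predict(regr2, test) - test$y)^2))   # test RMSE, constrained forest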
2012 Nov 29
1
Help with this error "kernlab class probability calculations failed; returning NAs"
I have never been able to get class probabilities to work, and as I am relatively new to using these tools, I am looking for some insight as to what may be wrong. I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which reproduces the same problem. I have read the caret vignettes as well as the documentation for ?train. I appreciate any direction you can give. I
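A hedged sketch of the two ingredients that usually matter here, shown on the mlbench Sonar data: on the kernlab side, ksvm only produces probabilities when fit with prob.model = TRUE; on the caret side, classProbs = TRUE (plus factor levels that are valid R names) requests the same thing.

library(caret)
library(kernlab)
library(mlbench)
data(Sonar)

## plain kernlab: probabilities require prob.model = TRUE at fit time
ksvmFit <- ksvm(Class ~ ., data = Sonar, prob.model = TRUE)
head(predict(ksvmFit, Sonar, type = "probabilities"))

## through caret: classProbs = TRUE does the asking for you
ctrl   <- trainControl(method = "cv", number = 5,
                       classProbs = TRUE, summaryFunction = twoClassSummary)
svmFit <- train(Class ~ ., data = Sonar, method = "svmRadial",
                metric = "ROC", trControl = ctrl)
head(predict(svmFit, Sonar, type = "prob"))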
2012 Feb 09
2
ROCR crashes for simple recall plot
I'm trying to use ROCR to create a simple cutoff vs recall plot (recall at p) on the example ROCR.simple dataset: library(ROCR) data(ROCR.simple) pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels) perf <- performance(pred, "rec") plot(perf) But R crashes on me on the last line. I'm using R 2.14.1, ROCR 1.0-4. Any ideas? Thanks in advance. -- Yang Zhang
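For what it is worth, the same quantities can be pulled out of the performance object and plotted by hand, which is a useful cross-check when plot(perf) misbehaves (the first cutoff ROCR reports is Inf, so it is dropped here).

library(ROCR)
data(ROCR.simple)

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, measure = "rec", x.measure = "cutoff")

cutoffs <- perf@x.values[[1]]
recall  <- perf@y.values[[1]]
ok <- is.finite(cutoffs)                      # drop the leading Inf cutoff
plot(cutoffs[ok], recall[ok], type = "l",
     xlab = "Cutoff", ylab = "Recall")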
2010 Oct 22
2
Random Forest AUC
Guys, I used random forest with a couple of data sets I had, to predict a binary response. In all the cases, the AUC on the training set comes out to be 1. Is this always the case with random forests? Can someone please clarify this? I have given a simple example, first using logistic regression and then using random forests, to explain the problem. AUC of the random forest is coming out to be
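A sketch of the usual explanation, on the mlbench Sonar data: predicting back onto the training rows (resubstitution) gives a near-perfect AUC almost by construction, while the out-of-bag probabilities, obtained by calling predict() without newdata, give a more honest number.

library(randomForest)
library(ROCR)
library(mlbench)
data(Sonar)

set.seed(1)
fit <- randomForest(Class ~ ., data = Sonar, ntree = 200)

p.train <- predict(fit, Sonar, type = "prob")[, 2]       # resubstitution
performance(prediction(p.train, Sonar$Class), "auc")@y.values[[1]]   # close to 1

p.oob <- predict(fit, type = "prob")[, 2]                # out-of-bag
performance(prediction(p.oob, Sonar$Class), "auc")@y.values[[1]]     # honest estimate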
2012 Jul 12
1
Caret: Use timingSamps leads to error
I want to use the caret package and found out about the timingSamps option for obtaining the time needed to compute predictions. But as soon as I set a value for this option, the whole model generation fails. Check this example: ------------------------- library(caret) tc=trainControl(method='LGOCV', timingSamps=10) tcWithout=trainControl(method='LGOCV')
2009 Jan 15
2
problems with extractPrediction in package caret
Hi list, I'm working on a predictive modeling task using the caret package. I found the best model parameters using the train() and trainControl() commands. Now I want to evaluate my model and make predictions on a test dataset. I tried to follow the instructions in the manual and the vignettes, but unfortunately I'm getting an error message I can't figure out. Here is my code: rfControl <-
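A hedged sketch of the extractPrediction() interface as I understand it (it expects a list of fitted train objects plus testX/testY), shown on the mlbench Sonar data, together with the plain predict() route that is often all that is needed.

library(caret)
library(mlbench)
data(Sonar)

set.seed(1)
inTrain  <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training <- Sonar[inTrain, ]
testing  <- Sonar[-inTrain, ]

rfControl <- trainControl(method = "cv", number = 5)
rfFit <- train(x = training[, -61], y = training$Class,
               method = "rf", trControl = rfControl)

preds <- extractPrediction(list(rf = rfFit),
                           testX = testing[, -61], testY = testing$Class)
head(subset(preds, dataType == "Test"))

confusionMatrix(predict(rfFit, testing[, -61]), testing$Class)   # the simpler route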
2012 May 15
1
caret: Error when using rpart and CV != LOOCV
Hi, I ran into the following problem when trying to build an rpart model and using everything but LOOCV. Originally, I wanted to use k-fold partitioning, but every partitioning except LOOCV throws the following warning: ---- Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, : There were missing values in resampled performance measures. ----- Below are some
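When chasing this warning, one generic first step is to look at the per-resample metrics stored on the fitted object; a sketch (on iris, purely to have something runnable) of where any NA rows would show up.

library(caret)
library(rpart)
data(iris)

set.seed(1)
ctrl <- trainControl(method = "cv", number = 10)
rpartFit <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)

rpartFit$resample                                          # metrics per fold
rpartFit$resample[!complete.cases(rpartFit$resample), ]    # folds that produced NAs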
2017 Oct 16
1
ROC curve for each fold in one plot
Hi all, I have tried a 5-fold cross-validation using the caret package with the random forest method on the iris dataset as an example. Then I need a ROC curve for each fold: > set.seed(1) > train_control <- trainControl(method="cv", number=5,savePredictions = TRUE,classProbs = TRUE) > output <- train(Species~., data=iris, trControl=train_control, method="rf") >
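With savePredictions the held-out predictions sit in output$pred, tagged by fold in the Resample column; a hedged sketch of one curve per fold using pROC on the two-class Sonar data (iris has three classes, so a single ROC per fold is not directly defined there).

library(caret)
library(pROC)
library(mlbench)
data(Sonar)

set.seed(1)
train_control <- trainControl(method = "cv", number = 5,
                              savePredictions = TRUE, classProbs = TRUE,
                              summaryFunction = twoClassSummary)
output <- train(Class ~ ., data = Sonar, method = "rf",
                metric = "ROC", trControl = train_control)

## keep only the predictions made with the winning mtry
d_all <- subset(output$pred, mtry == output$bestTune$mtry)

rocs <- lapply(unique(d_all$Resample), function(f) {
  d <- d_all[d_all$Resample == f, ]
  roc(d$obs, d$M, levels = c("R", "M"), direction = "<")   # "M" column = P(class M)
})
plot(rocs[[1]])
for (i in 2:length(rocs)) lines(rocs[[i]], col = i)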
2013 Feb 19
0
CARET. Relationship between data splitting and trainControl
I have carefully read the CARET documentation at: http://caret.r-forge.r-project.org/training.html, the vignettes, and everything is quite clear (the examples on the website help a lot!), but I am still a bit confused about the relationship between two arguments to trainControl, "method" and "index", and the interplay between trainControl and the data splitting functions in caret
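The short version, as I read the docs: the data-splitting helpers (createFolds, createDataPartition, ...) produce lists of row indices, and trainControl's "index" argument accepts such a list, replacing the resamples that "method" would otherwise generate automatically. A sketch on the mlbench Sonar data:

library(caret)
library(mlbench)
data(Sonar)

set.seed(1)
cvIndex <- createFolds(Sonar$Class, k = 10, returnTrain = TRUE)   # training rows per fold

ctrl <- trainControl(method = "cv", number = 10, index = cvIndex)
fit  <- train(Class ~ ., data = Sonar, method = "rpart", trControl = ctrl)

names(fit$control$index)    # exactly the folds supplied above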
2008 Sep 18
1
caret package: arguments passed to the classification or regression routine
Hi, I am having problems passing arguments to method="gbm" using the train() function. I would like to train gbm using the Laplace distribution or the quantile distribution. Here is the code I used and the error: gbm.test <- train(x.enet, y.matrix[,7], method="gbm", distribution=list(name="quantile",alpha=0.5), verbose=FALSE,
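In recent caret versions, arguments that train() does not recognise fall through to the underlying fitting function, so distribution can be supplied alongside verbose; a sketch on simulated regression data (whether the list form for quantile loss passes through cleanly is the question the thread is about).

library(caret)
library(gbm)

set.seed(1)
x <- data.frame(matrix(rnorm(500 * 10), ncol = 10))
y <- x$X1 + 2 * x$X2 + rnorm(500)

gbmGrid <- expand.grid(n.trees = c(100, 300), interaction.depth = c(1, 3),
                       shrinkage = 0.1, n.minobsinnode = 10)

gbmFit <- train(x, y, method = "gbm",
                distribution = "laplace",   # passed through to gbm()
                verbose = FALSE,
                tuneGrid = gbmGrid,
                trControl = trainControl(method = "cv", number = 5))
gbmFit$bestTune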
2011 May 05
1
[caret package] [trainControl] supplying predefined partitions to train with cross validation
Hi all, I run R 2.11.1 under Ubuntu 10.10 and caret version 2.88. I use the caret package to compare different models on a dataset. In order to compare their different performances, I would like to use the same data partitions for every model. I understand that using an LGOCV or a boot-type resampling method along with the "index" argument of the trainControl function, one is able to
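A hedged sketch of that idea on the mlbench Sonar data: build the repeated splits once with createDataPartition, pass the same list via "index" to every train() call, and the models can then be compared on identical resamples with resamples().

library(caret)
library(mlbench)
data(Sonar)

set.seed(1)
reps <- createDataPartition(Sonar$Class, p = 0.8, times = 10)   # the shared splits
ctrl <- trainControl(method = "LGOCV", index = reps,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

rpartFit <- train(Class ~ ., data = Sonar, method = "rpart",
                  metric = "ROC", trControl = ctrl)
ldaFit   <- train(Class ~ ., data = Sonar, method = "lda",
                  metric = "ROC", trControl = ctrl)

summary(resamples(list(CART = rpartFit, LDA = ldaFit)))   # paired comparison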