thr3ads.net - similar to: "hands-on classification tutorial needed..."

Displaying 20 results from an estimated 7000 matches similar to: "hands-on classification tutorial needed..."

2006 Feb 06

Classification of Imbalanced Data

Hi, I'm looking to perform a classification analysis on an imbalanced data set using random forests, and I'd like to reproduce the weighted random forest analysis proposed in the Chen, Liaw & Breiman paper "Using Random Forest to Learn Imbalanced Data". Can I use the R package randomForest to perform such an analysis? What is the easiest way to accomplish this task? Thanks,

Train error: subscript out of bounds

2011 Jan 24

Hi, I am trying to construct an svmPoly model using the "caret" package (please see code below). Using the same data and without changing any settings, I am only changing the seed value. Sometimes it constructs the model successfully, and sometimes I get an "Error in indexes[[j]] : subscript out of bounds". For example, when I set the seed to 357, the following code produced results only for 8

Help with this error "kernlab class probability calculations failed; returning NAs"

2012 Nov 29

I have never been able to get class probabilities to work and I am relatively new to using these tools, and I am looking for some insight as to what may be wrong. I am using caret with kernlab/ksvm. I will simplify my problem to a basic data set which produces the same problem. I have read the caret vignettes as well as documentation for ?train. I appreciate any direction you can give. I

[Classification] lifting score in R

2009 Jun 24

Hi all, Could anybody give me some pointers on cross-validation using the lift score as the error function, as commonly used in the data-mining and classification fields in marketing and e-commerce research? Thanks!
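For readers landing on this thread, here is a minimal sketch of one common definition of lift: the positive rate among the top-scored fraction of cases divided by the overall positive rate. The function name `lift_at`, the `frac` argument and the toy data are assumptions made for illustration; none of them come from the original post, and other lift definitions exist.

```r
# Lift at a given fraction: how much denser the positives are among the
# top-scored cases than in the data as a whole.
# `lift_at` is a hypothetical helper, not part of any package.
lift_at <- function(scores, labels, frac = 0.1) {
  # labels are coded 1 (positive) and 0 (negative)
  n_top <- max(1, round(frac * length(scores)))
  top <- order(scores, decreasing = TRUE)[seq_len(n_top)]
  mean(labels[top]) / mean(labels)
}

# Toy data: 10 cases, 3 positives.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05)
labels <- c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0)

lift_top20 <- lift_at(scores, labels, frac = 0.2)
```

To use it inside cross-validation, one would compute `lift_at` on each held-out fold's predictions and average the results across folds.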

cross validation using e1071:SVM

2010 Nov 23

Hi everyone, I am trying to do 10-fold cross-validation using the e1071::svm method. I know that there is an option ("cross") for cross-validation, but I still wanted to write a function to generate cross-validation indices using the pls::cvsegments method. The code (at the end) is working fine, but sometimes caret::confusionMatrix
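As a point of comparison, fold indices can also be generated in base R without any extra package. The sketch below is an assumption-laden illustration, not the poster's code and not a reimplementation of pls::cvsegments; the helper name `make_folds` and the row count of 95 are made up.

```r
# Hypothetical helper: shuffle the row indices and deal them into k folds
# whose sizes differ by at most one.
make_folds <- function(n, k = 10) {
  shuffled <- sample(n)
  fold_id <- rep(seq_len(k), length.out = n)
  split(shuffled, fold_id)
}

set.seed(1)
n_rows <- 95
folds <- make_folds(n_rows, k = 10)

# Each fold serves once as the held-out set; the remaining rows are for fitting.
train_sizes <- integer(length(folds))
for (i in seq_along(folds)) {
  test_rows  <- folds[[i]]
  train_rows <- setdiff(seq_len(n_rows), test_rows)
  train_sizes[i] <- length(train_rows)
  # fit the model on train_rows and predict on test_rows here
}
```

Because every row lands in exactly one fold, each observation is predicted exactly once across the ten iterations.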

gbm for cost-sensitive binary classification?

2009 Jun 17

I recently used gbm for a binary classification problem. As expected, it gets very good results, based on area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: getting an FP (classifying a clean sample as a dirty one) is much worse than getting an FN (missing a dirty sample). I would like to tune the gbm model so that it is biased toward a very low FP rate. For this
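One simple option that does not touch the model itself is to choose the decision threshold on held-out scores so that the false-positive rate stays under a cap. The sketch below assumes predicted scores are already available from some classifier; `threshold_for_fpr`, the 1% cap and the simulated scores are all hypothetical, and this is post-hoc thresholding rather than cost-sensitive training of gbm.

```r
# Hypothetical helper: smallest threshold t such that classifying
# "dirty" when score > t keeps the false-positive rate at or below max_fpr.
threshold_for_fpr <- function(scores, labels, max_fpr = 0.01) {
  neg_scores <- scores[labels == 0]          # clean samples only
  candidates <- sort(unique(scores))
  fpr <- vapply(candidates, function(t) mean(neg_scores > t), numeric(1))
  candidates[which(fpr <= max_fpr)[1]]       # fpr is non-increasing in t
}

# Simulated held-out scores: 900 clean (label 0) and 100 dirty (label 1).
set.seed(7)
labels <- rep(c(0, 1), times = c(900, 100))
scores <- c(rbeta(900, 2, 5), rbeta(100, 5, 2))

thr <- threshold_for_fpr(scores, labels, max_fpr = 0.01)
achieved_fpr <- mean(scores[labels == 0] > thr)
achieved_tpr <- mean(scores[labels == 1] > thr)
```

The threshold should be picked on data not used for fitting, otherwise the achieved FP rate on new samples may be higher than the cap.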

Beginning with a Program of SVR

2011 Apr 29

Hi: I'm starting research on Support Vector Regression. I want to obtain a model to predict a property A from a set of properties B, C, D, ... This problem is very common, for example in QSAR models. I would like to know of some examples and packages that could help me with this. I know about caret and e1071, but I don't know whether these packages can work with continuous variables.

good boosting tutorial and package in R?

2009 Jun 19

Hi all, Could you please give me some pointers about what is currently the best boosting package in R in terms of classification accuracy? Any pointers to tutorials and study materials that shorten the learning curve will be greatly appreciated! Thank you! P.S. Does anybody happen to know of boosting implementations in other languages such as Matlab? Are they good in terms of accuracy? What

caret train and trainControl

2012 Nov 23

I am used to packages like e1071, where you have a tune step and then pass your tunings to train. It seems that with caret, tuning and training are both handled by train. I am using train and trainControl to find my hyperparameters like so: MyTrainControl=trainControl( method = "cv", number=5, returnResamp = "all", classProbs = TRUE ) rbfSVM <- train(label~., data =

Using sample to create Training and Test sets

2009 May 15

Forgive the newbie question. I want to select random rows from my data.frame to create a test set (which I can do), but then I want to create a training set using what's left over. Example code: acc <- read.table("accOUT.txt", header=T, sep = ",", row.names=1) #select 400 random rows in data training <- acc[sample(1:nrow(acc), 400, replace=TRUE),] #try to get whats left
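A minimal sketch of one way to get both pieces, assuming a data frame like the poster's `acc` (a simulated stand-in is used here, since accOUT.txt is not available). The key points are to sample without replacement, so no row is drawn twice, and to use negative indexing to obtain the rows that were not sampled.

```r
# Simulated stand-in for the poster's `acc` data frame (hypothetical columns).
set.seed(42)
acc <- data.frame(id = 1:1000, x = rnorm(1000))

# Draw 400 distinct row indices; sample() defaults to replace = FALSE.
test_idx <- sample(nrow(acc), 400)

test_set     <- acc[test_idx, ]    # the 400 sampled rows
training_set <- acc[-test_idx, ]   # everything that was not sampled
```

With `replace=TRUE`, as in the quoted code, the same row can be picked more than once, so fewer than 400 distinct rows may be held out and the "left over" set is harder to define.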

please recommend hands-on books on classification, data-mining and machine learning with R?

2009 Jun 19

Hi all, Could anybody please recommend some hands-on books on classification, data-mining and machine learning with R? I would like to get a very good understanding of the statistical tools that are used in these areas, while reducing the learning curve. Thank you!

need help for Imbalanced classification problems!!!

2013 May 14

Hi all, I am facing an imbalanced classification problem. That is, I have a dataset in which the ratio of majority data to minority data is 100:1 (or more). In addition, there are many independent variables, and this is a binary classification problem. The model I built gives poor predictive power for the minority data, while for the majority data it seems to be overfitting. Could you

Is there any R package that contains Rusboost based on Adaboost.m2?

2012 Oct 14

Hi, I have been searching everywhere for an implementation of those algorithms, but I have only seen them in Matlab and in the literature. I noticed a package called 'ada' on CRAN, but it is not for multiclass problems. I would be happy with just Adaboost.m2, Smoteboost over Adaboost.m2, or any other combination that could account for imbalanced multiclass classification problems. Thanks!

Question regarding GBM package

2010 May 21

Dear R experts, I have come across the GBM package for R and it seems appropriate for my research. I am trying to predict the number of FPGA resources required by a software function if it were mapped onto hardware. As input I use software metrics (a lot of them). I already use several regression techniques, and the graphs I produce with GBM look promising. Now my question... I see that the

cross-validation

2010 Jun 08

Hi, I want to do leave-one-out cross-validation for multinomial logistic regression in R. I fitted the multinomial logistic regression with the nnet package. How do I do the validation, and with which function? The response variable has 7 levels. Please help me. Thanks a lot, Azam

train nnet

2007 Dec 14

Hi R-helpers, Can someone tell me how to train 'mynn' of this type?: mynn <- nnet(y ~ x1 + ..+ x8, data = lgist, size = 2, rang = 0.1, decay = 5e-4, maxit = 200) I assume that this nn is untrained, and that to train it I have to split the original data into train and test sets and do leave-one-out refitting to refine the weights (please correct me if I am wrong). I just don't know

cluster

2005 Jul 25

Dear listers: I have a question about the clustering methods available in R. I am trying to down-sample the majority class in a classification problem on an imbalanced dataset. Since I don't want to lose information from the original dataset, I don't want to use naive down-sampling: I think using clustering on the majority class to select "representative" samples might
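The idea described above could be sketched with `kmeans` from base R's stats package: cluster the majority class, then keep the observed row nearest each cluster centre as its representative. This is only one possible reading of the (truncated) post; the simulated data, the choice of 25 clusters and the nearest-to-centre rule are assumptions.

```r
# Simulated majority-class features (hypothetical: 500 rows, 2 columns).
set.seed(3)
majority <- matrix(rnorm(500 * 2), ncol = 2)
k <- 25   # number of representatives to keep (an arbitrary choice)

km <- kmeans(majority, centers = k, nstart = 5, iter.max = 50)

# For each cluster, pick the member row closest to the cluster centre.
representatives <- vapply(seq_len(k), function(j) {
  members <- which(km$cluster == j)
  centred <- sweep(majority[members, , drop = FALSE], 2, km$centers[j, ])
  members[which.min(rowSums(centred^2))]
}, integer(1))

reduced_majority <- majority[representatives, , drop = FALSE]
```

The reduced set can then be combined with the full minority class before fitting a classifier. Features on very different scales would normally be standardised first, since k-means uses Euclidean distance.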

caret package: arguments passed to the classification or regression routine

2008 Sep 18

Hi, I am having problems passing arguments to method="gbm" using the train() function. I would like to train gbm using the laplace distribution or the quantile distribution. Here is the code I used and the error: gbm.test <- train(x.enet, y.matrix[,7], method="gbm", distribution=list(name="quantile",alpha=0.5), verbose=FALSE,

How to shade area between lines in ggplot2

2020 Oct 23

Hello, I am running SVM and showing the results with ggplot2. The results include the decision boundaries, which are two dashed lines parallel to a solid line. I would like to remove the dashed lines and use a shaded area instead. How can I do that? Here is the code I wrote.. ``` library(e1071) library(ggplot2) set.seed(100) x1 = rnorm(100, mean = 0.2, sd = 0.1) y1 = rnorm(100, mean = 0.7, sd =

p-values for classification

2005 Jul 01

Dear All, I'm classifying some data with various methods (binary classification). I'm interpreting the results via a confusion matrix from which I calculate the sensitivity and the FDR. The classifiers are trained on 575 data points and my test set has 50 data points. I'd like to calculate p-values for obtaining <=FDR and >=sensitivity for each classifier. I was thinking about
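For reference, the two quantities mentioned above can be read off a confusion matrix as follows. This sketch uses a made-up set of 10 predictions and covers only the metrics; it does not address the p-value part of the question.

```r
# Toy labels (hypothetical): 1 = positive class, 0 = negative class.
truth <- factor(c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0), levels = c(0, 1))
pred  <- factor(c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0), levels = c(0, 1))

# Rows are predictions, columns are the true classes.
cm <- table(predicted = pred, actual = truth)

tp <- unname(cm["1", "1"])   # predicted positive, actually positive
fp <- unname(cm["1", "0"])   # predicted positive, actually negative
fn <- unname(cm["0", "1"])   # predicted negative, actually positive

sensitivity <- tp / (tp + fn)   # share of true positives that were found
fdr         <- fp / (tp + fp)   # share of positive calls that were wrong
```

With a test set of only 50 points, both estimates will have wide uncertainty, which is worth keeping in mind when comparing classifiers.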